2024 Open Compute China Summit: Openness accelerates AI development, open computing module specification launched

2024-08-12

Chinanews.com, Beijing, August 12 (Yuan Jiawei and Xia Bin) The 2024 Open Compute China Summit was held in Beijing recently, with how open computing can accelerate the development of artificial intelligence as the focus of the conference. At the meeting, the "Open Computing Module (OCM)" specification was officially launched. The first batch of members includes institutions and companies such as the China Electronics Standardization Institute, Baidu, Xiaohongshu, Inspur Information, Lenovo, Super Fusion, Intel, and AMD. This is China's first design specification for server computing modules. Upstream and downstream players in the industry hope to jointly establish standardized computing module units, build an open, collaborative, and innovation-integrated industry ecosystem, and stimulate the innovation and development of artificial intelligence technology.

Image caption: The "Open Computing Module (OCM)" specification is officially launched. Photo: provided by the organizer.

The summit was co-organized by the open computing community OCP and the open standards organization OCTC (the Open Computing Standards Working Committee of the China Electronics Standardization Technical Association) under the theme "Open Collaboration: Collaboration, Intelligence, and Innovation", focusing on topics such as data center infrastructure, artificial intelligence innovation, the open computing ecosystem, green computing development, and open systems & CXL. Companies including Baidu, Alibaba Cloud, Industrial and Commercial Bank of China, ByteDance, Samsung, Inspur Information, NVIDIA, Flex, Solidigm, Intel, and 21Vianet, as well as more than a thousand IT engineers and data center practitioners, participated in the conference.

The rapid development of generative artificial intelligence has brought richer intelligent application scenarios. The proliferation of intelligent applications will inevitably require more computing power to support inference. General-purpose computing power, being more widely available and easier to obtain, will clearly accelerate the process of intelligentization once it gains AI computing capability.

Zhao Shuai, general manager of Inspur Information's server product line, said: "AI is no longer confined to AI chips; all computing is becoming AI computing, so general-purpose computing power must also have AI computing capabilities. However, CPU processors iterate very quickly, and the technical routes and requirements of different platforms differ. More than ten chips may require the development of hundreds of server models."

However, the CPU protocol standards of different architectures such as x86, ARM, and RISC-V are not unified, which forces large amounts of time to be spent on hardware development, firmware adaptation, component testing, and so on. At the same time, to better suit the highly parallel computing characteristics of AI inference, CPU bus interconnect bandwidth, memory bandwidth, and memory capacity also need special optimization, driving continual increases in system power consumption, bus rates, and current density. With these factors compounding, the design and development cycle of a computing system is long and costly.

As CPUs diversify, quickly turning CPU-level innovation into complete computing systems suited to AI inference workloads has become a key link in alleviating the current shortage of AI computing power and promoting the development of artificial intelligence.

To this end, the Open Computing Module (OCM) specification was officially launched at the meeting. It aims to define a minimum computing unit built around the CPU and memory, compatible with multiple processor generations across architectures such as x86 and ARM, so that users can flexibly and quickly combine modules according to their application scenarios.

The OCM specification seeks to establish a standardized, processor-based computing module unit that unifies the external high-speed interconnect, management protocols, and power supply interfaces of computing units built on different processors. By making processor chips of different architectures compatible, it builds a unified computing foundation for the CPU and addresses the CPU ecosystem challenge, so that customers can flexibly and quickly match the most suitable computing platform to diverse application scenarios such as artificial intelligence, cloud computing, and big data, promoting the high-quality, rapid development of the computing industry. As an open standard, OCM can provide users with more versatile, green, efficient, safe, and reliable computing options.

In addition, generative artificial intelligence is reshaping data center infrastructure, placing higher demands on computing performance, storage capacity and performance, network solutions, resource scheduling and management, and energy-efficiency control. Omnidirectional scaling capabilities, covering both performance enhancement and scale expansion, have become the core of building advanced AI infrastructure. At this summit, a large number of innovative technologies and product solutions were presented, including CXL technology, AI-oriented network architectures, and the first 16-channel PCIe 5.0 TLC solid-state drive, which will further enhance the scaling capabilities of data centers.

Zhao Shuai believes that open computing carries great significance and value in the era of intelligent computing: openness must be used both to cope with the challenges of diversified computing power and to drive its scaling. Computing power scaling is a rapidly developing process in which scale-up (single-system performance improvement) and scale-out (cluster-scale expansion) coexist and iterate. At this stage, open acceleration modules and open networks have enabled computing power to scale, open firmware solutions have enabled management to scale, and open standards and open ecosystems have enabled infrastructure to scale. Going forward, open innovation must be used to accelerate the omnidirectional scaling of computing systems in response to the scaling laws of large models.

The conference also released the top ten innovative achievements in open computing, including guidelines for deploying hyperscale data centers and technical requirements for designing liquid-cooled artificial intelligence accelerator cards, further demonstrating the innovative vitality of open computing in the data center field.

In the era of intelligence, large models are reshaping AI infrastructure, and data centers face all-round scaling challenges in computing power, networking, storage, management, and energy efficiency. A global open collaboration platform is needed to jointly solve these major problems and, through comprehensive optimization of artificial intelligence infrastructure, open unlimited possibilities for the development of AI. (End)

【Editor: Cao Zijian】