
"the third main chip" dpu: the next three years is the window period for commercialization

2024-09-11


As the third main chip in data centers after the CPU and GPU, the DPU has continued to gain popularity in recent years.
A DPU, or data processing unit, has powerful network-processing capabilities along with security, storage, and network-offload functions. It frees up CPU computing power by taking over data-processing tasks the CPU handles poorly, such as network protocol processing, data encryption and decryption, and data compression. It can also manage, scale, and schedule various resources independently; in other words, it handles the tasks that "the CPU does poorly and the GPU cannot do," reducing costs and improving efficiency in the data center.
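The offload idea can be sketched from the host's side: the per-byte work below (compression plus an integrity digest) is exactly the kind of data-path task that, on a DPU-equipped server, would run on the card instead of the CPU. This is a toy illustration only; `host_cpu_data_path` is a hypothetical name, not a real DPU API.

```python
import time
import zlib
import hashlib

def host_cpu_data_path(payload: bytes) -> tuple[bytes, str]:
    """Per-packet work a host CPU does without a DPU:
    compress the payload and compute an integrity digest.
    With a DPU, both steps are offloaded to the card."""
    compressed = zlib.compress(payload, level=6)
    digest = hashlib.sha256(payload).hexdigest()
    return compressed, digest

# Synthetic 1 MiB payload standing in for network traffic.
payload = b"telemetry-record," * 65536

start = time.perf_counter()
for _ in range(10):  # ~10 MiB of simulated traffic
    compressed, digest = host_cpu_data_path(payload)
elapsed = time.perf_counter() - start

print(f"CPU time for ~10 MiB of compress+hash: {elapsed:.3f} s")
print(f"compression ratio: {len(payload) / len(compressed):.1f}x")
```

Scaled to the tens of gigabits per second a modern NIC carries, this per-byte cost is what eats into the CPU cycles available to applications, which is the efficiency argument the article makes.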
In the AI era, the volume of data that intelligent computing centers must process has exploded. The DPU can unlock the effective computing power of these centers and address the infrastructure challenge of cutting costs while raising efficiency; its importance and penetration are gradually increasing.
Three "U"s in one: a solution better suited to the era of intelligent computing
"the concept of dpu was hyped up by nvidia four years ago. after acquiring the israeli company mellanox, nvidia became the industry's first supplier of complete data center solutions that include cpu, gpu, and dpu." lu sheng, founder of xinqiyuan, said in an exclusive interview with caixin that xinqiyuan is one of the earliest manufacturers in china to engage in dpu research and development, dating back to 2018, when it was still called smartnic.
"in the past, traditional network cards were used to carry network transmission functions. later, smart network cards were born, and four years ago they gradually evolved into dpus." zhang yu, senior vice president of beijing zhongke yusu technology co., ltd., which focuses on the research and development and design of intelligent computing chips, told caixin.
In 2020, NVIDIA released its DPU product strategy, positioning the DPU as the "third main chip" in the data center after the CPU and GPU, and igniting the concept.
Today, the DPU has become an emerging special-purpose processor in data centers, designed to accelerate security, networking, and storage tasks and to power high-bandwidth, low-latency, data-intensive computing scenarios. Its core role is to take over the network, storage, security, and management tasks originally handled by the CPU, freeing up CPU resources and strengthening data security and privacy protection.
"the solutions for intelligent computing centers developed by nvidia are actually three-u-in-one. nvidia's dgx a100 server three years ago and the subsequent dgx gh200 series all include cpu, gpu and dpu. of course, there are also smart network cards like rdma in the dpu, which can actually be classified as dpu. they are essentially the same thing. so from this perspective, the current industry leader, or the generally recognized direction, is the collaboration of cpu, gpu and dpu in the intelligent computing center." zhang yu said that the solution for general data centers is more of cpu, storage and network. in some cloud-native scenarios, low-latency and high-throughput data network processing is also a rigid demand, and intelligent computing scenarios have higher requirements for network processing performance.
"if the cpu is likened to the brain, which is used for overall control, then the gpu is more like muscles, used to provide solid and abundant parallel computing power, and the dpu is more like blood vessels and nerves, transporting the data that the gpu needs to calculate to the server through the dpu, completing the control instruction exchange and protocol conversion." zhang yu said.
"the coordination of multiple pus is actually an upgrade of the overall computing architecture, from the past architecture dominated by general-purpose cpus to a computing architecture dominated by accelerators, and the cost-effectiveness of the overall computing solution is improved through the coordination of cpu, gpu, dpu, npu, etc." zhang yu said, "currently, in terms of technology, dpu has gradually matured and its boundaries are relatively mature. network security encryption and decryption, zero trust, and network offload have basically become functions that dpu can stably carry."
Reducing capex and energy consumption, with a degree of cost-effectiveness
As the CPU's offload engine, the DPU's most direct function is to take over infrastructure-layer services such as network virtualization and hardware resource pooling, releasing CPU computing power to upper-layer applications. It can therefore effectively unlock the computing power of the intelligent computing center and improve energy efficiency.
"nvidia previously admitted that the efficiency of the computing power chips of its previous generation of generative ai servers was only 40% of its designed capacity. we measured it to be only more than 30%, which means that most of the computing power is idle. the main reason is that the intermediate variables generated by the calculations are waiting for data synchronization to be completed between clusters. the capacity of the network path limits the upper limit of the computing power base, and this is precisely the real value of the dpu." lu sheng said that this has pushed the dpu to the forefront again.
In an AI era of exploding data volumes, the DPU can not only help build a new computing base with low latency, large bandwidth, and high-speed data channels, but also securely and efficiently schedule, manage, and connect distributed CPU and GPU resources, unlocking the effective computing power of the intelligent computing center. Deploying DPUs can therefore reduce a data center's one-time capex (capital expenditure). According to Cisco, virtualization technology lets enterprises cut the number of servers by up to 40% while improving resource utilization.
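The capex and energy arithmetic behind that consolidation figure can be made concrete. Only the up-to-40% server reduction comes from the article's Cisco citation; the fleet size, unit price, and power draw below are purely illustrative assumptions.

```python
# Back-of-envelope consolidation arithmetic. The 40% reduction is the
# cited upper bound; every other number is an illustrative assumption.

baseline_servers = 100
reduction = 0.40                      # cited up-to-40% figure
servers_after = round(baseline_servers * (1 - reduction))

price_per_server = 15_000             # USD per server, assumed
capex_saved = (baseline_servers - servers_after) * price_per_server

watts_per_server = 500                # average draw per server, assumed
hours_per_year = 24 * 365
kwh_saved = ((baseline_servers - servers_after)
             * watts_per_server * hours_per_year / 1000)

print(servers_after)      # 60
print(capex_saved)        # 600000
print(round(kwh_saved))   # 175200
```

Even with these made-up unit costs, the shape of the argument is visible: server count drives both the one-time capex line and the recurring energy line, which is why the article treats the two savings together.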
On the other hand, DPUs improve data center energy efficiency through dedicated hardware acceleration of network, security, and storage tasks.
Lu Sheng cited China Mobile's SD-WAN deployment in Zhejiang Province as an example: "The integrated hardware-software solution built on the Xinqiyuan DPU network card offloads network security services. Compared with a traditional pure-software SD-WAN solution, single-machine efficiency has increased six- to eight-fold. The project as a whole has also saved 80% of server deployment investment and annual software costs, greatly reducing capex. And because fewer machines are deployed, the data center's energy consumption has fallen; we estimate more than 3 million kWh of electricity can be saved each year, sharply cutting operating costs."
On cost, China Business News learned that DPU R&D and production costs are relatively high, especially on advanced process nodes, so prices are correspondingly high. But because deploying a DPU solution both reduces the number of servers and saves energy in subsequent computation, the overall system still offers a degree of cost-effectiveness, though this must be assessed against specific scenarios and applications.
The next three years will be a critical period for commercialization
Even so, raising DPU penetration still faces resistance.
A source at Zhongke Chuangxing, a venture capital firm focused on early-stage hard-technology investments, told Caixin that as a virtualization architecture coordinating software and hardware, the DPU must connect effectively with the virtualization software stack running on the CPU, and its hardware design must account for compatibility and integration with existing systems. Second, DPU architectures and interfaces have not yet converged on a unified standard; products from different manufacturers differ, complicating use, maintenance, and upgrades. In addition, the software ecosystem is immature, lacking complete development tools, drivers, and operating system support, "but some companies are already working on it."
Lu Sheng said a DPU requires a dedicated, efficient instruction set, which is also its core competitive strength; the remaining two-thirds of the work is building an ecosystem around that instruction set. Ecosystem building is the core barrier of the DPU industry, and its maturity determines how fast products can be commercialized.
Overall, the DPU industry is still dominated by foreign companies, with the three giants NVIDIA, Broadcom, and Intel holding a large share, and technology companies such as Amazon and Microsoft following suit. Domestically, large companies such as China Mobile and Alibaba are developing their own DPUs, and start-ups such as Xinqiyuan, Zhongke Yusu, and Dayu Zhixin have also made corresponding progress.
"the development of dpu technology at home and abroad is at the same stage, but foreign companies have deeper accumulation. in my opinion, the dpu industry has actually gradually moved towards a stage of maturity and rapid implementation. foreign countries may be able to move earlier and faster than domestic companies," said zhang yu.
In terms of commercialization, only large cloud vendors such as Huawei, Alibaba, and ZTE, along with a few new DPU players such as Xinqiyuan and Zhongke Yusu, have achieved commercialization in China. The China Academy of Information and Communications Technology forecasts that DPU penetration in China's data centers will reach 12.7% in 2025.
Zhang Yu believes that at this stage what matters most for the DPU is deep integration with the cloud at the IaaS layer, especially how to offer customers a comprehensive, convenient, and transparent path from pure-software IaaS, so they can migrate smoothly onto DPU-backed, high-energy-efficiency cloud solutions.
"this migration requires the joint efforts of the industry and will take a long time, even years," said zhang yu. "amazon cloud is moving faster. they have strong r&d capabilities and have completed the conversion of iaas on dpu. however, for most domestic companies, the pace will not be too big. they may start with the most painful points and use them transparently, such as ovs offloading and network upgrades."
"the commercialization of dpu not only relies on the iaas field of traditional data centers, but also includes many industries and fields such as network security, high-performance storage, and cluster communications." lu sheng said that xinqiyuan has been deeply engaged in the direction of "dpu for security" for many years, applying dpu to products such as firewalls and security gateways. it has now entered the sangfor network security product line and has become a standard expansion card, solving industry problems such as the insufficient elephant flow processing capabilities of intel cpus.
"judging from the current industry development trend, if technological development is in line with expectations, there will be an explosion around 2025-2027." the above-mentioned person from zhongke chuangxing said that the reason is that with the development of the digital economy, ai and cloud computing industries, the server market will usher in growth, especially in the fields of finance, government and power users. not only will a large number of dpus be needed to process data and improve computing efficiency, but dpus will also be required to take advantage of security.
"dpu chips have indeed been widely used, and the current growth rate is 20%-30% per year. but the industry characteristic of dpu is that it needs to maintain stability. it needs to run stably on the cluster for several months before expanding the cluster." zhang yu said that more importantly, considering the development of the domestic information and innovation industry, the next two to three years will be a very critical period, and it is a key time window that every dpu manufacturer needs to grasp.
"dpu is not yet a standardized product. the process of commercialization needs to be combined with market demand and in-depth polishing of different application scenarios. it requires collaboration between upstream and downstream manufacturers. it takes a long time to go from a small-scale pilot of a few hundred pieces to a large-scale deployment of tens of thousands of pieces." lu sheng said that the commercialization of dpu requires the joint efforts of partners across the industry to strengthen mutual trust and cooperation in the ecosystem and move forward hand in hand on the road to commercialization of the domestically produced 3u integrated cpu+gpu+dpu.
(This article comes from China Business Network)