five questions about the current situation of ai intelligent computing centers|industry survey

2024-10-02

financial associated press, october 2 (reporter fu jing) the parameter scale of large models continues to grow, placing higher demands on ai computing power infrastructure. the ai computing industry is currently booming, and the construction of intelligent computing centers is accelerating. the latest data shows that as of june this year, the computing power centers in use nationwide totaled more than 8.3 million standard racks, with a computing power scale of 246 eflops (fp32), and intelligent computing power grew by more than 65% year-on-year.

what are the prices of computing cards on the supply side at this stage, and are they still in short supply? does supply match demand? are the hundreds of intelligent computing centers across the country all operating at full capacity? how long will it take to recover the cost of building an intelligent computing center? how can ai computing power achieve high-quality development? focusing on these five core questions, reporters from the financial associated press interviewed a number of industry figures.

according to a reporter from the financial associated press, the previous supply shortage of ai computing power has eased, but the supply and demand are not completely matched, resulting in low utilization rates of some intelligent computing centers. although plans for intelligent computing centers by local governments and enterprises are common, the number that can actually be put into use may be lower than expected. some practitioners predict that some intelligent computing centers will be able to "recover their costs" in about three to four years. at the same time, the high-quality development of computing power is also valued by the industry.

computing card prices are close to the seller's cost line

"the tight supply of computing power has indeed been a relatively common phenomenon in recent years. many people are rushing to buy products with good computing power and good ecosystem compatibility. from the perspective of users, of course they hope that intelligent computing power can better support applications," zhang dong, chief scientist of inspur yunhai, said in an interview with a reporter from the financial associated press, speaking from the perspective of a computing server supplier.

zhang dong told reporters that the tight supply of intelligent computing is cyclical.

what is the current situation on the supply side? zhang yazhou, chairman of shanghai liuchi technology group and general manager of shanghai runliuchi technology co., ltd., a subsidiary of hengrun co., ltd. (603985.sh), told reporters from the financial associated press, "there is a supply of computing power in the market this year, unlike last year, when the shortage was severe. the current prices of various computing cards are close to the cost line of sellers. now there are many people involved in various projects in the industry, and there are also many intermediaries. those who used to make computers and equipment, as well as the ict communication industry, are all involved. but in fact, not many actually do it well."

recently, some a-share companies that have crossed over into intelligent computing have disclosed pressure. for example, lotus purple star, a subsidiary of lotus holdings (600186.sh), has signed some computing power service contracts, but there is a risk that recovering the procurement costs may take longer than expected or that the costs may not be recoverable; as of august this year, lotus purple star was still operating at a loss, and it is uncertain whether it can turn a profit for the full year. altron engine, a wholly-owned subsidiary of oya holdings (300949.sz), negotiated with its supplier runxin supply chain to sign a "supplementary agreement to the computing power server procurement contract", under which the originally planned purchase of 128 high-performance computing servers with embedded nvidia gpu chips was changed to 8.

according to zhang yazhou's observation, gpu terminal prices continued to decline from june to august this year. "last year's projects were all digested in the first half of this year. the projects being carried out this year fall into two main situations: first, corporate research and development really needs computing power, which is mainly concentrated in large internet companies. second, some regions have received subsidies and energy quotas and are waiting for the supporting construction of intelligent computing centers."

it is understood that the market saw a wave of bulk buying ("sweeping goods") only in september, "mainly affected by the off-peak season and the international environment, but in fact there are not many spot resources in the market."

the reporter also learned from an industry insider that "the price of the 4090 has previously increased from more than 13,000 to 16,700." however, it is said that the price increase is mainly due to relatively strong demand for this graphics card driven by "black myth: wukong".

in addition, zhang yazhou said that the supply of computing power in the market is still fragmented: some suppliers "may only have 5 or 10 servers, and larger ones have 64 or more than 100 servers, and there are basically very few large-volume ones. such suppliers may take on some scattered orders from laboratories and schools."

supply and demand are not exactly matched

several practitioners told reporters from the financial associated press that the easing of supply shortage does not mean that the demand for intelligent computing is lower than expected. zhang yazhou said that the demand for intelligent computing is growing and new demands are constantly being generated, but the current demand side has become more rational.

fan congming, executive chairman of the shenzhen artificial intelligence industry association, described the current situation of different types of demand parties in an interview with a reporter from the financial associated press: leading companies and scientific research universities have sufficient computing power resources, while industry vertical large models are currently being developed in large numbers, and small, medium and micro enterprises face a shortage of computing power.

it is worth noting that the construction of intelligent computing centers is in full swing, and related bidding projects are increasing month by month.

digital intelligence frontier previously reported that, according to incomplete statistics, more than 140 bid announcements for intelligent computing center-related projects were issued in the first 7 months of this year alone, covering all aspects of construction such as civil infrastructure and it infrastructure, including at least 24 projects whose winning bid amount exceeded 100 million yuan; more than 40 related winning bid projects were announced domestically in july.

guo liang, chief engineer of the cloud computing and big data research institute of the china academy of information and communications technology, said in an interview with a reporter from the financial associated press during the recently concluded "2024 china computing power conference", "many intelligent computing centers have been built across the country. according to complete statistics, there should be more than 200, but 90% of them have computing power below 1000p, which means that these computing power centers are of limited use for large model training, and their future use efficiency is questionable."

"the demand for computing power is huge, but the existing types cannot meet user needs. both in terms of adaptation and cost-effectiveness, they cannot meet customer expectations." du yunlong, an analyst at idc china, told a reporter from the financial associated press.

zhang yazhou also believes that the computing power supply side and the demand side currently do not completely match. "b-side demanders generally look for units they are familiar with. there may be dozens of people coming to inquire about a project. in fact, a transaction is possible only if one can reach the project side and the cooperative relationship is good or the comprehensive strength is well recognized, and the transaction does not necessarily go to the lowest price."

is idle computing power common?

a reporter from the financial associated press noted that at this stage, whether computing power equipment is operating at full capacity has become the focus of market attention.

"now a lot of computing power has been absorbed, but there is indeed a small amount of idle computing power in the industry. for example, there may be a supply of a thousand machines on the market, but there may be hundreds of machines idle," zhang yazhou told a reporter from the financial associated press.

according to guo liang's observation, idle computing power is not a common phenomenon. "our team supports related work in many provinces and cities. in the near future, as far as we know, the utilization rate of ningxia's computing power center is still very high."

"it is now more common for computing power in inner mongolia, tibet, and xinjiang to be sold online for time-sharing leasing at low prices. this will lead to low utilization rates of intelligent computing centers built in guangdong and other places," fan congming told reporters.

talking about the ningxia intelligent computing center, guo liang analyzed that although the local overall electricity price is currently not subsidized, it still has advantages; the local intelligent computing center has a larger computing power and is more useful for large model training. "in addition, for intelligent computing, the performance requirements for network transmission are not that strong, and data can be processed entirely offline. this is a better application scenario for intelligent computing centers in central and western china."

"judging from how much of the computing power built in the early stage has been absorbed, leading enterprises should be at 80%, scientific research universities at about 30%-40%, and market-oriented construction at about half," fan congming told reporters.

according to fan congming's observation, leading companies such as byte, tencent, huawei, and baidu "continuously train large models. the larger the amount of data, the greater the demand for computing power, and there is almost no idle computing power." scientific research universities "build big but use little" and have relatively more idle computing power; idle computing power in small and medium-sized enterprises is relatively common. "due to unclear positioning, remote location, and high price, the computing power from early construction has not been sufficiently absorbed."

du yunlong believes that whether computing power is idle mainly depends on several aspects: the mobilization of computing power by upper-layer software, the interconnection method between hardware, the adaptation of hardware facilities to application scenarios, and user deployment costs.

in early september, in an interview with a reporter from the financial associated press, tencent cloud vice president sha kaibo also described the phenomenon that even when an intelligent computing center has hardware resources, it may still lack supporting software capabilities and lack actual end customers or application scenarios.

zhang yazhou said that the reasons behind the idle computing power are complex and related to the performance of various intelligent computing cards, the operation of project parties, and the technical service capabilities of network products. the core focus of the industry is whether actual products emerge on the application side.

how many years will it take to pay back the cost of building an intelligent computing center?

idle computing power has led to excessive costs in some intelligent computing centers, which is one of the common concerns in the industry.

in this regard, fan congming believes that the prices charged by intelligent computing centers are too high for users to afford, which leads to insufficient continuity of computing in the centers and excessive costs.

talking about the operation of intelligent computing centers, guo liang told reporters from the financial associated press during the "2024 china computing power conference", "recently, you will often see information about computing power scheduling platforms on various occasions, but what will the effect be like after they are built? it is understood that some places have invested tens of millions to build platforms, but due to issues such as design concepts and functional features, they have not been well utilized. the overall operation of our country's intelligent computing centers is particularly important."

he further said, "for intelligent computing, even government investment needs returns, let alone enterprises. the industry is now indeed in a situation where many contenders are competing, but no unifying player has emerged. of course, we are also working hard. the china computing service platform (henan) released at this computing power conference is a solution we launched."

(photographed by a reporter from the financial associated press at the 2024 china computing power conference)

regarding how intelligent computing centers balance cost and performance, fan congming told a reporter from the financial associated press, "the payback period for computing power investment is generally about five years, because the computing power market changes so fast, and other costs such as electricity bills and operations must be added. if it can be used by a major manufacturer, i think the payback time will be around three to four years."
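
the arithmetic behind payback estimates like these can be sketched in a few lines. this is only an illustration, not the interviewees' model: the function name `payback_years`, the simple undiscounted formula, and every figure in the usage note are hypothetical assumptions, chosen to show how utilization moves the payback period.

```python
def payback_years(capex, full_load_revenue, annual_opex, utilization):
    """Simple (undiscounted) payback period in years.

    Hypothetical model: all inputs are illustrative assumptions and only
    need to share one currency unit.
      capex              -- up-front construction and hardware cost
      full_load_revenue  -- annual revenue if the center were fully rented
      annual_opex        -- annual electricity, operations and other costs
      utilization        -- fraction of capacity actually sold (0 to 1)
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    net_annual = full_load_revenue * utilization - annual_opex
    if net_annual <= 0:
        # Running costs eat all revenue, so the investment is never recovered.
        return float("inf")
    return capex / net_annual
```

for instance, with a hypothetical capital cost of 100, full-load annual revenue of 40, and annual operating costs of 8 (all in the same unit), 80% utilization gives 100 / (40 × 0.8 - 8) ≈ 4.2 years, while 70% utilization gives 5 years.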

du yunlong believes that operators of intelligent computing centers should formulate long-term plans to reduce end-user usage costs, cultivate usage habits, and resume pricing in the future; focus on cultivating application cases and gradually expand industry coverage.

what’s the explanation for changing from “quantity” to “quality”?

objectively speaking, china’s computing power development still has a long way to go.

the "china computing power development report (2024)" released at the 2024 china computing power conference shows that as of the end of last year, the global intelligent computing market had grown by more than 130% year-on-year, while the chinese intelligent computing market had grown by more than 60% year-on-year.

a reporter from the financial associated press noticed that amid the boom in computing power construction, some practitioners shared many sober reflections at the above-mentioned conference and focused on high-quality computing power.

the "artificial intelligence computing power high-quality development evaluation system report", the industry's first high-quality computing power evaluation system, released by inspur information (000977.sz) and the academy of information and communications technology, states that high-quality computing power is high-level computing capability that is based on the latest artificial intelligence theory, uses advanced artificial intelligence computing architecture, and is deeply integrated with algorithms and data.

a reporter from the financial associated press learned from inspur information that the gap between the measured performance and theoretical performance of current computing power clusters is too large. the actual performance of some computing power is less than 10% of the theoretical performance. public data shows that the average gpu utilization rate of intelligent computing centers under the traditional model is less than 30%.
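
the two ratios cited here measure different things, and a small sketch can keep them apart. the function names and the sample inputs in the usage note are hypothetical; the only figures taken from the text are the 10% and 30% levels.

```python
def performance_ratio(measured_flops, theoretical_flops):
    """Share of a cluster's theoretical peak that is actually measured."""
    if theoretical_flops <= 0:
        raise ValueError("theoretical_flops must be positive")
    return measured_flops / theoretical_flops


def average_gpu_utilization(busy_gpu_hours, available_gpu_hours):
    """Share of available GPU time during which the GPUs were busy."""
    if available_gpu_hours <= 0:
        raise ValueError("available_gpu_hours must be positive")
    return busy_gpu_hours / available_gpu_hours
```

a hypothetical cluster with a theoretical peak of 1,000 pflops that measures 90 pflops has a performance ratio of 0.09, under the 10% level; 1,000 gpus that were busy for 6,000 of the 24,000 gpu-hours available in a day have an average utilization of 0.25, under the 30% level.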

zhang dong believes that attention to computing power cannot focus only on chips. "many places buy computing power, specify chip brands by name, and build a large-scale computing center. in fact, it is meaningless to look only at chip indicators. we still have to take a system perspective and comprehensively consider how to meet application needs."

guo liang also said, "currently, we don't have many choices at the chip level. but the integration of computing and network is a hot spot. the purpose is to expand an ai server from the current 8 cards to 32 cards or 512 cards. this will benefit the capabilities of intelligent computing clusters, including cluster deployment, launch, and operation and maintenance."

reporters from the financial associated press learned from multiple interviews that the implementation of large-scale intelligent computing is by no means a simple stacking of scale and quantity. its complexity increases exponentially, which places high demands on the technical strength, resource advantages, and industrial collaboration capabilities of intelligent computing construction operators.

as for how the intelligent computing center can transform from "quantity" to "quality," guo liang said, "the construction of an intelligent computing center requires 'moderate advance' and overall analysis and prediction based on actual local needs."

(financial associated press reporter fu jing)