
Computing power is power. Why did Nvidia become the "Silicon Valley dragon"?

2024-08-06


This article comes from Appearance and Inside; authors: Zhou Xiao, Tan Jiuyun, Xie Fangchen; editors: Fu Xiaoling, Cao Binling.

"Selling shovels" is a good business, but not necessarily so if you always sell to a few miners and are constantly exposed to "key customer risk."

Even though Nvidia is already the leader in AI chips, Jensen Huang says he "wakes up every morning worried that the company will go bankrupt."

Nvidia's customer base has always been highly concentrated, with the top five customers (CR5) long accounting for more than half of total revenue. At the peak in 2002, the top five contributed 65% of revenue, and the top two an astonishing 40%. When Microsoft, its biggest patron at the time, pulled its order, Nvidia's results collapsed and its stock price fell 90%.


A similar "big customer crisis" recurred in 2008, when the three PC giants Apple, Dell, and HP collectively halted orders, and again later, when benchmark customers such as Google and Tesla began developing their own chips.


Even today (fiscal year 2024), Nvidia's top two customers still account for 32% of revenue, and the top five are estimated at close to 50%. Calculating its major-customer concentration has become a standing ritual for Wall Street analysts at every Nvidia earnings release.

Yet although these deep-pocketed patrons hold Nvidia's lifeline, they no longer dare to flip the table.

The very giants that once forced Nvidia to cut prices by pulling their orders have become its followers. Every time Boss Huang releases a new GPU, they line up to buy it at ever higher prices, regardless of cost.

So how did Nvidia go from being pushed around to calling the shots?

This article reviews its three rounds of struggle with major customers and finds that Nvidia's history is a condensed history of technological and industrial evolution. From temporary lead to absolute lead, from replaceable to irreplaceable, it gradually won back its bargaining power, until it became the "Silicon Valley dragon" of a new era.

1. Serving as a "jack of all trades" for PC gaming, standing on customers' shoulders

2002 must have been an unforgettable year for Jensen Huang.

That year, Nvidia's stock price plummeted 90%, and his personal wealth shrank to a tenth of its former size, dropping him from "billionaire" straight back into the ranks of "multi-millionaires".

The pain came from his big-money sponsor, Microsoft: the Xbox console the two companies were building together had been dragged into a price war, and Microsoft asked Nvidia to cut its GPU prices. But Nvidia was already making the chips at close to cost, so it naturally refused. Microsoft angrily broke off the partnership.

Bear in mind that the Microsoft contract accounted for nearly 70% of Nvidia's annual sales. Once the partnership collapsed, Nvidia's revenue shrank rapidly and its market value plummeted.


Having tasted what it meant to be bullied by a customer, Huang compromised and agreed to "reduce the future costs of the Xbox."

Six years later, however, when the "graphics card gate" scandal saw Apple, Dell, and HP collectively withdraw their orders for GeForce 6000-9000 series chips and Nvidia's stock price plummet 95%, Boss Huang chose to offend his customers to the end.

He downplayed the problem of graphics cards overheating and burning out computers, and later, when Apple came knocking and asked for a customized partnership, he flatly refused.

The two starkly different attitudes stem largely from a shift in bargaining power.

In the console era at the turn of the century, console makers controlled the gateway to players: game distribution and operations were handled by giants such as Sony and Nintendo, and chip suppliers usually had no choice but to live off them.


Even though NVIDIA pioneered the very concept of the GPU and led the graphics card industry (starting with the epoch-making GeForce 256, its graphics technology iterated rapidly), and even though it held the key to the gaming industry's transition from 2D to 3D, it could not escape having its fate decided by a single customer's decision.

After the split with Microsoft, Nvidia was shut out of Microsoft's new DirectX 9 specification, so its new GeForce FX was incompatible with Microsoft's standards. Combined with the product's own immaturity, sales were dismal.

ATI, backed by Microsoft, launched the Radeon 9700, which beat Nvidia's GeForce FX and rose quickly in the GPU market. By the third quarter of 2004, ATI held 59% of the discrete graphics card market to Nvidia's 37%.

But as the action shifted to the PC gaming market, the console makers' monopoly on the gateway to players was broken.

In this arena, the industry revolves around the games themselves; the pattern is "game developers run ahead at will, and hardware companies chase desperately." R&D, distribution, hardware, and software each have their own division of labor.


This meant that, compared with the mere handful of Sony, Nintendo, and Microsoft in the console era, chip makers' potential customers suddenly expanded to every corner of the industry chain.

NVIDIA's revenue grew steadily from 2004 to 2006, while the share of revenue contributed by its top five customers kept declining.


At the same time, the wave of "true 3D" driven by graphics technology, together with the PC online-gaming boom brought by the spread of the Internet and broadband, placed ever higher demands on chip capability up and down the industry chain.


Take the consumer side. Games with spectacular graphics were nicknamed "computer killers" for their complex 3D modeling, and players kept upgrading their machines to run them. That pursuit of performance in turn forced PC makers to choose chips known for performance.

Take "Crysis", which swept the world in 2007. Its photorealistic graphics won players over, but it made extreme demands on the graphics card. One player recalled: "I spent a whole day one college summer fighting the final boss. After every nuke I fired, I had to stop and let the computer cool down."


That made Nvidia's GeForce 9800 GT, a card that could run "Crysis" smoothly, wildly popular: more than 5 million units were sold worldwide.

Put bluntly, unlike in the console business, as long as an upstream supplier to PC gaming strikes the right balance between performance, yield, and price, its products speak for themselves; there is no need to play the humble supplicant.

And indeed, from 2004 onward, NVIDIA's GPU formula of "double the performance and keep cutting the price, every generation" proved a hit in the PC gaming market.


Nvidia was not the only one eyeing this cake, however; other chip giants refused to be left behind.

Veteran CPU players such as Intel and AMD moved quickly into the GPU market, and arch-rival ATI kept pace. AMD even announced its merger with ATI in 2006, hoping to dominate the industry with the one-two punch of CPU plus GPU.

A fierce battle seemed imminent. But ATI, hoping to sell itself at a better price, had unexpectedly bought up a pile of outdated GPU technology patents. The deal not only left AMD in debt year after year but also slowed its CPU-GPU integration, and AMD faded into the pack of an industry that never stops iterating.

Nvidia was rather smug about its rivals' stumbles. As Huang boasted: "It's like a gift from heaven. We have become the only independent graphics chip company in the world."

With its technical strength and dominant position, Nvidia had finally gained the standing to push back against its customers.

Dell and HP, which had pulled their orders in the "graphics card gate" episode above, placed orders again once Nvidia launched new products; Apple, having switched to AMD with poor results, had to bite the bullet and resume its cooperation with Nvidia.

But before Boss Huang could laugh for long, a new crisis quietly arrived.

2. Betting on GPU generality, making itself unavoidable for major customers

One of super-salesman Jensen Huang's famous exploits: at a Xiaomi launch event in 2013, he declared himself a "Mi fan" and shouted, "Please give me a chance to introduce Nvidia!"

Onlookers marveled at Boss Huang's flexibility and assumed Nvidia was determined to fight for the mobile chip market.

Unexpectedly, the very next year Huang walked away from the market he had courted so eagerly, announcing that Nvidia would "gradually withdraw from the smartphone and tablet chip market."

His explanation at the time was that cut-throat price competition among phone makers kept forcing suppliers to lower prices, and price was not Nvidia's strength, so it had to withdraw.

Today we know the price war was only the surface. The deeper reason was that what mobile chips demand runs contrary to NVIDIA's pursuit of extreme performance.

Success in the PC market had taught Huang better than anyone that in chips, the company with leading (or even unique) technology tends to take the bigger slice of the pie and dig the deeper moat.

So for mobile chips, NVIDIA stuck to the "technology-first" playbook: the Tegra series piled on computing power relentlessly. GPU core counts rose from 8 in Tegra 2 to 12 in Tegra 3, then jumped to 72 in Tegra 4; the whole line was regarded as a benchmark-crushing machine.

But mobile chips in smartphones and tablets have to work within tight space and power budgets, so balance matters more than raw speed. Nvidia's performance-first chips soon ran into runaway power consumption and severe heat.

More importantly, Qualcomm had a firm grip on cellular modem (baseband) technology. A phone maker that wanted to use Nvidia chips would have had to spend heavily on a separate baseband, so most kept their distance.

Under those conditions, staying in mobile would have meant compromising on performance, which was not the style of "technology fanatic" Jensen Huang. He decided to "refocus on high-performance, visual-computing and gaming-oriented devices."

The market did not buy it. PC gaming, Nvidia's home turf, had passed its phase of accelerated penetration, while the entire Internet was rushing toward mobile. The failure in mobile not only pushed Nvidia's business into a transition period; it also meant the company had walked away from a booming market for the sake of its technology.


It was not the first time Nvidia had done this. The other time was back in 2006.

Back then, the world saw the GPU as nothing but gaming gear. Huang happened to notice that some Wall Street quants running high-frequency trading and quantitative finance were using NVIDIA GPUs to execute their computations, but they had to write reams of low-level machine code to do it; they could not program the GPU in languages such as C++ the way they could a CPU.

This made Huang realize there was real demand for general-purpose computing, so he went back and pushed hard to pour money into software development (what became the CUDA platform) so the GPU could handle all kinds of workloads instead of merely drawing game graphics.
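
To make that shift concrete, here is a minimal, hypothetical sketch (not from the article) of the kind of program CUDA made possible: a general-purpose computation written in C-like code and fanned out across thousands of GPU threads, with no hand-written low-level machine code required.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// A general-purpose computation (element-wise vector add) written in
// C-like CUDA code -- the sort of task that previously required
// hand-crafted low-level code to run on a GPU at all.
__global__ void vecAdd(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // one thread per element
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;                 // one million elements
    size_t bytes = n * sizeof(float);

    float *a, *b, *c;
    // Unified memory keeps the example short; explicit cudaMemcpy also works.
    cudaMallocManaged(&a, bytes);
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(a, b, c, n);  // launch ~1M parallel threads
    cudaDeviceSynchronize();                  // wait for the GPU to finish

    printf("c[0] = %f\n", c[0]);              // expect 3.0
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```

The point of the design is visible even in this toy: the same C-like source expresses graphics-free, data-parallel work, which is exactly what later made GPUs usable for deep learning.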

Once launched, the project's R&D costs were expected to run to US$500 million a year, at a time when Nvidia's annual revenue was only about US$3 billion.

Worse, there was no obvious market demand then for the massive general-purpose computing power that parallel computing could provide. It was used, at most, in niche corners such as advanced physics laboratories and quantitative trading, where a single project needed only a handful of GPUs. Wall Street at one point valued the CUDA technology at zero.

In other words, Boss Huang was betting a sixth of the company's revenue on a software platform that had almost nothing to do with the core business and an entirely uncertain future.

It was not just the money. To support the CUDA platform, Nvidia's chips needed extra logic circuits, which enlarged the die, raised cooling requirements, and markedly increased the failure rate.

The order cancellations by HP, Dell, Apple, and others mentioned above stemmed from exactly this: severe overheating of Nvidia chips caused large numbers of laptops to crash and fail.

But Huang firmly believed that "this way of making software can change everything." Even after paying out a massive US$200 million to compensate customers, he kept investing in CUDA with near-fanatical conviction.

As it turned out, he made the right bet.

At the ImageNet competition at the end of 2012, AlexNet, a convolutional neural network trained on NVIDIA GPUs, pushed recognition accuracy to 84%, kicking off the AI revolution of the decade that followed.

That made Nvidia's GPU+CUDA combination an overnight sensation in deep learning. Google, which had once scoffed that "GPUs are only for playing games," instantly became a fan, and giants such as Microsoft and Facebook placed bulk orders for GPUs for AI training.

And that was just the beginning. As deep learning kept breaking through, the giants gradually found themselves orbiting NVIDIA in one scenario after another.

In 2015, for instance, when deep learning's error rate in image recognition dropped below that of humans, the autonomous-driving market took off. Inside a vehicle, heat and power consumption were no longer deal-breakers, and the Tegra chips that phone makers had spurned found favor with carmakers.


In 2016, AlphaGo's defeat of Lee Sedol on behalf of AI ignited the entire enterprise (B2B) market.

Major vendors began offering deep-learning-based AI services, such as image recognition, identity verification, and retrieval, to business customers across industries.


Beyond the surging GPU demand from the Internet and autonomous-driving industries, companies in fields such as biomedicine and quantitative trading that could apply AI also joined the ranks of Nvidia's customers.

Data shows that as of 2016, NVIDIA's market share in the field of deep learning reached 97%.

At that point, NVIDIA's long obsession with GPU and CUDA performance finally delivered its "computing power is power" moment: major customers could not route around NVIDIA in any of their businesses, and so lost the standing to flip the table.

This newfound leverage brought Nvidia enormous benefits, but it also planted hidden dangers.

3. Even major customers building their own chips stay in Nvidia's grip

In 2017, an Nvidia "ban" sparked outrage across the global tech community: customers were no longer allowed to use GeForce products for deep learning in data centers.

At the time, Nvidia was pushing its high-end Tesla line, which shared an architecture with GeForce but carried a huge price gap; the top model cost nearly 10 times as much. The ban meant customers could no longer substitute GeForce for Tesla, and cash-strapped startups risked losing their footing altogether.

Amid the protests, the "dragon-like behavior" was eventually walked back, but technology companies grew increasingly wary of Nvidia.

Looking back at those years, Internet giants such as Google, Meta, and Microsoft, along with electric-vehicle makers like Tesla, one after another began developing their own chips to reduce their dependence on Nvidia GPUs.


Beyond resentment of the "Nvidia tax", self-developed chips hold real attractions over off-the-shelf Nvidia parts.

Take the automotive industry. The most computing-hungry part of a vehicle is the intelligent-driving system, yet NVIDIA's chips are general-purpose all-rounders ("hexagonal warriors"): their computing speed can disappoint, and their size eats up precious space.

Tesla's self-developed chip was designed specifically for autonomous driving. Its FSD chip is far smaller than the palm-sized NVIDIA DRIVE PX 2, with power consumption cut to a third.

The FSD chip also gives priority to the NPU that handles deep learning and prediction; compared with a GPU, it is more efficient at AI machine learning, with a fivefold increase in computing power.


The same holds across the broader AI field: chips are adapted and reworked all the way from the architecture and the operating system above it, through the middleware, down to the business code, in pursuit of maximum performance. Google's TPU and Amazon's Graviton are cases in point.

Google's TPU, for one, runs inference 15-30 times faster than an NVIDIA GPU, with a performance-per-watt ratio roughly 30-80 times higher.


In other words, a self-developed chip is not just customizable; it can crush Nvidia's general-purpose chips on performance, size, and power consumption.

What's more, today's "self-developed" chips do not require full-stack R&D. Take Tesla: apart from the self-developed NPU, the CPU, GPU, interfaces, and the rest were licensed as standard IP blocks, and development took just 18 months.

So once major customers caught the self-development fever, Nvidia found itself dogged by talk of being "disrupted." But however bold those customers are about making chips, replacing Nvidia is another matter.

As is well known, the fixed costs of chip making, from R&D and design to facilities and equipment, are staggeringly high. Tape-out alone is a major expense: a single tape-out of a 7-nanometer chip runs to roughly 200 million yuan.


The industry has therefore always revolved around "cost amortization": for chip designers, the more scenarios a chip can serve, the more widely its huge R&D costs are spread; for manufacturers, the bigger a single order, the higher the capacity utilization and the easier it is to keep per-unit shipment margins at a given level.
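
As a back-of-the-envelope illustration (the unit volumes here are hypothetical; only the 200-million-yuan tape-out figure comes from above): if a fixed cost $F$ is spread over $V$ units with marginal cost $c$ per unit, then

$$\text{unit cost} = \frac{F}{V} + c$$

so a 200-million-yuan tape-out adds 2,000 yuan per chip at a volume of 100,000 chips, but only 20 yuan per chip at 10 million.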

NVIDIA's universal architecture lets it spread R&D costs across scenarios: the cost of the Hopper architecture in its automotive chip Thor, for example, may well have been amortized by the H100 built on the same architecture.

Large customers building chips for a single special scenario enjoy no such luxury, and the enormous R&D bill alone can be enough to scare them off.

Manufacturing tells the same story. One research outfit ran the numbers: suppose a carmaker produces 1.2 million vehicles a year, each using one high-compute chip; that is 100,000 chips a month. At 500 chips per 12-inch wafer, monthly demand comes to just 200 wafers.
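
Spelling out that estimate (all figures from the paragraph above):

$$\frac{1{,}200{,}000\ \text{chips/year}}{12\ \text{months}} = 100{,}000\ \text{chips/month}, \qquad \frac{100{,}000\ \text{chips/month}}{500\ \text{chips/wafer}} = 200\ \text{wafers/month}$$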

Demand that small makes the carmaker one of TSMC's tiniest customers, and the same goes for the packaging and testing houses. Whenever fab or packaging capacity runs short, they prioritize their big customers, and small orders wait longer.

Add the knock-on effects in testing and other steps, and the path from chip to deployed product stretches out. Tesla's FSD chip, for example, finished its first trial production at the end of 2017 but was not installed in vehicles until two years later, and its update cadence was slower still. Nvidia, by comparison, refreshes its automotive chips roughly every 18 months.


This drove many companies back into Nvidia's arms, especially once the craze for large models and generative AI hit and computing power became a resource worth fighting over. Even the big players still pursuing in-house chips dare not stop buying Nvidia's most advanced GPUs.


Nvidia's compute chips have become hard currency. Rumor has it that Oracle founder Larry Ellison and Elon Musk once spent an hour in a Japanese restaurant begging Jensen Huang for GPU supply.

Reportedly, Nvidia earns as much as 1,000% profit on every H100 it sells, and half the profits of AI-related industries flow into Nvidia's pocket.

Faced with an ever-steeper "Nvidia tax", major customers grit their teeth but remain helpless: today's chip competition is fought on software and hardware together, and Nvidia has not only the chips but also its software moat, CUDA.

For more than a decade, CUDA has been open to developers as a set of function and code libraries, attracting millions of them. The tools they built in turn made the CUDA ecosystem ever more mature, to the point of becoming near-"infrastructure".

If the GPU is a power plant, developers are the appliance makers and CUDA is the power grid: every appliance's voltage specification is built to match the grid.

That makes CUDA hard to replace. Last year, to break CUDA's monopoly on software-chip co-design, Nvidia's major customers formed an "anti-CUDA alliance" and tried to build a CUDA-compatible compilation toolchain.

But by the time third-party software reached anything close to CUDA's level, NVIDIA had already shipped its next-generation GPUs. With software and hardware advancing in lockstep, the challengers were left permanently playing catch-up.

What's more, sensing danger, NVIDIA began raising walls of its own: first it announced that buyers of its server racks would get GB200 priority, bundling in expensive custom racks that raise customers' switching costs; then it tightened its rules against decompilation, closing off the route by which other tools achieved CUDA compatibility.

And so those big customers building their own chips end up, still, as meat on Nvidia's chopping board.

Summary

"Believers of Nvidia in the domestic graphics card forums once evaluated Huang Renxun in this way using the style of "Records of the Grand Historian".

Carried along by one trend after another, computing power really has changed the world, just as Huang always believed it would. His convictions have won broad recognition, and his power now shapes the market itself.

But those who make history can never foresee their own place in it. Which is why, for Jensen Huang, every day remains a day on a war footing.