news

who has the most gpus?

2024-09-17


with the advent of the artificial intelligence era, gpus have become the focus of everyone's attention.

however, to get the most out of gpus for ai training and inference, we must also rely on the power of data centers. an ai data center is often equipped with tens of thousands of gpus, and it is this synergy that makes powerful chatbots like chatgpt possible.

however, ai data centers do not come cheap. construction costs can easily run into billions of dollars, making them largely the preserve of technology giants and prohibitively expensive for many countries and regions with weaker finances.

as more and more can be done with artificial intelligence, the geopolitical importance of high-end chips is growing. more and more countries and regions are competing to stockpile chips, and some are even enacting sanctions to prevent others from purchasing the most advanced chips. but until now, public data on the exact location of ai chips around the world has been surprisingly scarce.

vili lehdonvirta, a professor at the oxford internet institute, revealed a reality that cannot be ignored: gpus are highly concentrated in only 30 countries and regions in the world, with the united states and china leading the way, while most regions are in so-called "computing deserts": there are no gpus available for rent at all.

how to investigate distribution

the global ai computing supply chain can be roughly divided into four parts:

companies that design and sell gpus and other ai-related chips

companies that manufacture and package chips

companies that deploy chips to provide computing power

companies that consume computing power to develop or deploy ai systems

the market leader in gpu design and sales is the us-based company nvidia, chip manufacturing is dominated by taiwan’s tsmc, and asml of the netherlands is currently the only company that produces the photolithography machines that are essential for making the most advanced chips (miller 2022). as a result, these parts of the computing supply chain are highly concentrated in terms of both geography and ownership.

the study focuses on the third step of the supply chain: where in the world are chips deployed to provide ai computing for ai development and deployment, that is, for training ai models and running inference on existing models. broadly speaking, there are three types of large-scale computing providers: scientific supercomputing facilities, private computing clusters, and so-called public cloud computing providers.

scientific supercomputing facilities have existed since the early 1960s, usually funded by governments and mainly used for academic and military purposes. a study by oecd (2023) conducted a simple geographical analysis of scientific supercomputing facilities. according to the top500 database, china has the largest number of supercomputers, accounting for 32%; followed by the united states, accounting for 25%; and the european union, accounting for 21%. however, most scientific supercomputers are not designed for ai model training (oecd 2023). the current boom in generative ai development mainly relies on private computing clusters and public cloud computing. previous studies have not analyzed their geographical distribution in detail.

private computing clusters are owned by for-profit companies, such as meta, hp, and many smaller companies. these clusters consist of interconnected computers with gpus deployed in data centers. a private cluster can be used for ai development at that company or rented out to other companies. public cloud computing providers are also for-profit companies. they are called "public" not because of any connection to the government, but because their services are provided on demand and shared by multiple customers (i.e., similar to the meaning of "public" in a pub, not "public" in the public sector). leaders in the public cloud computing market include aws, microsoft azure, and google cloud; chinese public cloud providers alibaba and tencent also provide large-scale ai computing. these large providers are often called "hyperscale computing providers."

the study focuses on the geographic distribution of public cloud ai computing. private computing clusters have been used to train some iconic models, such as meta's llama and llama 2. but the training and development of a large number of cutting-edge ai models are concentrated in the public cloud's hyperscale providers google, microsoft, and amazon, and their "computing partnerships" with leading ai companies, such as anthropic, cohere, google deepmind, hugging face, openai, and stability ai. the public cloud is also important because it is open to many different types of developers, including academic researchers. therefore, our main research question is: what is the geographical distribution of public cloud ai computing around the world? we will also explore the potential reasons for these geographic distributions, discuss their implications for computing governance and geopolitics, and finally briefly discuss private clusters and government-owned national ai computing.

the study’s census covers the six largest hyperscale public cloud providers: aws, microsoft, google, alibaba, huawei, and tencent. while there are some smaller providers, these six account for the majority of the global public cloud market and are also leading in regional markets. at the time of the census, the most powerful gpu for training common ai models was nvidia’s h100, launched in 2023, following the previous flagship model a100, launched in 2020, and the v100, launched even earlier in 2017. in 2023, nvidia introduced the h800 and a800 to circumvent u.s. export restrictions on china, but those restrictions were soon extended to these new models. the data collection focuses on these five gpu types that are most relevant to ai.

from the census database, the study constructed a national-level dataset to allow for geographic analysis. for each country, we counted the total number of public cloud regions within its territory, the subset of regions that support at least one type of gpu ("gpu-enabled regions"), and the subset of regions that support a specific gpu type.
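the counting step described above can be sketched in a few lines of pandas; the column names and sample rows below are illustrative placeholders, not the study's actual census data:

```python
# minimal sketch of the country-level aggregation described above;
# the census rows and column names are hypothetical, not the study's data.
import pandas as pd

# one row per (country, cloud region), listing the gpu types that region offers
census = pd.DataFrame([
    {"country": "united states", "region": "us-east-1", "gpus": {"v100", "a100", "h100"}},
    {"country": "united states", "region": "us-west-2", "gpus": {"v100", "a100"}},
    {"country": "brazil",        "region": "sa-east-1", "gpus": {"v100"}},
    {"country": "ireland",       "region": "eu-west-1", "gpus": set()},
])

summary = census.groupby("country").agg(
    total_regions=("region", "count"),                              # all regions
    gpu_regions=("gpus", lambda s: sum(len(g) > 0 for g in s)),     # "gpu-enabled"
    a100_regions=("gpus", lambda s: sum("a100" in g for g in s)),   # specific type
)
print(summary)
```

the same aggregation extends naturally to one column per gpu type (v100, a100, h100, a800, h800), which is how the per-type counts discussed later could be tabulated.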

to supplement the cloud census data, the study conducted qualitative and semi-structured expert interviews. in total, we interviewed 10 informants, representing two policy experts, three hyperscale public cloud provider experts, and five research experts with expertise in ai computing. these informants were recruited using snowball sampling through our own professional networks. the main goals of these interviews were to improve and validate the census methodology, generate complementary or alternative information on the geographic distribution of public cloud ai computing, and help explain the observed geographic patterns.

where are ai gpus?

figure 1 shows the approximate locations of the public cloud regions found in the census. table 4 shows how many cloud regions there are in each country and how many of those regions offer gpu instances. from a computing governance perspective, one of the most important features in the data is that the vast majority of countries in the world have no public cloud regions at all. of the 39 countries that have one or more cloud regions, 30 have cloud regions that support gpus.

another notable feature is that even within those countries that have gpu-enabled cloud regions, the geographic distribution is highly polarized: china and the united states combined have nearly as many regions (49) as the rest of the world combined (52). of the two, china has slightly more gpu-enabled regions (27) than the united states (22).

further analysis can be done by looking at the gpu instance types offered in each country. the most obvious pattern is that the united states has the world's newest and most powerful gpus not only in terms of the proportion of different instance types available, but also in absolute numbers. the united states is the only country that has more regions offering 2020 nvidia a100 gpus than 2017 v100 gpus. the united states also has multiple regions offering 2023 nvidia h100 gpus. china's cloud regions are mainly based on v100, with a few regions offering a100 instances. china has no regions offering h100. the rest of the world has only 15 countries offering a100, only one country offering h100, and the rest of the regions are purely based on v100.

this analysis does not take into account custom accelerator chips (such as tpus) or differences in the number of gpus available in different regions. interview informants noted that the number of gpus of the same type available in different regions can vary significantly between regions and providers. one informant noted: "hyperscale cloud service providers almost give the impression of being omnipotent in terms of computing or storage, and seem to be able to handle any problem you bring. but this is not entirely realistic." in some cases, the number of gpus available in a region may be very limited, resulting in only a limited number of customers being able to run gpu instances in that region, or only being able to train smaller-scale models in a reasonable amount of time.

aws and microsoft are currently thought to have the largest cloud gpu clusters, but "it definitely varies between regions in this regard." however, the number of gpus and their distribution across provider regions is considered highly confidential by hyperscale cloud providers. none of our informants were willing or able to provide specific figures, nor could they point to any public source for this information. it is generally believed, though, that us regions likely have far more gpus than comparable regions elsewhere in the world. chinese regions may also stock more v100 chips to compensate for their relatively lower performance. our interviews suggest that even if per-region gpu counts could be included in this analysis, they would probably not challenge the main patterns described above, but would more likely reinforce them.

why focus on the united states?

what’s behind the u.s. lead over china and other countries in advanced public cloud ai computing? one obvious explanation is u.s. government export controls, which prohibit the export of a100 and h100 chips to china. cloud providers in china were able to import some a100 chips before the export controls took effect in 2023, but the h100 has been subject to export controls since the product was released. similarly, the h800 and a800 chips came under export controls shortly after they were launched. the v100, which is far less powerful than these chips, is the most common nvidia gpu instance type in china because it is not subject to export controls.

however, export controls cannot explain why other countries besides china have also deployed mainly older gpus. several explanations are possible. a simple explanation is the friction of innovation diffusion, which refers to the process by which gpus spread across the market. newer gpus may have been installed first in the united states, where nvidia is headquartered and therefore has the strongest distribution network. over time, advanced gpus should have spread to relatively distant markets. "i assume that almost all gpus initially went to north america, but now there should be a sizable cluster in europe as well," speculated one informant.

another potential explanation for the u.s.’s lead in cloud computing comes from geographic differences in the initial demand structure, which, combined with economies of scale, creates a “path dependency” that maintains the concentration of ai computing in certain geographic areas. one informant explained: “very few cloud buyers are really doing groundbreaking ai development… so there’s no point distributing capacity all over the place… you need a few super clusters to create a critical mass of computing capacity in certain locations, but there’s no point replicating that capacity everywhere.”

the earliest companies and researchers to concentrate on large-scale ai model training appeared in the united states, so cloud providers concentrated the most powerful training computing power there. but even if the demand for computing is increasing elsewhere in the world, this does not necessarily translate into a corresponding increase in local computing infrastructure, because developers can usually send training tasks to cloud regions in the united states without experiencing significant performance losses. as a result, the united states' initial computing lead has continued.

the informant believes that the situation is different for computing power used to deploy ai. in many ai use cases, such as voice assistants, the user experience may suffer from latency if the distance between the user and the server is too great. data transmission costs may also become a business concern. such applications are therefore best deployed on computing infrastructure closer to the user. this also explains why v100 chips, which are too slow for cutting-edge training but still suitable for inference tasks, are distributed more evenly around the world than more advanced chips.

however, there are some exceptions that don't fit the general pattern of the united states having the most advanced gpus. japan, the united kingdom, and france each have the same number of a100-enabled regions as v100-enabled regions. these countries all have significant local ai development activity. there may also be regulatory or political barriers that prevent local developers from sending data to the u.s. for training. one informant noted: “right now, there are public sector or important european players that need to train gpt-4-level models with data that cannot leave europe… i would be surprised if the hyperscale cloud providers did not respond to this need.”

in this context, informants mentioned policy discussions about "digital sovereignty," "data sovereignty," and "computing sovereignty," which could increase demand for local training compute. the netherlands and ireland also have small but relatively advanced gpu lineups, which may reflect these countries' strategic position as infrastructure hubs for some hyperscale cloud providers. notably, the netherlands is the only country outside the united states with a cloud region offering h100 gpus.

global distribution of private and government computing

this study focuses on public cloud computing, an important but not the only source of compute. in public cloud computing, our data collection focuses on nvidia gpus and the six leading hyperscale cloud service providers.

will the relative positions of different types of large-scale compute providers change, challenging the currently observed geography of compute? gpu clusters are expensive capital goods that require high utilization to achieve a reasonable return on investment, which explains why large-scale clusters are primarily built as shared infrastructure, either government-owned (such as scientific supercomputing) or, in recent years, privately owned (such as public clouds). government-owned compute appears to be making a small-scale return around the world in the form of “national ai compute” initiatives. for example, the national ai research resource (nairr) task force in the united states aims to create public computing infrastructure to “democratize ai research”. however, in many cases, the scale of government investment does not seem sufficient to truly challenge the dominance of hyperscale cloud providers. many recent government efforts have also been conducted in collaboration with these hyperscale providers, so in practice these projects rely on private infrastructure.

the new lumi supercomputer of the eurohpc joint undertaking provides a counterexample. located in kajaani, finland, lumi was built in collaboration between eu member governments and consists of a cluster of 11,912 gpus designed by nvidia’s competitor amd. its scale could make it a serious alternative to private “public” cloud infrastructure for ai development. given its location in the eu, it does not challenge the north-south computing divide shown in figure 2, but it may help break the bipolar image of the united states and china as the only ai superpowers.

new private computing clusters are also growing. google's tpus may account for a significant proportion of ai computing, and aws and microsoft both plan to produce their own chips. meta has announced a massive investment in private computing power: ceo mark zuckerberg claimed it would acquire 340,000 nvidia h100s and a100s. in 2023, microsoft claimed to have spent hundreds of millions of dollars on the clusters that power openai's chatgpt chatbot. large technology companies may be able to achieve high utilization of large clusters through internal and partner demand alone. but clusters initially deployed as private may be converted into shared cloud infrastructure once internal demand declines. this blurs the distinction between private and "public" (in the pub sense) cloud computing capacity.

an ai computing gap

governing ai through computing is a powerful idea because computing consists of large, observable physical infrastructures. these infrastructures must be physically located somewhere and are therefore susceptible to territorial jurisdiction, which is the most enforceable form of governance for all countries, large and small. however, research shows that computing infrastructures are not evenly distributed around the world, and their geographic distribution largely determines the likelihood that different countries will use computing as an intervention point for ai.

the research recreates the familiar view of the two ai superpowers locked in a computing “arms race,” with the united states holding an advantage in chip quality while china tries to make up the gap through quantity. u.s. export restrictions on advanced gpus appear to have had an effect, as no public cloud provider offers the 2023 h100 chips in china, nor the h800 or a800 that were developed to circumvent these restrictions. similarly, russia and iran, two countries subject to western sanctions, do not have any public cloud ai computing facilities in our sample.

however, beyond the perspective of geopolitical great-power competition, the study also proposed other conceptual categories for computing-based ai governance. in addition to the united states and china, another 15 countries have the gpus that matter most for ai development, namely the a100 and h100. these first-tier countries, with the exception of india, are all located in the so-called "global north"; by analogy, they are called the "compute north". compute north countries can use their territorial jurisdiction to intervene in ai development, especially when models are sent to their local public cloud regions for training. for example, they can require algorithms and datasets to be audited and certified to comply with local rules before training begins, thereby influencing the types of ai systems that enter the global market.

the second tier includes 13 countries whose computing power is more suitable for deploying ai systems than for developing them. with the exception of switzerland, these countries are all located in the global south, so they are called the "compute south." for example, there are five gpu-enabled cloud regions in latin america, but none of them offer gpus more powerful than the v100 released in 2017. these countries can use their territorial jurisdiction over computing to gate which ai systems may be deployed locally, but they have less influence over how ai systems are developed.

in addition to the “compute north” and “compute south,” there is a “compute desert”: all the countries in the world that have no public cloud ai computing at all, whether for training or deployment. for these countries, moving to cloud-based ai services means relying on infrastructure developed and deployed in foreign jurisdictions. compute deserts include some wealthy countries, but also all lower-middle-income and low-income countries as classified by the international monetary fund (imf). the impact of being a compute desert may vary with a country's wealth. rich desert countries may be able to use other advantages, such as diplomatic influence over compute north countries and the wealth to build government-owned computing capacity, to offset their lack of local public cloud ai computing; poor compute desert countries have little prospect of influencing ai through computing governance.
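the three-tier taxonomy above reduces to a simple rule over the gpu types a country's cloud regions offer. a minimal sketch, where the rule follows the text (a100/h100 present → "compute north", only older gpus → "compute south", none → "compute desert") and the example inputs are illustrative:

```python
# sketch of the three-tier classification described above; the thresholds
# follow the article's definitions, and the example inputs are hypothetical.
TRAINING_GPUS = {"a100", "h100"}  # the types "that matter most for ai development"

def classify(gpu_types: set[str]) -> str:
    """map the set of gpu types available in a country to its compute tier."""
    if gpu_types & TRAINING_GPUS:
        return "compute north"   # training-grade gpus present
    if gpu_types:
        return "compute south"   # older gpus only, suitable mainly for inference
    return "compute desert"      # no public cloud ai gpus at all

print(classify({"v100", "a100"}))  # compute north
print(classify({"v100"}))          # compute south
print(classify(set()))             # compute desert
```

applied to the per-country gpu-type counts from the census, this rule reproduces the 17 / 13 / rest-of-world split the article describes.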

similar to the "computing divide" researchers have observed between academia and industry, the geographic distribution of public cloud ai computing appears to reproduce familiar patterns of global inequality. starting in the mid-1990s, discussions of digitalization proposed that successful entry into the new global “knowledge economy” would rest on immaterial assets such as knowledge and creativity, rather than on the material assets and resources of the industrial era. this implied that developing countries could skip expensive infrastructure investments and move directly into a knowledge-based economy. today’s discussions about ai, however, have re-emphasized the critical role of physical infrastructure such as chip factories, data centers, and power grids for national competitiveness. if computing becomes a key governance node, these physical infrastructures may also prove essential to maintaining independent regulatory power (lehdonvirta 2023). a country’s computing power is thus, in some ways, equivalent to its political power.

will this change? if the concentration of high-end ai computing in the united states and the "compute north" is simply due to friction in the diffusion of innovation, then over time the world may gradually fill up with computing power, narrowing the gap. nvidia's competitors, such as amd and intel, are catching up in chip performance. chinese vendors are also developing ai chips; us export controls have created huge domestic demand for them, and with government support the gap may gradually close.

however, if the observed geographic patterns are explained more by path dependencies resulting from first-mover advantages and economies of scale, then geographic concentration, regional specialization, and the international division of labor may become enduring features of computational production, as they are in many other industries.

final thoughts

who has the most gpus? the answer has arguably been clear for some time, but behind the question lies the uneven distribution of computing power itself. how to redress this imbalance and let more people in compute deserts enjoy the benefits of ai is a problem unlikely to be solved any time soon.