Apple confirms: the models behind Apple Intelligence were trained on Google's custom chips

2024-07-30

Author: Li Dan

Source: Hard AI

Public documents show that Apple's in-house artificial intelligence (AI) system, Apple Intelligence, was built with the help of Google's custom chips.

On Monday, July 29, Eastern Time, Apple published a technical paper on its official website detailing the foundation language models that power its personal intelligence system, Apple Intelligence. These include an on-device "Apple Foundation Model" (AFM) with roughly 3 billion parameters, designed to run efficiently on devices, and a larger server language model, Server AFM, built for Apple's cloud AI architecture, "Private Cloud Compute."

In the paper, Apple describes the on-device AFM and the server AFM as members of a family of generative models it developed to support users and developers. Apple also disclosed that the models were trained on Google's fourth-generation AI ASIC, the TPUv4, and newer-generation TPUv5 chips. The paper states:

"We trained the server AFM from scratch on 8192 TPUv4 chips, using a sequence length of 4096 and a batch size of 4096 sequences, for 6.3 trillion tokens."

"The on-device AFM is trained on 2048 TPUv5p chips."
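Taken at face value, the quoted figures for the server AFM imply a concrete training scale. A minimal back-of-the-envelope sketch (assuming "batch size of 4096 sequences" means 4096 full-length sequences per optimizer step, which the paper's phrasing suggests but does not state outright):

```python
# Back-of-the-envelope check of the server AFM training figures quoted above.
# Assumption: each optimizer step processes 4096 sequences of 4096 tokens each.

seq_len = 4096           # tokens per sequence
batch_size = 4096        # sequences per optimizer step
total_tokens = 6.3e12    # 6.3 trillion training tokens
num_chips = 8192         # TPUv4 chips

tokens_per_step = seq_len * batch_size        # 16,777,216 (~16.8M tokens per step)
num_steps = total_tokens / tokens_per_step    # roughly 375,000 optimizer steps
sequences_per_chip = batch_size / num_chips   # 0.5 sequences per chip per step

print(f"tokens per step: {tokens_per_step:,}")
print(f"approx. optimizer steps: {num_steps:,.0f}")
print(f"sequences per chip per step: {sequences_per_chip}")
```

Under these assumptions, each step consumes about 16.8 million tokens, so the 6.3-trillion-token run works out to a few hundred thousand optimizer steps spread across the 8192-chip cluster.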

In the 47-page paper, Apple never mentions Google or Nvidia by name, stating only that its AFMs were trained on "Cloud TPU clusters." This means Apple rented servers from a cloud provider to run the computation.

In fact, as early as Apple's Worldwide Developers Conference (WWDC) in June, media outlets had noticed in the technical documents Apple released that Google had emerged as another winner of Apple's AI push: when building its foundation models, Apple's engineers used the company's own framework software together with a variety of hardware, including tensor processing units (TPUs) available only on Google Cloud. Apple did not, however, disclose how heavily it relies on Google's chips and software relative to other AI hardware suppliers such as Nvidia.

A comment on the social platform X on Monday noted that news of Apple using Google chips had already surfaced in June, and that the paper now provides more detail about the training stack.


Some commenters said Apple has nothing against Nvidia; the TPU is simply faster. Others added that if the TPU is faster, using it makes sense for Apple, and it may also be cheaper than Nvidia's chips.


Media commentary on Monday observed that Google's TPU was originally created for internal workloads and is now seeing wider use. Apple's decision to train its models on Google chips suggests that, for AI training, some technology giants are looking for, and may have found, alternatives to Nvidia's AI chips.

The Wall Street Journal noted that last week, Meta CEO Zuckerberg and Alphabet and Google CEO Pichai both hinted in remarks that their companies and other technology firms "may have invested too much in AI" infrastructure, while admitting that the business risk of not doing so would be too high.

Zuckerberg said:

“The consequence of falling behind is that you will be at a disadvantage in the most important technologies for the next 10 to 15 years.”

Pichai said:

"AI is expensive, but the risk of underinvestment is greater." Google may have overinvested in AI infrastructure, mainly by buying GPUs from Nvidia, but even if the AI boom slows, the data centers and chips the company has purchased can be put to other uses. "To us, the risk of underinvestment is far greater than the risk of overinvestment."