
Zuckerberg reveals that he has invested heavily in training Llama 4, with 240,000 GPUs working together! Expected to be released in 2025

2024-08-05



New Intelligence Report

Editor: Peach

[New Intelligence Introduction] Unexpectedly, the multimodal Llama 4 has already begun training. Zuckerberg said Meta will invest ten times the computing power used for Llama 3 to train the model, which is expected to be released in 2025. He has spent heavily on GPUs for fear of falling behind.

Llama 3.1 has only just been released, and Llama 4 is already fully in training.

On the second-quarter earnings call, Zuckerberg said that Meta will use ten times the computing power used for Llama 3 to train the next-generation multimodal Llama 4, which is expected to be released in 2025.


By that measure, Jensen Huang is once again the biggest winner

What does ten times the computing power mean?

Bear in mind that Llama 3 was trained on clusters of 24,000 GPUs. Ten times that would put Llama 4's training at roughly 240,000 GPUs.



So, does Meta have enough inventory?

Remember that Zuckerberg announced at the beginning of the year that he planned to deploy 350,000 NVIDIA H100s by the end of the year.

He also revealed more details: Meta will build two separate clusters for LLM training, one equipped with 22,000 H100s and the other with 24,000.


Some netizens have reviewed how GPU usage increased during the iteration of the Llama model:

Llama 1: 2048 GPUs

Llama 2: 4096 GPUs

Llama 3.1: 16,384 GPUs
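The scaling trend above, together with the "ten times" claim, comes down to simple arithmetic. The sketch below just lays it out; note that the 240,000 figure is the article's extrapolation from the Llama 3 cluster size, not an official Meta number:

```python
# GPU counts reported for each Llama generation (figures from the article)
gpus = {
    "Llama 1": 2_048,
    "Llama 2": 4_096,
    "Llama 3.1": 16_384,
}

# Llama 3 was trained on clusters of about 24,000 GPUs; Zuckerberg's
# "ten times the compute" remark implies a rough Llama 4 estimate.
llama3_cluster = 24_000
llama4_estimate = 10 * llama3_cluster  # -> 240,000

for name, count in gpus.items():
    print(f"{name}: {count:,} GPUs")
print(f"Llama 4 (implied estimate): {llama4_estimate:,} GPUs")
```

Each generation has so far multiplied its GPU count by 2x to 4x; the jump implied for Llama 4 would be the largest yet.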


Zuckerberg may spend $40 billion rather than risk being too late

It can be seen that training large models is a costly business.

The financial report shows that Meta's capital expenditure on servers, data centers and network infrastructure increased by nearly 33% in the second quarter.

It increased from US$6.4 billion in the same period last year to US$8.5 billion.

Annual spending is now expected to reach $37 billion to $40 billion, up from a previous estimate of $35 billion to $40 billion.


A report from The Information said OpenAI spent $3 billion on training its models and another $4 billion renting servers from Microsoft at a discount.

It is plain just how much money training large models burns.

However, the significance lies in the fact that open-sourcing Llama 3.1 could mark an important turning point for the AI industry, with open-source AI becoming an industry standard the way Linux did.

Meta is planning computing clusters and data centers over the next few years to support future generations of AI models.

Zuckerberg admitted that it is difficult to predict the future development path of AI technology, but infrastructure cannot be built overnight.

Despite the uncertainty, he said he would rather take the risk of building capacity ahead of time than fall behind competitors for lack of preparation.


Zuckerberg's foresight has helped the company come out ahead before, during the metaverse wave.

Even when the company's stock price took a heavy blow in 2022, Zuckerberg still took the risk of buying H100s in large quantities.

The third-quarter financial report that year showed Meta's capital expenditure running as high as US$32 billion to US$33 billion.

Most of that money went into building data centers, servers, and network infrastructure, along with huge investments in the metaverse.

In the interview, Zuckerberg explained, "At the time, Meta was vigorously developing the short video tool Reels, so more GPUs were needed to train the model."

Model inference is crucial for Meta because it must serve the users of its own applications, such as Facebook and Instagram.

In Zuckerberg's own words:

The ratio of inference compute to training compute that we need is probably much higher than at other companies working in this field, because the community of users we serve is so large.

An AI agent for every person

Some time ago, Meta AI scientist Thomas Scialom mentioned in an interview that Llama 4 had started training in June.

He said that the new model may focus on agent technology, and that some research has already been done on agent work such as Toolformer.


Zuckerberg believes that AI agents will soon become the "standard configuration" for online companies.

“Over time, I think that just like every business has a website, social media accounts and email addresses, every business will have an AI agent that customers can interact with.”

Meta's goal is to make it easy for every small business, and eventually every large business, to integrate their content and products into AI agents.

When this technology is deployed at scale in real-world applications, it will greatly accelerate Meta's business messaging revenue.


Despite criticism from investors over Meta’s high spending on AI and the Metaverse, Zuckerberg is sticking with his strategy.

While VR has seemed to take a backseat in Meta's recent quarters, Zuckerberg did mention that Quest 3 sales have exceeded the company's expectations.

Second-quarter figures showed Meta's overall revenue grew 22% to $39.1 billion, with profit up 73% to $13.5 billion.

For the third quarter, Meta expects revenue between $38.5 billion and $41 billion.

Sources say the company will announce a cheaper headset at its Connect conference in September.

In addition, the AI assistant Meta AI is growing increasingly popular; Zuckerberg said it is on track to become the most widely used AI assistant by the end of the year.


References:

https://the-decoder.com/meta-plans-to-use-10-times-more-compute-power-to-train-its-next-generation-lama-4-ai-model/

https://www.theverge.com/2024/7/31/24210786/meta-earnings-q2-2024-ai-llama-zuckerberg