
Meta releases the most powerful open source model Llama 3.1, Zuckerberg: It will be a turning point for the industry

2024-07-24


On the evening of July 23, Beijing time, Meta officially released its latest open source large model series, Llama 3.1, further narrowing the gap between open source and closed source models. Llama 3.1 comes in three parameter sizes: 8B, 70B, and 405B. The 405B model surpassed OpenAI's GPT-4o on multiple benchmarks and was comparable to leading closed source models such as Claude 3.5 Sonnet.


At the same time, Meta founder and CEO Mark Zuckerberg published a post on the company's official website to promote the release. He said that Llama 3.1 will be a turning point for the industry, after which most developers will begin to use primarily open source models, and that open source AI is the direction of future development.

Jim Fan, senior research scientist at Nvidia, congratulated the Meta team in a post on X, saying, “The power of GPT-4 is in our hands. This is a truly historic moment.”

In terms of specifics, the context window of all three Llama 3.1 models has grown from 8K to 128K tokens, a 16-fold increase, and the models support 8 languages. The Llama 3.1 405B model was trained on more than 15 trillion tokens, and to reach this training scale the team used 16,000 H100 GPUs. Meta said that the 405B model is the first Llama model trained at this scale.

Meta wrote that open source large language models have until now mostly lagged behind closed source models in capability and performance, "but now, we are entering a new era led by open source."

In its official blog post, Meta reported evaluating performance on more than 150 benchmark datasets and compared Llama 3.1 with other models. The flagship Llama 3.1 405B is comparable to GPT-4, GPT-4o, and Claude 3.5 Sonnet across a range of tasks such as general knowledge, steerability, and mathematics. In addition, the smaller 8B and 70B models are competitive with closed source and open source models of similar parameter counts.


In human evaluations on real-world scenarios, Llama 3.1 405B outperformed GPT-4o and Claude 3.5 Sonnet overall.


Meta also updated its open source license, allowing developers for the first time to use the output of Llama models (including 405B) to improve other models. To keep pace with GPT-4o, Meta said it is also integrating image, video, and speech capabilities into Llama 3, so that the model can recognize images and videos and support voice interaction. However, these capabilities are still under development and are not yet ready for release.

In its official blog post, Meta said that total downloads of all Llama versions to date have exceeded 300 million.

Alongside the model release, Zuckerberg also published a long essay on the official website titled "Open Source AI Is the Path Forward", in which he stressed the importance of open source. He argues that open source is good for developers, for Meta, and for the world.


Zuckerberg cited the victory of the open source Linux over closed source Unix as an example and argued that artificial intelligence will develop in a similar way. "Several technology companies are developing leading closed models, but open source is quickly narrowing the gap." He noted that last year, Llama 2 was only comparable to an older generation of models, whereas this year Llama 3 is competitive with the most advanced models and even ahead of them in some areas.

Zuckerberg believes that open source can promote innovation, reduce costs, and improve security. Developers can use open source models to train, fine-tune, and distill models of their own. Each organization has different needs, which are best met by models of different sizes that are trained or fine-tuned on that organization's specific data.

At the same time, developers can avoid being locked into closed vendors and protect data security. "Open source software is often safer because its development is more transparent and can be widely reviewed," Zuckerberg said.

Zuckerberg also said that open source models are cheaper and more efficient. Developers can run inference with Llama 3.1 405B on their own infrastructure at roughly 50% of the cost of using closed models such as GPT-4o, for both user-facing and offline inference tasks.

"Open source AI represents the world's best opportunity." In Zuckerberg's view, using this technology can create the greatest economic opportunities and security.