news

Open source = the most powerful model! Llama 3.1 released, 405B surpasses closed source GPT-4o, Zuckerberg: watershed moment

2024-07-24


Baijiao from Aofei Temple

Quantum Bit | Public Account QbitAI

Llama 3.1 is officially released, ascending to the throne of large models!

Across more than 150 benchmark test sets, the 405B version matches or even exceeds the existing SOTA models GPT-4o and Claude 3.5 Sonnet.

In other words, this time the strongest open-source model is also the strongest model outright.

Llama 3.1 had already leaked several times before this; now, after much anticipation, it has finally arrived.

Starting today, the model can be downloaded and used on the official website, and the Meta AI application can be tried online.

What is even more appreciated by the research community is the release of a nearly 100-page detailed paper covering everything that went into creating Llama 3.1: pre-training data, filtering, annealing, synthetic data, scaling laws, infrastructure, parallelism, training recipes, post-training adaptation, tool usage, benchmarks, inference strategies, quantization, vision, speech, video, etc.

The chief scientist of HuggingFace exclaimed: If you are studying large models from scratch, start with this paper.

In his latest interview with Bloomberg, Zuckerberg specifically mocked OpenAI:

Altman’s leadership is commendable, but it’s a bit ironic that a company called OpenAI has become a leader in building closed AI models.

Zuckerberg also wrote a long article specifically for the occasion: Open source AI is the way forward.

In the past, open-source models mostly lagged behind closed-source models in performance and functionality, but now:

Just as open-source Linux stood out from a crowd of closed-source systems and gained popularity, gradually becoming more advanced and more secure, and building a broader ecosystem than any closed-source system.

I believe Llama 3.1 will be a turning point for the industry.

To date, the total downloads of all Llama versions have exceeded 300 million times, and Meta also made a bold statement:

This is just the beginning.

Major cloud vendors also launched support for Llama 3.1 at the earliest opportunity, each with its own pricing.

Llama 3.1 is officially released

First, let’s look at model capabilities.

Llama 3.1 extends the context length to 128K and adds support across eight languages.

Among them, the flagship 405B version has caught up with or surpassed existing top models in general knowledge, steerability, mathematics, tool use, and multilingual translation.

In addition, upgraded 8B and 70B models have also been released, with capabilities roughly on par with top models of comparable size.

Next, let's look at the model architecture.

According to the official introduction, training the Llama 3.1 405B model on more than 15 trillion tokens was quite challenging.

To this end, Meta significantly optimized the entire training stack and, for the first time, scaled training to more than 16,000 H100 GPUs.

Specifically, Llama 3.1 uses a standard decoder-only Transformer architecture with some minor changes, and adopts an iterative post-training procedure in which each round applies SFT (supervised fine-tuning) and DPO (direct preference optimization) to improve each capability.
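For intuition, the core of DPO fits in a few lines: it pushes the policy to prefer the chosen response over the rejected one, relative to a frozen reference model. The sketch below is a minimal PyTorch illustration of that objective, not Meta's training code; the log-probability inputs are assumed to be computed elsewhere.

```python
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct Preference Optimization loss.

    Each argument is a (batch,) tensor of summed per-sequence log-probabilities
    assigned by the policy or the frozen reference model to the chosen / rejected
    response for the same prompt.
    """
    # Implicit rewards: how far the policy moves away from the reference on each response
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the margin between chosen and rejected responses
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```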

Compared to previous versions of Llama, they improved the amount and quality of data used for pre-training and post-training.

To support large-scale production inference for a model of this size, Meta quantized the model from 16-bit (BF16) to 8-bit (FP8), effectively reducing the compute required and allowing the model to run within a single server node.
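As a rough illustration of what a BF16-to-FP8 cast involves, here is a minimal per-tensor scaling sketch in PyTorch (it requires a version that exposes torch.float8_e4m3fn). Meta's production scheme uses its own scaling granularity and kernels, so treat this only as a conceptual example.

```python
import torch

def quantize_to_fp8(weight_bf16: torch.Tensor):
    """Per-tensor scaled cast from BF16 to FP8 (E4M3).

    Returns the FP8 tensor and the scale needed to dequantize it; real
    deployments typically use finer-grained (e.g. per-row) scales.
    """
    fp8_max = torch.finfo(torch.float8_e4m3fn).max        # about 448 for E4M3
    scale = weight_bf16.abs().max().float() / fp8_max     # map the largest weight to fp8_max
    weight_fp8 = (weight_bf16.float() / scale).to(torch.float8_e4m3fn)
    return weight_fp8, scale

def dequantize(weight_fp8: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    # Multiply back by the scale to recover an approximate BF16 weight
    return (weight_fp8.float() * scale).to(torch.bfloat16)
```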

For instruction fine-tuning, Meta also improved the model's responsiveness to user instructions and its ability to follow detailed instructions, while ensuring safety.

In the post-training stage, Meta performs multiple rounds of alignment based on the pre-trained model.

Each round includes Supervised Fine-Tuning (SFT), Rejection Sampling (RS), and Direct Preference Optimization (DPO).

They used synthetic data to generate most of the SFT examples and iterated several times.

In addition, several data processing techniques are employed to filter this synthetic data down to the highest quality.
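Conceptually, the rejection-sampling step mentioned above is best-of-N selection: sample several candidate responses and keep only the one a reward model scores highest. A minimal sketch, with hypothetical generate and score callables standing in for the policy model and the reward or quality model:

```python
def rejection_sample(prompt, generate, score, n_samples=8):
    """Best-of-N rejection sampling: keep only the highest-scoring response.

    `generate(prompt)` returns one candidate response (e.g. from the current policy),
    and `score(prompt, response)` returns a quality score; both are placeholders here.
    """
    candidates = [generate(prompt) for _ in range(n_samples)]
    return max(candidates, key=lambda response: score(prompt, response))
```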

A total of 15T tokens were cleaned and filtered using Llama 2 models, while the code- and mathematics-related data processing pipelines mainly borrowed from DeepSeek's methods.

Beyond basic responses to prompts, Meta says any ordinary developer can use it for more advanced workflows, such as:

Real-time and batch inference

Supervised fine-tuning

Evaluate models for specific applications

Continual pre-training

Retrieval-Augmented Generation (RAG)

Function calling

Synthetic Data Generation

All of this is also backed by its powerful ecosystem partners.
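As a concrete starting point for the inference item in the list above, here is a minimal local-inference sketch using the Hugging Face transformers pipeline. The model id is the assumed Hub name for the 8B instruct variant (the repository is gated, so access must be requested first), and device_map="auto" additionally requires the accelerate package.

```python
from transformers import pipeline

# Assumed Hub id for the 8B instruct model; request access to the gated repo first.
generator = pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",
    device_map="auto",  # needs the `accelerate` package
)

prompt = "Explain retrieval-augmented generation in one paragraph."
print(generator(prompt, max_new_tokens=128)[0]["generated_text"])
```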

Zuckerberg writes a long article: Open source AI is the way forward

(The following was translated by a large language model and extracts the main content; corrections for any omissions or errors are welcome.)

In the early days of high-performance computing, the big tech companies of the time invested heavily in developing their own closed-source versions of Unix. At the time, it was hard to imagine that there was any other way to produce such advanced software besides closed source. However, the open source Linux operating system eventually won wide popularity - initially because it allowed developers to modify the code freely and at a lower cost; over time, Linux not only became more advanced and secure, but also built an ecosystem that was broader than any closed-source Unix system and supported more features. Today, Linux has become the industry-standard foundation for both cloud computing and the operating systems that run most mobile devices, and we all enjoy better products as a result.

I believe AI will develop in a similar way. Today, several tech companies are developing leading closed-source models, but open source is closing the gap quickly. Last year, Llama 2 was only comparable to models that were one generation behind. This year, Llama 3 is competitive with state-of-the-art models and ahead in some areas. Starting next year, we expect future Llama models to be the most advanced in the industry. But even before then, Llama is already leading in openness, modifiability, and cost-efficiency.

Today, we are taking the next step toward open-source AI becoming the industry standard: we released Llama 3.1 405B, the first open-source AI model at the frontier level, along with improved Llama 3.1 70B and 8B models. In addition to having a significantly better cost/performance ratio compared to closed-source models, the openness of the 405B model will make it an excellent choice for fine-tuning and distilling smaller models.

In addition to releasing these models, we’re working with a range of companies to grow the broader ecosystem. Amazon, Databricks, and NVIDIA are launching a full suite of services that enable developers to fine-tune and distill their own models. Innovators like Groq have built low-latency, low-cost inference services for all new models. These models will be available on all major cloud platforms, including AWS, Azure, Google, Oracle, and others. Companies like Scale.AI, Dell, Deloitte, and others are ready to help enterprises adopt Llama and train custom models with their own data. As the community grows and more companies develop new services, we can together make Llama the industry standard and bring the benefits of AI to everyone.

Meta is committed to open source AI. I will outline why I believe open source is the best development stack, why open source Llama is good for Meta, and why open source AI is good for the world and is therefore a long-term sustainable platform.

Why open source AI is good for developers

When I talk to developers, CEOs, and executives around the world, I typically hear a few themes:

We need to train, fine-tune, and distill our own models. Every organization has unique needs and is best served by models of different sizes that can be trained or fine-tuned on its specific data. For on-device and classification tasks, small models are sufficient, while more complex tasks require larger models. Now, you can take state-of-the-art Llama models, continue training them on your own data, and then distill them down to the model size that best suits your needs, without letting us or anyone else see your data.

We need to control our own destiny and not be locked into closed-source vendors. Many organizations don't want to rely on models they can't run and control themselves. They don't want closed-source model providers to be able to change the model, modify the terms of use, or even stop the service entirely. They also don't want to be locked into exclusive use of a model on a single cloud platform. Open source allows a broad ecosystem of companies with compatible toolchains, so you can easily migrate between them.

We need to keep our data safe. Many organizations handle sensitive data that needs to be protected and cannot be sent through a closed-source model's cloud API. Others simply don't trust closed-source model providers with their data. Open source solves these problems by letting you run models wherever you want, and open-source software is generally regarded as more secure because its development process is more transparent.

We need a model that is efficient and affordable to run. Developers can run Llama 3.1 405B inference on their own infrastructure, for both user-facing and offline inference tasks, at roughly half the cost of closed-source models such as GPT-4o.

We want to invest in an ecosystem that will become the long-term standard. Many see that open source is evolving faster than closed-source models, and they want to build their systems on the architecture that will give them the greatest long-term advantage.

Why open source AI is good for Meta

Meta’s business model is to create the best experiences and services for people. To do this, we must ensure that we always have access to the best technology and are not locked into a competitor’s closed-source ecosystem, limiting our ability to innovate.

One of the key lessons I learned was that our services were constrained by Apple's restrictions on what we could build on its platform. From the way they tax developers, to the rules they arbitrarily apply, to all the product innovation they prevent from being released, it's clear that Meta and many other companies would be able to provide better services to people if we could build the best version of our products and competitors couldn't limit our innovation. Philosophically, this is the main reason why I firmly believe in building open ecosystems for the next generation of computing in AI and AR/VR.

People often ask me if I’m worried about giving up technical advantages by open sourcing Llama, but I think that misses the big picture for several reasons:

First, to ensure we have access to the best technology and are not locked into a closed source ecosystem in the long term, Llama needs to grow into a full ecosystem of tools, including efficiency improvements, silicon optimizations, and other integrations. If we are the only company using Llama, this ecosystem will not grow and we will not perform better than closed source versions of Unix.

Second, I expect AI development to continue to be very competitive, which means that open sourcing any particular model will not give a significant advantage over the next best model at the time. Llama’s path to becoming an industry standard is through continued competitiveness, efficiency, and openness, generation after generation.

Third, a key difference between Meta and closed source model providers is that selling access to AI models is not our business model. This means that publicly releasing Llama does not undermine our revenue, sustainability, or ability to invest in research, which is not the case for closed source providers.

Finally, Meta has a long history of open source projects and success. We have saved billions of dollars through the Open Compute Project by publishing server, network, and data center designs and having the supply chain standardize on our designs. We benefit from the innovation of the ecosystem by open sourcing leading tools such as PyTorch, React, and others. This approach has always worked for us over the long term.

Why open source AI is good for the world

I believe open source is essential to achieving a positive AI future. AI has more potential than any other modern technology to enhance human productivity, creativity, and quality of life—and accelerate economic growth while driving advances in medicine and scientific research. Open source will ensure that more people around the world have access to the benefits and opportunities of AI, that power is not concentrated in the hands of a few companies, and that the technology can be deployed more evenly and safely across society.

There is an ongoing debate about the security of open source AI models, and my view is that open source AI will be safer than the alternatives.

I understand the safety framework to be that we need to guard against two types of harm: unintentional and intentional. Unintentional harm is when an AI system could cause harm, even if the people running it don’t intend to do so. For example, modern AI models could inadvertently give bad health recommendations. Or, in more futuristic scenarios, some worry that models could inadvertently replicate themselves or over-optimize for goals that harm humans. Intentional harm is when a bad actor uses an AI model with the intent to cause harm.

It’s worth noting that unintentional harm encompasses most of the concerns people have about AI—from what impact AI systems will have on the billions of people who use them to most of the science fiction scenarios that are truly catastrophic for humanity. In this regard, open source should be safer because the systems are more transparent and can be widely reviewed. Historically, open source software has been safer for this reason. Similarly, using Llama and its security systems like Llama Guard will likely be safer and more reliable than a closed-source model. As a result, most conversations about open source AI safety focus on intentional harm.

Our security process includes rigorous testing and red teaming to assess whether our models have the ability to cause significant harm, with the goal of mitigating risk before release. Since the models are open, anyone can test them for themselves. We must remember that these models are trained on information that is already on the web, so when considering harm, the starting point should be whether the model can facilitate more harm than information that can be quickly retrieved from Google or other search results.

As you think about future opportunities, remember that most of today’s leading technology companies and scientific research are built on open source software. If we invest together, the next generation of companies and research will use open source AI.

Most importantly, open source AI represents the world’s best chance to harness this technology to maximize economic opportunity and security for everyone.

Let's build together

With past Llama models, Meta developed and released them on its own without paying much attention to building the broader ecosystem. We're taking a different approach with this launch: we're building teams internally to make Llama available to as many developers and partners as possible, and we're actively building partnerships so that more companies in the ecosystem can offer unique capabilities to their customers.

I believe the release of Llama 3.1 will be a turning point for the industry: most developers will begin to primarily use open source, and I expect this approach to only grow from here. I hope you will join us on our journey to bring the benefits of AI to everyone in the world.

Interview link:

https://x.com/rowancheung/status/1815763595197616155

Reference Links:

[1]https://about.fb.com/news/2024/07/open-source-ai-is-the-path-forward/

[2]https://ai.meta.com/blog/meta-llama-3-1/