news

Llama 3.1, "European OpenAI" releases new open source model Large 2 | Frontline

2024-07-26

한어Русский языкEnglishFrançaisIndonesianSanskrit日本語DeutschPortuguêsΕλληνικάespañolItalianoSuomalainenLatina

Author | Wang Yixin

Editor |

Less than a day after Meta released its latest open source large model Llama 3.1, French AI startup Mistral came to challenge - on July 24, Mistral released its new flagship model Large 2.

Mistral AI is a French AI startup that was founded just over a year ago. It is also the best-funded and most competitive AI player in Europe to date. Its core members come from top AI institutions such as Google DeepMind. For example, Mensch is the author of large model papers such as Chinchilla and proposed core technologies including Scaling Laws.

Just four weeks after its establishment in June 2023, Mistral AI raised 105 million euros with a team of six. The company focuses on the research and development of open source large models and was praised by French President Emmanuel Macron as "a model for the new generation of European startups to compete with American technology giants."

In December 2023, Mistral released an open source large model called Mistral 8x7B, which has 56 billion parameters and is comparable to LLaMA-65B in efficiency and performance, making it an instant success in the large model industry. In addition, the company also benchmarked ChatGPT and launched a multilingual conversation assistant called Le Chat (the official website shows that it is still in the testing phase and requires registration and application for testing qualifications) to demonstrate the company's latest technical capabilities.

Mistral said that Large 2 surpasses Llama 3.1 405B in code generation, math, and reasoning capabilities, using less than one-third of the parameters of Llama 3.1 405B, or 123 billion parameters. It is also more concise than other leading AI models when generating responses, avoiding excessive lengthy descriptions. This means that Large 2 has a greater cost advantage and developers can run it faster locally.

Like Meta's Llama 3.1, Large 2 does not have multimodal capabilities, but it can "win big with a small investment" in terms of accuracy and reliability of dialogue responses. Mistral said that the model's hallucination problem was one of the focuses of Large 2's training process. In addition, Large 2 has also improved in command following and dialogue tasks, handling precise commands, and long, multi-round dialogues.

Large 2 has a 128k context length, which means it can receive about the same number of characters as a 300-page book in a single conversation. In addition, Large 2 supports multiple languages, including English, French, German, Spanish, Italian, Portuguese, Arabic, Hindi, Russian, Chinese, Japanese and Korean, as well as 80 code languages.


Image source: Mistral official website

It should be pointed out that Mistral's model is not an open source model in the traditional sense and commercial use requires payment.

Currently, Large 2 has been installed on the platforms of Google, Amazon, Azure and IBM for users to use. Users can also experience it through "mistral-large-2407" on Mistral's La Plateforme (a comprehensive platform that simplifies AI application development, providing pre-trained models, data processing tools and API interfaces), or test it for free on Le Chat.

Mistral completed its Series B financing in June this year, raising a total of US$640 million at a valuation of US$6 billion. This round of financing was led by General Catalyst, and investors also included Lightspeed Venture Partners, Andreessen Horowitz, Nvidia, Samsung Venture Investment and IBM.

Mistral AI currently has about 60 employees, 45 of whom are in France, 10 in the United States, and 5 in the United Kingdom. According to the Financial Times, about three-quarters of the employees are engaged in product development and research.