
Mistral AI releases two new models: a 7B model dedicated to mathematical reasoning and a code model built on the Mamba2 architecture

2024-07-17


Synced Report

Synced Editorial Department

Netizens are very curious whether Mathstral can answer the question "Which is greater, 9.11 or 9.9?"

Yesterday, the AI community was stumped by a simple question: "Which is bigger, 9.11 or 9.9?" Large language models including OpenAI's GPT-4o and Google's Gemini all failed it.





This shows that large language models still cannot understand certain numerical problems and answer them correctly the way humans can.

For numerical and complex mathematical problems, specialized models are better suited.

Today, French AI unicorn Mistral AI released Mathstral, a 7B model focused on mathematical reasoning and scientific discovery, built to solve advanced math problems that require complex, multi-step logical reasoning.

The model is built on Mistral 7B, supports a 32k context window, and is released under the open-source Apache 2.0 license.

Mathstral was built to strike an excellent balance between performance and speed, a development philosophy that Mistral AI actively promotes, especially with regard to fine-tuning capabilities.



Mathstral is an instruction model that can be used as-is or fine-tuned. The model weights are available on Hugging Face.

  • Model weights: https://huggingface.co/mistralai/mathstral-7B-v0.1
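
As a rough illustration of how the weights might be used, the sketch below loads them with the Hugging Face transformers library and asks the now-famous comparison question. This is a minimal sketch, assuming a recent transformers release and enough GPU memory; the chat-template call and generation settings are illustrative, not Mistral's official usage.

```python
# Minimal sketch: loading Mathstral from Hugging Face and asking a math question.
# Assumes a recent `transformers` release and sufficient GPU memory; the generation
# settings below are illustrative, not Mistral's recommended configuration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/mathstral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Which is greater, 9.11 or 9.9?"}]
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```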

The figure below shows the difference in MMLU performance between Mathstral 7B and Mistral 7B by subject.

Mathstral achieves SOTA reasoning performance in its size range on various industry-standard benchmarks. In particular, it scores 56.6% on the MATH dataset and 63.47% on MMLU.



Meanwhile, Mathstral's score on MATH (56.6%) is more than 20 percentage points higher than Minerva 540B's. In addition, Mathstral scores 68.4% on MATH with majority voting over 64 samples and 74.6% using a reward model.
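
For reference, "majority voting over 64 samples" means sampling 64 candidate solutions, extracting each final answer, and predicting the most frequent one. The sketch below shows that voting step only; the sampling and answer-extraction callables are hypothetical placeholders passed in by the caller, not Mistral's evaluation code.

```python
# Minimal sketch of majority voting (maj@k): sample k candidate solutions,
# parse each final answer, and return the answer that occurs most often.
# The `sample_solution` and `extract_final_answer` callables are hypothetical
# placeholders for whatever model call and parsing the real evaluation uses.
from collections import Counter
from typing import Callable

def majority_vote(problem: str,
                  sample_solution: Callable[[str], str],
                  extract_final_answer: Callable[[str], str],
                  k: int = 64) -> str:
    answers = [extract_final_answer(sample_solution(problem)) for _ in range(k)]
    # The most frequent final answer across the k samples is the prediction.
    return Counter(answers).most_common(1)[0][0]
```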



This result also made netizens curious about whether Mathstral can solve the question of "which is greater, 9.11 or 9.9?"



Codestral Mamba



  • Model weights: https://huggingface.co/mistralai/mamba-codestral-7B-v0.1

Released alongside Mathstral 7B is Codestral Mamba, a model dedicated to code generation that uses the Mamba2 architecture and likewise follows the open-source Apache 2.0 license. It is an instruction model with more than 7 billion parameters that researchers can use, modify, and distribute for free.

It is worth mentioning that Codestral Mamba was designed with the help of Mamba authors Albert Gu and Tri Dao.

The Transformer architecture has long underpinned a large share of the AI field. Unlike Transformers, however, Mamba models offer linear-time inference and can, in theory, model sequences of unlimited length. The architecture lets users interact with the model at length and get quick responses without being limited by input length, an efficiency that is especially important for code generation.
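
To make the contrast concrete, the toy sketch below compares the per-token cost of the two approaches: an attention step that re-reads every previous token versus a state-space recurrence that only updates a fixed-size state. It illustrates the asymptotic difference only and is not the actual Mamba2 layer.

```python
# Toy illustration of the cost difference, not the actual Mamba2 layer.
# Attention re-reads all previous tokens at each step (quadratic overall),
# while a recurrent state-space update touches only a fixed-size state
# per token (linear overall, constant memory during inference).
import numpy as np

def attention_step(query, past_keys, past_values):
    # Cost grows with the number of tokens already seen.
    scores = past_keys @ query
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ past_values

def ssm_step(state, x, A, B, C):
    # Cost is constant per token: update a fixed-size hidden state, then read it out.
    state = A * state + B * x
    return state, C @ state
```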

In benchmark tests, Codestral Mamba outperformed competing open-source models CodeLlama 7B, CodeGemma-1.1 7B, and DeepSeek on the HumanEval test.
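
HumanEval scores models by functional correctness: a generated completion counts as a pass only if the problem's unit tests all succeed when run against it. The sketch below shows that checking step in simplified form; the real harness sandboxes execution, enforces timeouts, and aggregates pass@k over many samples per problem.

```python
# Simplified sketch of HumanEval-style functional-correctness checking:
# execute the generated code plus the problem's tests and record pass/fail.
# The real benchmark sandboxes execution, applies timeouts, and computes pass@k.
def passes_tests(generated_code: str, test_code: str, entry_point: str) -> bool:
    namespace: dict = {}
    try:
        exec(generated_code, namespace)             # define the candidate function
        exec(test_code, namespace)                  # define check(candidate)
        namespace["check"](namespace[entry_point])  # raises AssertionError on failure
        return True
    except Exception:
        return False
```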



Mistral tested the model, which is available for free on Mistral’s la Plateforme API, on inputs of up to 256,000 tokens — twice as many as OpenAI’s GPT-4o.
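
For readers who want to try the hosted model, the sketch below sends a coding prompt to Mistral's chat-completions endpoint on la Plateforme. The model identifier ("open-codestral-mamba") and the request parameters here are assumptions; check the official la Plateforme documentation for the exact values.

```python
# Minimal sketch of calling Codestral Mamba through Mistral's hosted API.
# The model identifier below is an assumption; consult la Plateforme's docs
# for the exact id and supported parameters.
import os
import requests

response = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "open-codestral-mamba",  # assumed model id for Codestral Mamba
        "messages": [
            {"role": "user", "content": "Write a Python function that reverses a linked list."}
        ],
        "max_tokens": 512,
    },
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```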

With the release of Codestral Mamba, some netizens started using it in VSCode and found it very smooth.