
NVIDIA releases new AI model with 8 billion parameters: high accuracy and efficiency, deployable on RTX workstations

2024-08-23


IT Home reported on August 23 that NVIDIA published a blog post on August 21 announcing the Mistral-NeMo-Minitron 8B small language model, which combines high accuracy with high compute efficiency and can run on GPU-accelerated data centers, clouds, and workstations.

NVIDIA and Mistral AI released the open-source Mistral NeMo 12B model last month. Building on it, NVIDIA has now launched the smaller Mistral-NeMo-Minitron 8B, with a total of 8 billion parameters, which can run on workstations equipped with NVIDIA RTX graphics cards.

NVIDIA said it obtained Mistral-NeMo-Minitron 8B by width-pruning Mistral NeMo 12B and then lightly retraining it through knowledge distillation. The results were published in the paper "Compact Language Models via Pruning and Knowledge Distillation".

Pruning shrinks a neural network by removing the model weights that contribute least to accuracy. During distillation, the team then retrained the pruned model on a small dataset, significantly recovering the accuracy lost in the pruning step.
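The two steps can be sketched in miniature. The snippet below is a minimal illustration, not NVIDIA's actual pipeline: it uses a simple weight-magnitude criterion for width pruning (the paper uses activation-based importance scores) and the standard softened-softmax KL loss as the distillation signal; all function names here are hypothetical.

```python
import numpy as np

def width_prune(W, keep):
    # Width pruning: drop whole neurons (columns of the weight matrix)
    # whose weights have the smallest L2 norm, i.e. those assumed to
    # contribute least. Returns the narrower matrix and the kept indices.
    norms = np.linalg.norm(W, axis=0)
    idx = np.sort(np.argsort(norms)[-keep:])
    return W[:, idx], idx

def softmax(z, T=1.0):
    # Softmax with temperature T; higher T "softens" the distribution.
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # Knowledge-distillation objective: mean KL divergence between the
    # teacher's softened output distribution and the student's. Retraining
    # the pruned student to minimize this recovers accuracy.
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(np.sum(p * (np.log(p) - np.log(q))) / p.shape[0])

# Toy usage: prune a 2x3 layer down to 2 neurons, then score a student.
W = np.array([[1.0, 0.1, 2.0],
              [1.0, 0.1, 2.0]])
W_pruned, kept = width_prune(W, keep=2)   # column 1 (smallest norm) removed
teacher = np.array([[2.0, 0.0]])
student = np.array([[1.5, 0.2]])
loss = distillation_loss(student, teacher)  # > 0 until student matches teacher
```

A matched student (identical logits) drives the loss to zero, which is why distillation on even a small dataset can pull the pruned model back toward the teacher's behavior.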

Given its size, Mistral-NeMo-Minitron 8B leads on nine popular language-model benchmarks. These benchmarks cover a variety of tasks, including language understanding, common-sense reasoning, mathematical reasoning, summarization, coding, and the ability to generate truthful answers. IT Home attaches the relevant test results as follows: