
A 2B "pocket rocket" you can run on an iPhone! Google's Gemma 2 2B is here, and the most powerful "microscope" yet dissects the LLM brain

2024-08-01



New Intelligence Report

Editor: Editorial Department

【New Intelligence Introduction】Google DeepMind's small-model bombshell is here. Gemma 2 2B beats GPT-3.5 and Mixtral 8x7B, models with far more parameters! Gemma Scope, released at the same time, cracks open the LLM black box like a microscope, letting us see clearly how Gemma 2 makes its decisions.

Google DeepMind has a new small model!

Just now, Google DeepMind released Gemma 2 2B.



It is distilled from Gemma 2 27B.

Although it has only 2.6B parameters, its score in the LMSYS Chatbot Arena has surpassed GPT-3.5 and Mixtral 8x7B!


On the MMLU and MBPP benchmarks, it scored 56.1 and 36.6 respectively, more than 10% better than the previous-generation Gemma 1 2B.

A small model beating models many times its size once again validates the small-model direction the industry has recently been so bullish on.


Today, Google announced three new members of the Gemma 2 family:

  • Gemma 2 2B: a lightweight 2B model that offers the best balance between performance and efficiency

  • ShieldGemma: a safety content classifier built on Gemma 2 that filters the inputs and outputs of AI models to keep users safe

  • Gemma Scope: an interpretability tool that offers unparalleled insight into the inner workings of the model

The 27B and 9B Gemma 2 models were released back in June.

Since its release, the 27B model has quickly become one of the top open-source models on the leaderboards, and in real conversations it has even outperformed popular models with more than twice its parameter count.


Gemma 2 2B: Ready to use on your device

The lightweight Gemma 2 2B is distilled from the larger model, yet gives up little in performance.

In the LMSYS Chatbot Arena, the new model achieved an impressive score of 1130, on par with models ten times its size.

For comparison, GPT-3.5-Turbo-0613 scored 1117 and Mixtral-8x7B scored 1114.


In other words, Gemma 2 2B is the best on-device model in its class.


Some users have already run a quantized Gemma 2 2B with MLX Swift on an iPhone 15 Pro, and the speed is astonishing.



Specifically, it can be deployed on a wide range of devices, from mobile phones and laptops all the way up to powerful cloud setups via Vertex AI and Google Kubernetes Engine (GKE).

For faster inference, the model has been optimized with NVIDIA TensorRT-LLM and is also available as an NVIDIA NIM.


The optimized models are suitable for deployment on various platforms, including data centers, clouds, local workstations, PCs, and edge devices.

It also supports NVIDIA RTX GPUs and Jetson modules for edge AI deployment.

In addition, Gemma 2 2B integrates seamlessly with Keras, JAX, Hugging Face, NVIDIA NeMo, Ollama, Gemma.cpp, and more, with MediaPipe integration coming soon to simplify development.


Of course, like the rest of the Gemma 2 family, the 2B model can be used for both research and commercial purposes.

In fact, its parameter count is small enough that it can run on Google Colab's free T4 GPU tier, lowering the barrier to development.

Developers can already download the Gemma 2 2B model weights from Kaggle, Hugging Face, and Vertex AI Model Garden, or try the model out in Google AI Studio.


Model repository: https://huggingface.co/collections/google/gemma-2-2b-release-66a20f3796a2ff2a7c76f98f
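
To make that concrete, here is a minimal sketch of loading and prompting the model with Hugging Face Transformers. The checkpoint name google/gemma-2-2b-it and the chat-template call are assumptions based on the collection linked above; check the model card before running.

```python
# Minimal sketch: run Gemma 2 2B locally with Hugging Face Transformers.
# The checkpoint name is assumed from the collection linked above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-2b-it"  # instruction-tuned 2B checkpoint (assumed name)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # small enough to fit on a single consumer GPU
    device_map="auto",
)

messages = [{"role": "user", "content": "Explain sparse autoencoders in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

On-device runtimes such as Ollama or Gemma.cpp follow the same "download weights, then prompt" pattern, just without the Python stack.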

ShieldGemma: A State-of-the-Art Security Classifier

As the name suggests, ShieldGemma is a state-of-the-art safety classifier that detects and mitigates harmful content in model inputs and outputs, helping keep AI output engaging, safe, and inclusive.

ShieldGemma specifically targets four key categories of harm:

- Hate Speech

- Harassment

- Explicit Content

- Dangerous Content


These open-source classifiers complement Google's existing suite of safety classifiers in the Responsible AI Toolkit.

The toolkit includes a method for building policy-specific classifiers based on limited data points, as well as ready-made Google Cloud classifiers available through an API.

Built on Gemma 2, ShieldGemma is an industry-leading safety classifier.

It comes in several parameter sizes, including 2B, 9B, and 27B, all speed-optimized with NVIDIA so they run efficiently across a range of hardware.

The 2B version is well suited to online classification, while the 9B and 27B versions offer higher performance for offline applications where latency is less of a concern.
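
For illustration, a prompted safety classifier of this kind can be used roughly as follows: the model is asked whether a message violates a given policy, and the probability it assigns to "Yes" serves as the harm score. The checkpoint name google/shieldgemma-2b and the prompt wording below are assumptions; the official model card defines the exact policy template.

```python
# Sketch of prompt-based safety scoring with a ShieldGemma-style classifier.
# Checkpoint name and prompt wording are assumptions; see the official model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/shieldgemma-2b"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

user_message = "How do I pick a lock?"
prompt = (
    "You are a policy expert trying to determine whether a user prompt "
    "violates the defined safety policies.\n\n"
    f"Human question: {user_message}\n\n"
    "Does the human question violate the 'Dangerous Content' policy? "
    "Answer Yes or No."
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    logits = model(**inputs).logits[0, -1]  # next-token logits

# Assumes "Yes"/"No" exist as single tokens in the Gemma vocabulary.
yes_id = tokenizer.convert_tokens_to_ids("Yes")
no_id = tokenizer.convert_tokens_to_ids("No")
# Probability mass on "Yes" versus "No" acts as the violation score.
p_violation = torch.softmax(logits[[yes_id, no_id]], dim=0)[0].item()
print(f"Estimated probability of a policy violation: {p_violation:.2f}")
```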


Gemma Scope: Uncovering AI decision-making processes through open source sparse autoencoders

Another highlight of this release is Gemma Scope, an open-source suite of sparse autoencoders.

What exactly happens inside a language model? This question has long puzzled researchers and developers.

The inner workings of language models are often a mystery, even to the researchers who train them.


Gemma Scope is like a powerful microscope that zooms in on specific points in the model through sparse autoencoders (SAEs), making the inner workings of the model easier to explain.

With Gemma Scope, researchers and developers gain unprecedented transparency into the decision-making process of Gemma 2 models.

Gemma Scope is a collection of hundreds of free and open Sparse Autoencoders (SAEs) for Gemma 2 9B and Gemma 2 2B.

These SAEs are specially designed neural networks that help us interpret the dense, complex information processed by Gemma 2 and expand it into a form that is easier to analyze and understand.

By studying these expanded views, researchers can gain valuable insight into how Gemma 2 recognizes patterns, processes information, and makes predictions.

With Gemma Scope, the AI community can more easily build more understandable, responsible, and reliable AI systems.

At the same time, Google DeepMind also released a 20-page technical report.


Technical report: https://storage.googleapis.com/gemma-scope/gemma-scope-report.pdf

In summary, Gemma Scope has the following three innovations:

  • Open-source SAEs: over 400 freely available SAEs covering all layers of Gemma 2 2B and 9B

  • Interactive demos: explore SAE features and analyze model behavior on Neuronpedia without writing code

  • Easy-to-use repository: code and examples for interacting with the SAEs and Gemma 2 (a small loading sketch follows this list)
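
As a taste of what that looks like in practice, here is a minimal sketch of downloading one SAE and encoding a Gemma 2 activation with it. The repo id, file path, and parameter names are assumptions about how the released weights are packaged; the Gemma Scope repository documents the real layout.

```python
# Sketch: load one Gemma Scope SAE and encode a Gemma 2 activation with it.
# Repo id, file path, and parameter names below are assumptions.
import numpy as np
import torch
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="google/gemma-scope-2b-pt-res",                   # assumed: residual-stream SAEs, 2B model
    filename="layer_20/width_16k/average_l0_71/params.npz",   # assumed file layout
)
npz = np.load(path)
params = {k: torch.tensor(npz[k]) for k in npz.files}
# Expected keys (assumed): W_enc, b_enc, W_dec, b_dec, threshold

def sae_encode(resid):
    """resid: a [d_model] residual-stream activation from Gemma 2."""
    pre = resid @ params["W_enc"] + params["b_enc"]
    # JumpReLU: a feature fires only if its pre-activation clears a learned threshold.
    return pre * (pre > params["threshold"])

def sae_decode(features):
    """Reconstruct the original activation from the sparse feature vector."""
    return features @ params["W_dec"] + params["b_dec"]

resid = torch.randn(params["W_enc"].shape[0])  # stand-in for a real activation
features = sae_encode(resid)
print("active features:", int((features > 0).sum().item()), "of", features.numel())
```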

Understanding the inner workings of language models

The interpretability of language models: why is it so difficult?

To answer that, we have to start with how an LLM works.

When you ask an LLM a question, it converts your text input into a series of "activations." These activations map the relationships between the words in your input, helping the model make connections between different words and generate answers based on them.

As the model processes text input, the activations of different layers in the model’s neural network represent multiple, progressively higher-level concepts, which are called “features.”


For example, early layers of the model might learn facts like Jordan plays basketball, while later layers might recognize more complex concepts like the authenticity of a text.


Example of interpreting model activations with a sparse autoencoder - how the model recalls the fact that "the City of Lights is Paris". You can see that concepts related to the French language are present, while unrelated concepts are absent
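
To make "activations" concrete: with Hugging Face Transformers you can pull the per-layer hidden states straight out of Gemma 2. These residual-stream vectors are the raw material that Gemma Scope's SAEs later decompose (the checkpoint name below is assumed from the release; any causal LM exposes the same interface).

```python
# Sketch: inspect the per-layer activations ("hidden states") of Gemma 2.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-2b"  # base checkpoint name assumed from the release
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

inputs = tokenizer("The City of Lights is", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# hidden_states is a tuple: the embedding output plus one tensor per transformer layer,
# each of shape [batch, sequence_length, hidden_size].
for layer, h in enumerate(out.hidden_states):
    print(f"layer {layer:2d}: activation shape {tuple(h.shape)}")
```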

However, interpretability researchers have been faced with a key problem: the activations of a model are a mixture of many different features.

Early on, interpretability researchers hoped that features in a neural network's activations would line up with individual neurons, i.e., single information nodes.

But unfortunately, in practice, a neuron is active for many unrelated features.

This means there is no obvious way to tell which features are part of a given activation.

And this is exactly where sparse autoencoders come in.

Bear in mind that any particular activation is a mixture of only a few features, even though the language model is likely capable of detecting millions or even billions of them; in other words, the model uses features sparsely.

For example, a language model might think of relativity when answering a question about Einstein and eggs when writing about omelets, but might not think of relativity when writing about omelets.


Sparse autoencoders take advantage of this fact to discover a set of latent features and decompose each activation into a small number of features.
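
A rough sketch of the idea (not DeepMind's training code): a sparse autoencoder learns an overcomplete dictionary of feature directions and is trained to reconstruct each activation from only a few of them, by pairing a reconstruction loss with a sparsity penalty.

```python
# Toy sparse autoencoder, illustrating the decomposition idea only.
# Gemma Scope's real SAEs use the JumpReLU architecture and far larger dictionaries.
import torch
import torch.nn as nn

class ToySAE(nn.Module):
    def __init__(self, d_model=256, d_features=4096):
        super().__init__()
        self.enc = nn.Linear(d_model, d_features)  # activation -> feature strengths
        self.dec = nn.Linear(d_features, d_model)  # sparse features -> reconstruction

    def forward(self, acts):
        features = torch.relu(self.enc(acts))      # non-negative feature strengths
        return features, self.dec(features)

sae = ToySAE()
opt = torch.optim.Adam(sae.parameters(), lr=1e-3)
acts = torch.randn(64, 256)                        # stand-in for model activations

for _ in range(100):
    features, recon = sae(acts)
    # Reconstruction error keeps the features faithful; the L1 term keeps them sparse.
    loss = ((recon - acts) ** 2).mean() + 1e-3 * features.abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```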

The researchers' hope is that the best way for the sparse autoencoder to accomplish this task is to find the fundamental features that the language model actually uses.

Importantly, during this process, the researchers did not tell the sparse autoencoder which features to look for.

As a result, they were able to discover rich structures that had not been expected before.


However, because the exact meaning of a discovered feature isn't immediately obvious, the researchers look for meaningful patterns in the text examples on which, according to the sparse autoencoder, the feature "fires."


Here is an example where the tokens that triggered the feature are highlighted with a blue gradient depending on the strength of the feature trigger:


Example of discovering feature activations using a sparse autoencoder. Each bubble represents a token (word or word fragment), and the variable blue color indicates the strength of the feature. In this example, the feature is clearly related to idioms.
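
Rendering such an example is straightforward once you have per-token feature strengths; the tokens and numbers below are invented purely to mimic the idiom feature in the screenshot.

```python
# Illustrative only: render per-token feature strengths as a text "heat map",
# the same information the blue gradient conveys above. Values are made up.
tokens = ["He", " kicked", " the", " bucket", " yesterday", "."]
strengths = [0.1, 4.2, 0.3, 7.8, 0.2, 0.0]  # hypothetical activations of one "idiom" feature

for tok, s in zip(tokens, strengths):
    bar = "#" * int(s)                       # crude stand-in for the gradient
    print(f"{tok:>10}  {s:4.1f}  {bar}")
```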

What is unique about Gemma Scope?

Compared with previous sparse autoencoders, Gemma Scope has many unique features.

Previous sparse autoencoders have mostly focused on the inner workings of small models, or on a single layer of a larger model.


But going deeper in interpretability research means decoding the layered, complex algorithms inside large models.

This time, researchers at Google DeepMind trained sparse autoencoders on the output of each layer and sublayer of Gemma 2 2B and 9B.

The resulting Gemma Scope comprises more than 400 sparse autoencoders with over 30 million learned features in total (although many features likely overlap).

This allows researchers to study how features evolve throughout the model and how they interact and combine to form more complex features.

In addition, Gemma Scope was trained with the new, state-of-the-art JumpReLU SAE architecture.

Earlier sparse autoencoder architectures struggled to balance two goals: detecting which features are present and estimating how strongly they fire. The JumpReLU architecture strikes this balance more easily and significantly reduces error.
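
The difference can be sketched in a few lines: an ordinary ReLU passes every positive pre-activation, however faint, while JumpReLU applies a learned per-feature threshold, so a feature only reports a strength once it has cleared the bar for being present at all. The numbers below are made up purely for illustration.

```python
# Illustrative comparison of ReLU vs. JumpReLU gating (all values are made up).
import torch

pre_acts = torch.tensor([0.05, 0.40, 1.30, 0.10])    # pre-activations of four features
threshold = torch.tensor([0.25, 0.25, 0.50, 0.25])   # learned per-feature thresholds

relu_out = torch.relu(pre_acts)                      # the faint 0.05 still leaks through
jumprelu_out = pre_acts * (pre_acts > threshold)     # sub-threshold signals are zeroed out

print("ReLU:    ", relu_out.tolist())
print("JumpReLU:", jumprelu_out.tolist())
```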


Of course, training so many sparse autoencoders is also a major engineering challenge and requires a lot of computing resources.

In the process, the researchers used about 15% of Gemma 2 9B's training compute (excluding the compute needed to generate distillation labels), saved about 20 PiB of activations to disk (roughly the equivalent of a million copies of English Wikipedia), and produced hundreds of billions of sparse autoencoder parameters in total.

References:

https://developers.googleblog.com/en/smaller-safer-more-transparent-advancing-responsible-ai-with-gemma/