
How does chain-of-thought prompting elicit arithmetic reasoning in large models? Scientists answer from the perspective of neuron activation

2024-08-03


Large language models have attracted a great deal of attention over the past year or two, especially for their performance in solving arithmetic problems.

In fact, as early as 2022, researchers at Google Research proposed Chain-of-Thought (CoT) prompting, a prompt-engineering method that can effectively improve the mathematical reasoning of large models, and verified its effectiveness in few-shot in-context learning [1].

Although the method quickly came into widespread use, researchers in the field still knew little about how it elicits the arithmetic reasoning ability of large models.

Existing explorations had mainly focused on experimentally observing how different components of a CoT prompt affect the arithmetic reasoning performance of large models.

Specifically, such studies replace or remove components of the CoT demonstrations, for example stripping out the textual reasoning and leaving only the key mathematical equations, and then compare the model's performance on existing arithmetic reasoning benchmarks before and after the change, to determine whether the replaced or removed component contributes substantially to eliciting the model's arithmetic reasoning ability.

Although these studies uncovered several interesting phenomena, they could not explain, in terms of the neural network's internal mechanisms, how CoT elicits the arithmetic reasoning ability of large models.

They also raised new questions, such as why different components of CoT affect arithmetic reasoning in large models to different degrees.

To address these questions, Professor Ziyu Yao's team at George Mason University in the United States carried out a series of explorations on the open-source Llama2 model from the perspective of model interpretability, proposing to use "neuron activation" as a unified lens to systematically explain the phenomena observed in existing CoT studies.


Figure | Members of the research team (Source: Research team)

Recently, the related paper, titled "An Investigation of Neuron Activation as a Unified Lens to Explain Chain-of-Thought Eliciting Arithmetic Reasoning of LLMs", was accepted to the 2024 Annual Meeting of the Association for Computational Linguistics (ACL 2024) [2].

Daking Rai, a doctoral student at George Mason University, is the first author and Ziyu Yao is the corresponding author.


Figure | The related paper (Source: ACL 2024)

In the study, they first explored whether the Transformer feed-forward layers contain neurons that express concepts related to arithmetic reasoning.

These concepts include the arithmetic operations of addition, subtraction, multiplication, and division; the logical connectives that appear in arithmetic reasoning (such as "...so" and "...next"); and other concepts involved in arithmetic calculation (such as "percentage", "algorithm", and "formula").

To discover the concept each neuron represents, they projected the neurons into the model's vocabulary space and summarized a neuron's meaning from the proportion of concept-related tokens among the tokens it most strongly promotes after projection.

The team further proposed using GPT-4 to read and interpret these vocabulary projections, automating the process of labeling neurons and mining the relevant ones.
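
As a rough illustration of this kind of vocabulary projection, the following is a minimal sketch assuming the Hugging Face `transformers` implementation of Llama2; the layer and neuron indices and the labeling prompt are hypothetical, not taken from the paper.

```python
# A minimal sketch of projecting one MLP neuron of Llama2 into vocabulary space,
# assuming the Hugging Face `transformers` Llama implementation; indices and the
# labeling prompt below are hypothetical, not taken from the paper.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

layer, neuron = 20, 7777   # hypothetical layer / neuron indices

# In Llama's MLP, down_proj maps the intermediate (neuron) space back to the hidden
# space; column `neuron` of its weight is that neuron's "value vector".
value_vec = model.model.layers[layer].mlp.down_proj.weight[:, neuron]   # [hidden_size]

# Project the value vector through the unembedding matrix to score every token.
scores = model.lm_head.weight @ value_vec                               # [vocab_size]
top_tokens = [tok.decode(int(i)) for i in scores.topk(30).indices]
print(top_tokens)   # e.g. tokens like "plus", "sum", "+" would suggest an "addition" neuron

# The labeling step could then hand this token list to GPT-4 with a prompt such as:
label_prompt = (
    "Here are the tokens a neuron most strongly promotes:\n"
    f"{top_tokens}\n"
    "Does this neuron represent a concept related to arithmetic reasoning "
    "(e.g. addition, a logical connective like 'so'/'next', percentage, formula)? "
    "Reply with a short concept label or 'none'."
)
```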

Experiments show that the Transformer feed-forward layers do contain neurons that represent arithmetic concepts: when these neurons are ablated, the arithmetic reasoning ability of the large model is impaired.
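
A neuron "knock-out" of this kind can be sketched with a forward pre-hook that zeroes the targeted activations before they are mixed back into the hidden state. This is only an illustration of the general technique, not the paper's exact procedure; it reuses `model` and `tok` from the sketch above, and the neuron indices and prompt are hypothetical.

```python
# A sketch of "ablating" targeted MLP neurons with a forward pre-hook on down_proj,
# whose input is the vector of per-neuron activations.
import torch

ablate = {20: [7777, 4242]}   # hypothetical {layer: [neuron indices]} to disable

def make_ablation_hook(neuron_ids):
    def hook(module, args):
        acts = args[0].clone()          # [batch, seq_len, intermediate_size]
        acts[..., neuron_ids] = 0.0     # zero out the targeted neurons
        return (acts,)
    return hook

handles = [
    model.model.layers[l].mlp.down_proj.register_forward_pre_hook(make_ablation_hook(ids))
    for l, ids in ablate.items()
]

prompt = "Q: Tom has 3 apples and buys 5 more. How many apples does he have?\nA:"
inputs = tok(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=30)
print(tok.decode(out[0], skip_special_tokens=True))

for h in handles:
    h.remove()   # restore the original model
```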

The researchers also observed that the activation of these neurons is positively correlated with the arithmetic reasoning ability of the large model. This positive correlation explains why different prompts elicit arithmetic reasoning from the model to different degrees.

Based on these neurons, the team systematically explained four CoT-related phenomena observed in existing studies; a sketch of the corresponding prompt variants follows the list.

First, when the mathematical equations are removed from the CoT demonstrations and only the calculation results remain, the model's arithmetic reasoning ability is impaired.

Second, when the textual reasoning is removed and only the mathematical equations remain, the model's ability is likewise impaired.

Third, when the CoT demonstrations lose operational diversity, for example when every demonstration involves only addition, the model's ability is impaired.

Fourth, when the calculation results in the CoT demonstrations are wrong but the reasoning process is correct, the model's ability is not significantly affected.
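
To make the four ablations concrete, the hypothetical GSM8K-style demonstration below shows what each variant removes or changes; the paper's actual prompts may differ.

```python
# Hypothetical CoT demonstration variants corresponding to the four phenomena above;
# this only illustrates what each ablation removes or changes.

full_cot = (
    "Q: A shop sells 4 pens at $3 each. How much do they cost in total?\n"
    "A: Each pen costs $3 and there are 4 pens, so the total is 4 * 3 = 12. The answer is 12."
)

no_equations = (        # (1) equations removed, only the textual steps and result remain
    "Q: A shop sells 4 pens at $3 each. How much do they cost in total?\n"
    "A: Each pen costs $3 and there are 4 pens, so the total is 12. The answer is 12."
)

equations_only = (      # (2) textual reasoning removed, only the equations remain
    "Q: A shop sells 4 pens at $3 each. How much do they cost in total?\n"
    "A: 4 * 3 = 12. The answer is 12."
)

addition_only = (       # (3) no operational diversity: every demonstration uses addition
    "Q: Tom has 4 pens and gets 3 more. How many pens does he have?\n"
    "A: He starts with 4 pens and gets 3 more, so 4 + 3 = 7. The answer is 7."
)

wrong_result = (        # (4) wrong final result, but the reasoning process is correct
    "Q: A shop sells 4 pens at $3 each. How much do they cost in total?\n"
    "A: Each pen costs $3 and there are 4 pens, so the total is 4 * 3 = 13. The answer is 13."
)
```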

"We see that these phenomena can basically be explained by the degree of neuronal activation. For example, the number of neurons in the activated state decreases before and after the mathematical formulas are removed, which explains why the model's arithmetic reasoning ability is impaired," the researchers explained.

In terms of applications, this result has prospects in two respects.

First, predicting the capabilities of large models.

In their experiments, the researchers observed that the activation of neurons representing arithmetic reasoning is positively correlated with the arithmetic reasoning ability of the Llama2 model. This suggests that, in the future, it may be possible to predict a model's ability on specific tasks directly, without running benchmarks.

Because benchmarking demands considerable manpower and resources, such as dataset annotation and compute, predicting a model's capabilities directly by understanding its internal mechanisms could also save costs.

In addition, practitioners hope that in the near future large models will be able to complete tasks beyond human capabilities; because of the limits of human ability, there is no way to build benchmarks for such tasks. Predicting model capabilities through the model's internal mechanisms can circumvent this problem.

Second, a model's capabilities could be enhanced or weakened by controlling its internal mechanisms.

"We believe that this application will become one of the important methods to improve the security of large models in the future. It also has the potential to achieve more efficient training of large models, such as locating neurons through small data and then achieving the purpose of model training by controlling the activation of neurons." The research team said.

In fact, in the second half of 2023, OpenAI announced its "Superalignment" initiative [3], which aims to help humans supervise and control superhuman AI models by encouraging scientific research and innovation. Predicting and controlling model capabilities are two important tasks toward that goal.

"This result is our initial exploration in this direction, and we hope that we or other researchers can continue to explore in this direction in the future," the team said. The research was inspired by "mechanistic explanation".

This is a subfield of model interpretability that has emerged rapidly and attracted wide attention in recent years. Unlike earlier interpretability methods, mechanistic interpretability attempts to understand a model's behavior by reverse engineering the neural network.

Currently, such methods have been applied to explain the behavior and structural functions of large models.

"One of the studies that most inspired us was the exploration of the Transformer feed-forward layer by researchers from the Allen Institute for Artificial Intelligence in the United States and Bar-Ilan University in Israel [4]," the researchers said.

That study found that when a large model predicts the next token, the Transformer feed-forward layers build the prediction by continually reinforcing related concepts in the vocabulary space, and that this reinforcement is carried out by activating neurons in the feed-forward layers.
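
In Geva et al.'s formulation, this mechanism can be summarized compactly as a weighted sum of per-neuron "value vectors" (restated below with biases and normalization omitted; Llama2's gated MLP adds a gating term, but the picture is the same). Activating neuron i adds its value vector to the residual stream, and projecting that vector through the unembedding matrix reveals which tokens, and hence which concepts, it promotes.

```latex
% FFN layer as a weighted sum of per-neuron value vectors (after Geva et al. [4]);
% k_i is neuron i's key (input) vector, v_i its value (output) vector, f the activation.
\mathrm{FFN}(\mathbf{x})
  = f(\mathbf{x} K^{\top})\, V
  = \sum_{i} \underbrace{f(\mathbf{x} \cdot \mathbf{k}_i)}_{\text{activation of neuron } i} \mathbf{v}_i
```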

"This discovery at the mechanistic level inspired our hypothesis: the reason why CoT can stimulate the arithmetic reasoning ability of large models may be because it can effectively activate the neurons representing the concept of arithmetic reasoning in the Transformer feedforward layer, and these neurons help enhance the arithmetic reasoning ability of large models." The research team said.

Based on this, the team wondered whether there is a mechanism that can directly enhance the arithmetic reasoning capabilities of large models, especially smaller ones.

"This is very meaningful, because smaller models offer unique advantages in computational efficiency, cost, and security," the team noted.

Around the same time, they also saw research on improving the capabilities of smaller models in specific domains or tasks by collecting high-quality data or modifying the training objective. The application of mechanistic interpretability to this end, however, is still in its infancy.

The team's research process was not all smooth sailing, however; they got stuck at the very beginning.

The biggest difficulty was that they did not yet fully understand the internal mechanism of arithmetic reasoning in large models, and so naturally could not achieve the model control they envisioned.

"Therefore, my student Lai, the first author of the paper, and I decided to focus on explaining the arithmetic reasoning of the large model first," said Yao Ziyu.

But they soon encountered the next difficulty.

"Arithmetic reasoning" is a highly abstract concept, while the predictions of large models are performed at the level of specific vocabulary units.

To understand the arithmetic reasoning ability of a large model from the perspective of neurons reinforcing concepts in vocabulary space, this highly abstract concept must first be grounded in concrete, token-level concepts.

To fill this gap, the research team first summarized several lower-level concepts related to arithmetic reasoning, including arithmetic operators, logical expressions used in arithmetic reasoning, and other arithmetic calculation concepts.

They used GPT-4 to efficiently label and search for neurons expressing these lower-level concepts, and then verified the discovered neurons following approaches from prior work.

"The experimental results prove that these neurons do play an important role in our experimental large model Llama2," the research team said.

This gave them more confidence to continue exploring in this direction.

They then used the activation states of these neurons to provide a unified explanation of how CoT affects the arithmetic reasoning ability of large models, including the phenomena observed in previous work.

The results largely verified their hypothesis: the degree to which different CoT components elicit arithmetic reasoning in large models can be explained by the activation of the relevant neurons.

However, the study also points out that neuron activation cannot explain all of a large model's arithmetic reasoning performance, and whether the findings on Llama2 generalize to other model families remains to be verified.

It is also reported that Ziyu Yao's laboratory currently has several fully funded doctoral positions for admission in fall 2025. For details, visit the team's website at https://ziyuyao.org/ or inquire by email.

References:

1. Wei, Jason, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Fei Xia, Ed Chi, Quoc V. Le, and Denny Zhou. Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35 (2022): 24824-24837. https://doi.org/10.48550/arXiv.2201.11903

2. Rai, Daking, and Ziyu Yao. An Investigation of Neuron Activation as a Unified Lens to Explain Chain-of-Thought Eliciting Arithmetic Reasoning of LLMs. arXiv:2406.12288. https://doi.org/10.48550/arXiv.2406.12288

3.OpenAI. Introducing Superalignment. https://openai.com/index/introducing-superalignment/. 2023.

4. Geva, Mor, Avi Caciularu, Kevin Wang, and Yoav Goldberg. Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pp. 30-45. 2022. https://arxiv.org/abs/2203.14680

Layout: Chu Jiashi
