
Large models have their own understanding of language! MIT paper reveals the thinking process of large models | ICML 24

2024-08-17


  • Cressey from Aofei Temple
    Quantum Bit | Public Account QbitAI

Large models can form their own understanding of the real world!

An MIT study found that as a model becomes more capable, its understanding of reality may go beyond simple imitation.

For example, if a large model has never smelled anything, does that mean it cannot understand smells?

Research has found that it can spontaneously simulate some concepts, making them easier to understand.

This study means that large models will hopefully give us a deeper understanding of language and the world in the future. The paper has been accepted by the top conference ICML 24.



The authors of this paper are Charles Jin, a Chinese doctoral student at MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL), and his mentor, Professor Martin Rinard.

In the study, the authors had the large model learn only from code text, and found that the model gradually grasped the meaning behind it.

Professor Rinard said the research directly addresses a core problem of modern artificial intelligence:

Are the capabilities of large models simply due to large-scale statistical correlations, or do they generate a meaningful understanding of the real-world problems they are designed to address?


△Source: MIT official website

At the same time, this research has also sparked a lot of discussion.

Some netizens said that although the large model’s understanding of language may be different from that of humans, this research at least shows that the model does more than just memorize training data.



Letting large models learn pure code

To explore whether large models can develop semantic understanding, the authors constructed a synthetic dataset composed of program code and its corresponding inputs and outputs.

The programs are written in Karel, a teaching language mainly used for robot-navigation tasks in a 2D grid world.

The grid world is an 8×8 grid of cells, each of which can contain an obstacle, contain markers, or be empty. The robot can move between cells and perform operations such as placing or picking up markers.

The Karel language contains five primitive operations: move (move one step forward), turnLeft (turn left 90 degrees), turnRight (turn right 90 degrees), pickMarker (pick up a marker), and putMarker (place a marker). A program is a sequence of these primitive operations.
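To make the setup concrete, here is a minimal Python sketch of a Karel-style robot state with these five primitives; the class name, coordinate convention, and boundary handling are illustrative assumptions rather than the paper's implementation.

```python
# Minimal, illustrative sketch of a Karel-style 8x8 world (not the paper's code).
# Coordinate convention and boundary handling are assumptions for illustration.
from dataclasses import dataclass, field

DIRS = ["north", "east", "south", "west"]                      # clockwise order
STEP = {"north": (0, 1), "east": (1, 0), "south": (0, -1), "west": (-1, 0)}

@dataclass
class RobotState:
    x: int = 0
    y: int = 0
    facing: str = "north"
    markers: dict = field(default_factory=dict)                # (x, y) -> marker count
    obstacles: set = field(default_factory=set)                # blocked (x, y) cells

    def move(self):                                            # move one step forward
        dx, dy = STEP[self.facing]
        nx, ny = self.x + dx, self.y + dy
        if 0 <= nx < 8 and 0 <= ny < 8 and (nx, ny) not in self.obstacles:
            self.x, self.y = nx, ny

    def turnLeft(self):                                        # turn left 90 degrees
        self.facing = DIRS[(DIRS.index(self.facing) - 1) % 4]

    def turnRight(self):                                       # turn right 90 degrees
        self.facing = DIRS[(DIRS.index(self.facing) + 1) % 4]

    def pickMarker(self):                                      # pick up a marker, if any
        if self.markers.get((self.x, self.y), 0) > 0:
            self.markers[(self.x, self.y)] -= 1

    def putMarker(self):                                       # place a marker
        self.markers[(self.x, self.y)] = self.markers.get((self.x, self.y), 0) + 1
```

Executing a program then amounts to applying its operations in order, e.g. `for op in ["move", "turnLeft"]: getattr(state, op)()`.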



The authors randomly generated a training set of 500,000 Karel programs, each with a length between 6 and 10.

Each training sample consists of three parts: 5 input states, 5 output states and complete program code. The input and output states are encoded into a string in a specific format.
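The paper's exact string encoding is not reproduced in the article; purely as an illustration, one sample might be serialized along the following lines, where the separators, placeholder grid tokens, and function name are hypothetical.

```python
# Hypothetical serialization of one training sample (the format is an assumption,
# not the paper's actual encoding): five input/output state pairs plus the program.
def encode_sample(inputs, outputs, program_tokens):
    pairs = [f"IN: {i} OUT: {o}" for i, o in zip(inputs, outputs)]
    return " ; ".join(pairs) + " PROG: " + " ".join(program_tokens)

sample = encode_sample(
    inputs=["<grid_in_1>", "<grid_in_2>", "<grid_in_3>", "<grid_in_4>", "<grid_in_5>"],
    outputs=["<grid_out_1>", "<grid_out_2>", "<grid_out_3>", "<grid_out_4>", "<grid_out_5>"],
    program_tokens=["move", "turnLeft", "putMarker", "move", "turnRight", "move"],
)
```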

Using this data, the authors trained a variant of the CodeGen model, which uses a standard Transformer architecture.

During training, the model can access the input and output information and program prefixes in each sample, but the complete execution trace and intermediate states of the program cannot be seen.

In addition to the training set, the authors also constructed a test set containing 10,000 samples to evaluate the generalization performance of the model.

To study whether the language model has grasped the semantics behind the code, and to gain deeper insight into the model's "thinking process", the authors designed a set of probing detectors, including linear classifiers and MLPs with one or two hidden layers.

The detector's input is the language model's hidden state while it generates program tokens, and its prediction target is the intermediate state of program execution, described by three features: the robot's facing direction, its offset relative to the initial position, and whether it is facing an obstacle.

During training of the generative model, the authors record these three features every 4,000 steps, along with the generative model's hidden states, to form the detector's training dataset.
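As a sketch of what such a detector looks like, the snippet below trains a linear probe in PyTorch to predict the robot's facing direction from collected hidden states; the hidden-state dimension, optimizer, and training schedule are assumptions, and the paper's probes also include one- and two-hidden-layer MLPs.

```python
# Illustrative linear probe (an assumed setup, not the paper's exact code):
# predict the robot's facing direction from the generative model's hidden states.
import torch
import torch.nn as nn

HIDDEN_DIM, NUM_DIRECTIONS = 1024, 4            # assumed sizes
probe = nn.Linear(HIDDEN_DIM, NUM_DIRECTIONS)   # linear classifier over hidden states

def train_probe(hidden_states, direction_labels, epochs=10, lr=1e-3):
    # hidden_states: (N, HIDDEN_DIM) float tensor, one row per generated token
    # direction_labels: (N,) long tensor with the robot's facing at that step
    opt = torch.optim.Adam(probe.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(probe(hidden_states), direction_labels)
        loss.backward()
        opt.step()
    return probe
```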



Three Phases of Large Model Learning

By observing how the diversity and perplexity of the programs generated by the language model change as training progresses, the authors divide the training process into three stages:

  • Babbling stage: the output programs are highly repetitive and detector accuracy is unstable.
  • Grammar acquisition stage: program diversity increases rapidly, generation accuracy rises slightly, and perplexity drops, indicating that the language model has learned the syntactic structure of the programs.
  • Semantic acquisition stage: program diversity and syntactic mastery remain stable, but generation accuracy and detector performance improve greatly, indicating that the language model has learned the semantics of the programs.

Specifically, the Babbling stage occupies the first 50% of the entire training process. For example, at about 20% of training, no matter what specification is given as input, the model generates only one fixed program: "pickMarker" repeated nine times.

During the grammar acquisition phase, between 50% and 75% of the training process, the model's perplexity on Karel programs drops significantly, indicating that the language model begins to better fit the statistical characteristics of Karel programs. However, the accuracy of the generated programs increases only slightly (from about 10% to about 25%), and the model still cannot complete tasks accurately.

The semantic acquisition stage covers the last 25%, during which program accuracy rises dramatically from about 25% to more than 90%, and the generated programs can accurately complete the given tasks.



Further experiments found that the detector can not only predict the program state at the current time step t, but can also predict the program execution state at subsequent time steps.

For example, suppose the generative model generates the token "move" at time t, and will generate "turnLeft" at time t+1.

At the same time, the program state at time t is that the robot faces north and is located at coordinates (0,0), while at time t+1 the robot will face west and its position remains unchanged.

If the detector can successfully predict from the hidden state of the language model at time t that the robot will face west at time t+1, it means that before generating "turnLeft", the hidden state already contains the state change information brought about by this operation.
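In probing terms, testing this only requires pairing the hidden state at step t with a label taken from a later step; below is a minimal sketch of how the probe targets would be shifted (the offset and names are illustrative assumptions).

```python
# Illustrative only: build probe training pairs that target the *future* program state.
# The input is the hidden state at step t; the label is the robot's facing direction
# after step t + offset (offset = 1 matches the "move" then "turnLeft" example above).
def make_future_probe_pairs(hidden_states, facing_after_step, offset=1):
    inputs = hidden_states[:-offset]
    labels = facing_after_step[offset:]
    return list(zip(inputs, labels))
```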

This phenomenon shows that the model not only has a semantic understanding of the program it has generated so far, but also anticipates and plans the content it is about to generate at each step, exhibiting preliminary future-oriented reasoning capabilities.

But this discovery brought new problems to the study.

Is the improvement in accuracy observed in the experiment really the result of improvements in the generative model, or is it the result of the detector’s own inference?

To resolve this ambiguity, the authors added a semantic probing intervention experiment.



The basic idea of the experiment is to change the semantic interpretation rules of program operations, which can be divided into two methods: "flip" and "adversarial".

"flip" is to forcibly reverse the meaning of the command, such as forcibly interpreting "turnRight" as "turn left", but only "turnLeft" and "turnRight" can perform this kind of reversal;

"Adversarial" randomly shuffles the semantics of all instructions, as shown in the table below.



If the hidden state of the generative model only encodes the syntactic structure of the program but not the semantic information, then the detector should still be able to extract this altered semantic information from the hidden state with equal performance.

Conversely, if the detector's performance drops significantly, it means that the performance gains shown by the detector indeed come from the generative model's hidden state encoding the actual semantics.

Experimental results show that the performance of the detector has dropped significantly under both new semantics.

This is especially obvious in the "adversarial" mode, consistent with the fact that its semantics deviate much further from the original semantics.



These results strongly rule out the possibility that the detector "learned semantic mapping on its own" and further confirm that the generative model does grasp the meaning of the code.

Paper address:
https://icml.cc/virtual/2024/poster/34849
Reference Links:
[1]https://news.mit.edu/2024/llms-develop-own-understanding-of-reality-as-language-abilities-improve-0814
[2]https://www.reddit.com/r/LocalLLaMA/comments/1esxkin/llms_develop_their_own_understanding_of_reality/