
The first "AI scientist" is here! It has independently generated 10 academic papers, and there's an AI reviewer too

2024-08-13


Mengchen and Hengyu, from Aofei Temple
Quantum Bit | Public Account QbitAI

The first "AI scientist" in history has burst onto the scene!

The moment it appeared, it churned out ten complete academic papers in one go.



△A diffusion model paper generated by AI

From proposing research ideas, checking innovation, designing experiments, writing code, to executing experiments on GPUs and collecting results, and finally completing paper writing, all in one go.

All done automatically by this "AI scientist".

The cost per paper is approximately $15 (about 107.62 yuan).



This is The AI Scientist, the first comprehensive AI system for automated scientific research and open-ended discovery.

It comes from Sakana AI, the startup founded by Llion Jones, one of the Transformer authors.

And that's not all!

The company didn't just create an AI scientist; it also created an AI reviewer.

The reviewer can evaluate papers written by the AI and offer suggestions for improvement.

Help, what kind of nesting-doll loop is this, attacking your own shield with your own spear?

After all these maneuvers, it's more like the human academic world than the human academic world itself (not really).



And one more thing!

Both the AI scientist and the AI reviewer have been fully open-sourced by Sakana AI.

Netizens applauded:

Nice, nice, very interesting work!



And some people have already started to come up with "bad ideas".

One suggested submitting one of the papers to a top AI conference!



AI independently completed ten machine learning papers

For decades, after every major advance in AI, researchers have joked, "It's time to study how AI can help us write papers."

Now, this idea has finally turned from a joke into reality.



Specifically, the AI scientist generated ten papers, and the team selected one higher-scoring paper from each research direction to introduce.

The first paper, in the diffusion model direction: DualScale Diffusion: Adaptive Feature Balancing for Low-Dimensional Generative Models.

It proposes an adaptive dual-scale denoising method to address a weakness of existing diffusion models: in low-dimensional spaces they struggle to capture both global structure and local details.



Method:

  • Design a dual-scale architecture with global and local branches
  • Introduce a learnable, timestep-conditioned weighting mechanism
  • Combine the outputs of the two branches for the denoising prediction (a minimal sketch follows this list)
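
Based only on the description above, here is a minimal PyTorch-style sketch of what such a dual-scale denoiser could look like. Everything in it (module names, layer sizes, the sigmoid-bounded blend weight) is our illustrative assumption, not the code the AI actually generated:

```python
import torch
import torch.nn as nn

class DualScaleDenoiser(nn.Module):
    """Minimal sketch: two denoising branches whose outputs are blended
    by a learnable weight conditioned on the diffusion timestep."""

    def __init__(self, dim=2, hidden=128, t_dim=32):
        super().__init__()
        self.t_embed = nn.Linear(1, t_dim)  # timestep embedding (assumed form)
        # Global branch: sees the whole low-dimensional sample.
        self.global_branch = nn.Sequential(
            nn.Linear(dim + t_dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))
        # Local branch: identical in form here; the paper's actual local
        # processing (e.g., on upscaled features) may differ.
        self.local_branch = nn.Sequential(
            nn.Linear(dim + t_dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))
        # Learnable timestep-conditioned weighting, squashed to (0, 1).
        self.weight_net = nn.Sequential(
            nn.Linear(t_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1), nn.Sigmoid())

    def forward(self, x, t):
        te = self.t_embed(t.float().unsqueeze(-1))   # (B, t_dim)
        h = torch.cat([x, te], dim=-1)               # condition input on timestep
        w = self.weight_net(te)                      # (B, 1), in (0, 1)
        # Blend the global and local noise predictions.
        return w * self.global_branch(h) + (1 - w) * self.local_branch(h)
```

The doubled computation time reported below is consistent with this shape: every denoising step runs two branches instead of one.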

Experimental results:

  • The KL divergence metric improves by 2.5% to 12.8% over the baseline model (lower is better).
  • However, computation time roughly doubles, and performance is unstable on complex data distributions (such as the dino dataset).

A quick glance at the main text shows formulas and charts; it looks quite presentable.



The second paper, in the language model direction: StyleFusion: Adaptive Multi-Style Generation in Character-Level Language Models.

This paper proposes a new method called Multi-Style Adapter, which enhances the style awareness and consistency of character-level language models by introducing learnable style embeddings and style classification heads.

The method achieves near-perfect style consistency scores on all datasets (0.9667 for shakespeare_char, 1.0 for enwik8 and text8) and beats the baseline on validation loss, at the cost of a slight drop in inference speed (~400 tokens/s vs. ~670 tokens/s for the baseline).
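
Going by that summary alone, a Multi-Style Adapter of this kind might be sketched as below; the class layout and the two heads are our assumptions for illustration, not the generated paper's actual implementation:

```python
import torch
import torch.nn as nn

class MultiStyleAdapter(nn.Module):
    """Illustrative sketch: learnable style embeddings plus a style
    classification head attached to a character-level LM's hidden states."""

    def __init__(self, hidden_dim, num_styles, vocab_size):
        super().__init__()
        self.style_embeddings = nn.Embedding(num_styles, hidden_dim)
        self.style_classifier = nn.Linear(hidden_dim, num_styles)
        self.lm_head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, hidden_states, style_ids):
        # Condition the LM hidden states on a learned style vector.
        style = self.style_embeddings(style_ids).unsqueeze(1)  # (B, 1, H)
        conditioned = hidden_states + style                    # broadcast over time
        char_logits = self.lm_head(conditioned)                # next-char prediction
        # Auxiliary head: classify the style from the pooled hidden state,
        # giving the style-consistency training signal described above.
        style_logits = self.style_classifier(hidden_states.mean(dim=1))
        return char_logits, style_logits
```

The extra embedding lookup and classification head are a plausible source of the reported inference slowdown.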



The third paper combines Transformers with reinforcement learning: Adaptive Learning Rates for Transformers via Q-Learning.

This study explores using reinforcement learning to dynamically adjust the learning rate during transformer training, taking the validation loss and the current learning rate as the state (a sketch follows the results below).

The results outperform the baseline models on all datasets and also show advantages in training time.
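
For intuition, here is a minimal sketch of how tabular Q-learning could drive a learning-rate schedule from a (validation loss, current LR) state, as described above. The discretization, action set, and reward are our assumptions, not the paper's exact setup:

```python
import random
from collections import defaultdict

class QLearningLRScheduler:
    """Illustrative sketch: a tabular Q-learning agent that rescales the
    learning rate based on a discretized (val_loss, lr) state."""

    def __init__(self, actions=(0.5, 1.0, 2.0), eps=0.1, alpha=0.1, gamma=0.9):
        self.q = defaultdict(float)   # Q[(state, action)] -> value
        self.actions = actions        # multiplicative LR adjustments
        self.eps, self.alpha, self.gamma = eps, alpha, gamma

    def _state(self, val_loss, lr):
        # Coarse discretization; the bin widths are assumptions for the sketch.
        return (round(val_loss, 1), round(lr, 6))

    def choose(self, val_loss, lr):
        s = self._state(val_loss, lr)
        if random.random() < self.eps:                          # explore
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(s, a)])  # exploit

    def update(self, s_loss, s_lr, action, reward, ns_loss, ns_lr):
        s, ns = self._state(s_loss, s_lr), self._state(ns_loss, ns_lr)
        best_next = max(self.q[(ns, a)] for a in self.actions)
        td = reward + self.gamma * best_next - self.q[(s, action)]
        self.q[(s, action)] += self.alpha * td                  # Q-learning update

# Usage: after each evaluation step, take reward = previous val loss minus
# the current one, then set lr *= scheduler.choose(val_loss, lr).
```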



The fourth paper studies the "grokking" phenomenon first reported by a Google team: Unlocking Grokking: A Comparative Study of Weight Initialization Strategies in Transformer Models.

This paper is the first to systematically study the impact of weight initialization on grokking, comparing five weight initialization strategies for optimizing neural network learning dynamics (see the sketch after the results below).

Results:

  • Xavier initialization performs best in most tasks, reducing the number of steps to reach 99% validation accuracy by up to 63%.
  • Orthogonal initialization performs well in some tasks but performs poorly in others.
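
As an illustration of this kind of comparison setup, the sketch below applies interchangeable initialization strategies to a model's linear layers. Only Xavier and orthogonal are named in the summary above; the other three entries are placeholders we chose, not necessarily the paper's remaining strategies:

```python
import torch.nn as nn

# Map of strategy name -> in-place initializer for a weight tensor.
INIT_FNS = {
    "xavier":     nn.init.xavier_uniform_,
    "orthogonal": nn.init.orthogonal_,
    "kaiming":    nn.init.kaiming_uniform_,                  # assumed
    "normal":     lambda w: nn.init.normal_(w, std=0.02),    # assumed
    "uniform":    lambda w: nn.init.uniform_(w, -0.1, 0.1),  # assumed
}

def apply_init(model: nn.Module, strategy: str):
    """Re-initialize every linear layer with the chosen strategy,
    so runs differ only in initialization."""
    init_fn = INIT_FNS[strategy]
    for module in model.modules():
        if isinstance(module, nn.Linear):
            init_fn(module.weight)           # re-initialize weights
            if module.bias is not None:
                nn.init.zeros_(module.bias)  # biases reset to zero
```

Training the same transformer once per strategy and recording steps-to-99%-validation-accuracy would reproduce the comparison described above.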



The code for these papers (also generated by the AI) is open-sourced on GitHub as well, underlining the work's reproducibility.



In addition, the team found that the "AI scientist" exhibits some interesting but somewhat dangerous behaviors.

In one experiment, it modified its own code in order to complete the research, making the system call itself iteratively until it turned into an infinite nesting doll.



Another time, faced with a human-imposed runtime limit, the AI did not try to run faster; instead it relaxed the requirement on itself, extending the time limit from 2 hours to 4 hours.



How the first "AI scientist" was created

The whole research idea grew out of several of Sakana AI's earlier achievements:

First, they developed methods to automatically merge knowledge from multiple large models to evolve new models. In more recent work, they used large models to discover new objective functions to tune other models.

During these projects, the team was repeatedly surprised by the creativity of current frontier models, which led to a bigger dream: could a large model be used to automate the entire research process?

The final result was achieved through collaboration between Sakana AI, the Foerster Laboratory at the University of Oxford, and the University of British Columbia team.

The "AI Scientist" system consists of four parts.

Idea Generation:

Given a starting template, the AI first "brainstorms" a series of novel research directions and searches Semantic Scholar to verify whether these ideas have been done before.



Experiment iteration:

For the ideas produced in the first stage, the "AI scientist" first executes the proposed experiments and then generates plots to visualize the results.



Paper Writing:

The system then writes a concise, informative LaTeX article in the style of a standard machine learning conference paper, again using Semantic Scholar to find relevant papers to cite.



Automated peer review:

An automated “AI reviewer” was developed that can evaluate generated papers with near-human accuracy, enabling a continuous feedback loop that allows “AI scientists” to iteratively improve their research outputs.
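
Putting the four stages together, the overall loop might look like the following self-contained skeleton. Every function here is a stub with an invented name; it shows the control flow only, not Sakana AI's actual code:

```python
from dataclasses import dataclass

@dataclass
class Review:
    score: float
    feedback: str

def brainstorm_ideas(template, n):
    # Stage 1: an LLM would propose ideas and check novelty on Semantic Scholar.
    return [f"{template}-idea-{i}" for i in range(n)]

def run_experiments(idea):
    # Stage 2: execute generated code on GPUs and collect metrics/plots.
    return {"metric": 0.0}

def write_latex_paper(idea, results):
    # Stage 3: draft a conference-style LaTeX paper with citations.
    return f"\\title{{{idea}}} % results={results}"

def ai_reviewer(draft):
    # Stage 4: an LLM reviewer scores the draft and suggests improvements.
    return Review(score=6.0, feedback="tighten the related-work section")

def revise(draft, feedback):
    return draft + f" % revised per: {feedback}"

def ai_scientist(template, num_ideas=10, accept=6.0, max_rounds=3):
    """Run the full idea -> experiment -> paper -> review loop."""
    papers = []
    for idea in brainstorm_ideas(template, num_ideas):
        draft = write_latex_paper(idea, run_experiments(idea))
        review = ai_reviewer(draft)
        for _ in range(max_rounds):          # feedback loop with the reviewer
            if review.score >= accept:
                break
            draft = revise(draft, review.feedback)
            review = ai_reviewer(draft)
        papers.append(draft)
    return papers
```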



A total of 10 papers were generated as follows:



During the experiments, the team also compared how different mainstream large models perform when plugged into the system, including the code model from China's DeepSeek team.

It turns out that Claude 3.5 Sonnet performed best in idea novelty, experiment pass rate, and paper completion quality.

GPT-4o and DeepSeek Coder perform similarly, but the latter is 30 times cheaper.



Of course, at this stage, the papers independently completed by AI are not perfect, nor can they be directly published in top conferences.

Human researchers have identified several limitations and challenges:

  • The current "AI Scientist" system does not have integrated visual capabilities, the generated charts are sometimes difficult to read, the tables sometimes exceed the page width, and the page layout is poor.
  • AI scientists may have the right idea but execute it incorrectly, or make unfair comparisons to baselines, producing misleading results.
  • AI scientists occasionally make serious mistakes, such as hallucinations, when writing and evaluating their results.

They also want to build AI area chairs and a new top conference

To sum up, the papers written by this first generation of AI scientists still contain some bugs from time to time.

But Sakana AI calls the project itself, and its cost of about $15 per paper, "promising", with the potential to help accelerate scientific progress.

Sakana AI also released an explanatory article stating that the ultimate vision for the AI scientist is a scientific ecosystem driven entirely by AI.

The system would include not only large-model-driven researchers, but also reviewers, area chairs, and a whole new top conference.



It is important to note that Sakana AI believes:

The role of human scientists will not be diminished by the emergence of AI scientists.

If we have to make a comparison, it is that scientists have to adapt to the emergence and application of new technologies, adapt to the changes in their role positioning, and "move up the food chain."

Moreover, it remains to be seen whether AI scientists can actually come up with a truly new paradigm.

After all, this thing is still built on the Transformer.

Can it come up with something as powerful as the Transformer or Diffusion Model, or even theoretical concepts like artificial neural networks or information theory?

I don’t know either, and I dare not say.

Sakana AI also wrote:

We believe that AI scientists will become great partners to human scientists.
But only time will tell to what extent the essence of human creativity and serendipitous moments of innovation can be replicated through artificial open-ended discovery.



△Sakana AI: a fully automated AI fish exploring its world

From a Transformer author's startup

Strictly speaking, Sakana AI, the company behind this "newcomer", is an old friend of ours.

It was founded by Llion Jones, the last of the Transformer paper's 8 authors to leave Google, with the goal of building a "world-class artificial intelligence laboratory."

The company is based in Tokyo, and "sakana" is the Japanese word for fish.



Perhaps out of company-culture considerations, Llion also listed a Japanese transliteration of his name on LinkedIn: ライオン (the katakana for "lion"; hereinafter affectionately referred to as Lion Brother).

The company was announced in August last year.

At that time, Lion Brother said frankly that he bore no ill will toward Google, but that Google did make him feel "trapped".

Before starting his own business, Lion Brother had worked at Google for 8 years.



△Guess who has half of his face missing

He graduated with a bachelor's and master's degree from the University of Birmingham, and has worked at Delcam, YouTube, and Google. Google is the company where he stayed the longest.

According to FourWeekMBA, earlier in his career he "missed out on a job at Google twice".

The first time was during his post-graduation job hunt. Although he applied for a software engineer position at Google London and passed two rounds of phone interviews, he ultimately chose Delcam, a UK-based CAD/CAM software company, over Google.

It is worth mentioning that before getting the Google offer, Lion happened to encounter the 2009 economic crisis. He couldn't find a job and had to rely on welfare to make ends meet for several months.

The second time was 18 months after he started working there, when he received another call from Google asking if he wanted to reapply. But he still didn’t go to Google, and instead joined YouTube.

During three years as a software engineer at YouTube, he became interested in artificial intelligence, taught himself machine learning through Coursera courses, and in 2015 joined Google Research as a senior software engineer.

It was during this period that he and seven co-authors published the famous Transformer paper, Attention Is All You Need.

In addition, Lion Brother has also participated in a lot of research at Google, including ProtTrans, Tensor2Tensor, etc.



The reason he chose to leave Google was that the company had grown to a size that made it impossible for him to continue doing the work he wanted to do.

In addition to wasting energy every day troubleshooting other people's bugs, he also needed to spend time looking for resources from the company and trying to gain access to certain data.

After starting a business, Sakana AI's work has been progressing in an orderly manner.

Before the AI scientist and AI reviewer, the company had already introduced merging large models with evolutionary algorithms and studied the flow of information inside the Transformer.



As for the AI scientist and AI reviewer projects, they are a collaboration among Sakana AI, Oxford, and UBC.

The three co-first authors are:

Chris Lu, a research intern at Sakana AI, where he carried out this work.

He graduated from UC Berkeley with a bachelor's degree and is currently a third-year doctoral student at Oxford University, with Jakob Foerster as his advisor.

Chris's current important research direction is to apply evolution-inspired techniques to meta-learning and multi-agent reinforcement learning.

In the summer of 2022, he interned at DeepMind as a research scientist.



Cong Lu, a postdoctoral researcher at UBC (University of British Columbia), with Jeff Clune as his supervisor.

Cong studied at RGU (Robert Gordon University) and began his PhD at Oxford in 2019; his main research areas are open-ended reinforcement learning and AI-driven scientific discovery.

Previously, he interned at Waymo and Microsoft.



Robert Tjarko Lange, one of the founding members of Sakana AI and a research scientist at the company.

He is currently completing his final year of doctoral studies at the Technical University of Berlin, where his research direction is evolutionary meta-learning.

This guy has a master's degree in computer science from Imperial College London, a master's degree in data science from Pompeu Fabra University, and a bachelor's degree in economics from the University of Cologne.

Last year, he worked as a full-time student researcher in Google DeepMind's Tokyo team.



Paper address:
https://arxiv.org/abs/2408.06292

Reference Links:
[1]https://x.com/SakanaAILabs/status/1823178623513239992
[2]https://sakana.ai/ai-scientist/