
Large models are starting to learn to think like humans. How far are they on the road to AGI?

2024-09-18


Two months ago, the weak mathematical ability of large models drew widespread attention: many models on the market could not even answer the simple question "Which is bigger, 9.11 or 9.9?" Two months later, the industry is gradually solving the problem of large models' limited mathematical ability.
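One way to see why this question trips models up: read as decimal numbers, 9.9 is larger, but read digit-group by digit-group like a software version number, 9.11 is "larger". A short illustrative sketch of the two readings:

```python
# Two readings of "9.11 vs 9.9": as decimal numbers and as version numbers.

def as_decimal(s: str) -> float:
    """Read the string as a single decimal number."""
    return float(s)

def as_version(s: str) -> tuple[int, ...]:
    """Read the string as dot-separated integer components (like a version)."""
    return tuple(int(part) for part in s.split("."))

a, b = "9.11", "9.9"

# As decimals, 9.9 is larger: 9.90 > 9.11.
print(as_decimal(a) > as_decimal(b))   # False

# As version numbers, 9.11 is "larger": (9, 11) > (9, 9).
print(as_version(a) > as_version(b))   # True
```

A model trained on text containing both conventions can plausibly latch onto either reading, which is one informal explanation of the inconsistent answers.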
In the early morning of September 13th, Beijing time, OpenAI released a new series of reasoning models without prior notice, comprising three models: o1-preview, o1, and o1-mini. This is OpenAI's long-rumored "Strawberry" project, with advanced reasoning capabilities. According to OpenAI, the new models perform well in math and coding, scoring 83% on a qualifying exam for the International Mathematical Olympiad (IMO).
OpenAI is not the only one trying to break through the limits of mathematical capability. Google's DeepMind team has also launched an AI system, AlphaProof, to solve complex mathematical problems.
Breaking through the limits of mathematical capability is a new step in the evolution of AI technology and on the road to AGI (artificial general intelligence). Simon See, professor at Coventry University and global director of NVIDIA's AI Technology Center, believes the industry's efforts to improve AI's mathematical ability include combining LLMs (large language models) with other technologies, and that this combination of technologies has created a potential driving force toward AGI.
How can the limits of mathematical ability be overcome?
"This is a significant improvement in complex reasoning tasks and represents a new level of artificial intelligence capability," OpenAI wrote in introducing the o1 series. OpenAI CEO Sam Altman also said on social media that the new model is the beginning of a new paradigm: AI that can perform general-purpose complex reasoning.
Enhanced mathematical ability is an important feature of the series. OpenAI says the updated models perform similarly to doctoral students on challenging benchmark tasks in physics, chemistry, and biology. On a qualifying exam for the International Mathematical Olympiad (IMO), GPT-4o correctly solved only 13% of the problems, while the new model scored 83%.
As for how the new model achieves better math and programming ability, OpenAI said it used a large-scale reinforcement learning algorithm to "teach" the model to think productively with a chain of thought in a highly data-efficient training process, much as humans think at length before answering difficult questions. As reinforcement learning increases and thinking time grows, o1's performance continues to improve. OpenAI researcher Noam Brown said that o1 opens a new dimension for scaling large models: no longer limited by the bottleneck of pre-training, they can now also scale inference-time computation. As for the uses of enhanced reasoning, OpenAI said it could annotate cell sequencing data in healthcare, generate complex mathematical formulas in physics research, and so on.
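OpenAI has not published o1's training details. As a loose illustration of the "spend more compute at inference time" idea, one common pattern is to sample several candidate reasoning chains and keep the best-scoring one; the generator and scorer below are hypothetical stand-ins, not OpenAI's actual method.

```python
import random

# Sketch of best-of-n inference-time scaling: sample several candidate
# reasoning chains, score each, and keep the best. Both generate_chain and
# score_chain are toy placeholders for a real model and a real verifier.

def generate_chain(question: str, rng: random.Random) -> tuple[int, str]:
    # Stand-in for sampling a chain of thought from a language model.
    steps = rng.randint(1, 5)
    return steps, f"{question} -> {steps}-step chain -> answer"

def score_chain(candidate: tuple[int, str]) -> int:
    # Stand-in for a learned verifier; here we pretend longer chains score higher.
    steps, _ = candidate
    return steps

def best_of_n(question: str, n: int, seed: int = 0) -> tuple[int, str]:
    # More samples (more inference-time compute) can only raise the best score.
    rng = random.Random(seed)
    candidates = [generate_chain(question, rng) for _ in range(n)]
    return max(candidates, key=score_chain)

print(best_of_n("Which is bigger, 9.11 or 9.9?", n=8)[1])
```

The point of the sketch is the monotonicity: with a fixed seed, the 16-sample pool contains the 4-sample pool, so spending more inference-time compute never lowers the best verifier score.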
Google DeepMind enhances the final performance of its AI systems by combining LLMs with other technologies. AlphaProof, also based on reinforcement learning, is a system for mathematical reasoning. It trains itself to prove mathematical statements in the Lean programming language (a language used to help verify theorems), coupling a pre-trained language model with the AlphaZero reinforcement learning algorithm. According to Google, Lean enables the system to verify the correctness of mathematical proofs. When it encounters a problem, AlphaProof generates candidate solutions and then proves or refutes them by searching over possible proof steps in Lean.
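What makes Lean useful here is that a proof is an artifact the kernel can check mechanically, so a generated candidate is either accepted or rejected with no ambiguity. A minimal example of the kind of statement Lean 4 can verify (an elementary fact, not one of AlphaProof's competition problems):

```lean
-- A tiny machine-checkable statement in Lean 4:
-- addition on natural numbers is commutative.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

If a generated proof term does not type-check, Lean rejects it, which is exactly the accept/refute signal a search-based system needs.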
Whether or not their technical principles are similar, AlphaProof and OpenAI's o1 both tend toward deep deliberation compared with previous models, rather than relying solely on an LLM's ability to quickly predict and generate the next token.
How do we get to AGI?
Previously, a large-model developer told reporters that one reason for large models' weak mathematical ability is that not enough high-quality mathematical data is used in training; as data quality improves, the problem can be alleviated. But beyond training data, industry analysis holds that LLMs' poor mathematical ability also stems from next-token prediction not being truly intelligent. Judging from recent developments, the industry, including OpenAI and Google DeepMind, is attacking the problem of poor mathematical and reasoning ability at the level of the AI system's operating mechanism: using various technologies to compensate for the shortcomings of how LLMs work and, to some extent, to make LLMs' way of thinking more human-like.
The industry is still debating the root causes of and solutions to the limits of LLM capability, how to solve problems such as mathematical ability, and how to move from today's LLMs to AGI. Several senior industry insiders recently discussed this at the GAIN Summit, a world artificial intelligence summit hosted by the Saudi Data and Artificial Intelligence Authority. At the summit, Simon See said that current artificial intelligence is "narrow". Many people believe LLMs will become the driving force for achieving AGI, but people do not really understand how they work, and LLM development is still at an early stage. Many problems remain to be solved; for example, models cannot be built ever larger because unlimited energy cannot be provided.
"We now have a lot of data, and if we train a model that is large enough, capabilities emerge. In my opinion, relying on a single technology is not feasible. The industry is now working on combining LLMs with other knowledge and technologies, such as new symbolic and calculational methods for understanding and reasoning," Simon See said, adding that the combination of different technologies has made great progress recently. DeepMind's AlphaProof combines the Lean programming language with language models to let AI be used for mathematical proofs. Combining LLMs with various technologies gives AI systems the potential to move toward AGI.
Antoine Blondeau, co-founder and managing partner of Alpha Intelligence Capital, also believes it is inevitable that machines will eventually surpass humans, but that this will take time, and much scientific work remains to be done. He believes AI will not be a single model but may be a combination of multiple models. Machines will eventually learn, as humans do, to observe, to prove or refute, and to generalize, and to learn in the real world.
On the current mechanisms and limits of LLMs, Antoine Blondeau argues that humans learn from life, 95% of which is like "video with sound": the essence of our lives largely plays out as video, with the other 5% coming from text such as books. Humans learn semantics from video; for example, when five fingers appear, it is probably a human or another animal. Humans also grasp temporal order and cause and effect from video. But when a machine learns from video, its task is to predict the next pixel, which is not the human way. If we cannot let machines learn as humans do, it will be difficult for machines to reach a higher level of intelligence.
Alex Smola, a well-known machine learning scientist and founder and CEO of the large-model startup Boson AI, pointed out that the limits of how LLMs operate are also tied to token prediction. He said that an LLM's ability to predict the next token (word) has been extended to understanding images and to understanding and producing sound; in the past 12 months, everything seems to have become a token.
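The next-token mechanism Smola describes can be sketched with a deliberately tiny model: count which token follows which in a corpus, then greedily emit the most frequent follower. This toy bigram model is purely illustrative of the prediction loop, not of how a real LLM is built.

```python
from collections import Counter, defaultdict

# Toy illustration of next-token prediction: a bigram model that always
# emits the most frequent follower observed in training.

def train_bigram(tokens: list[str]) -> dict[str, Counter]:
    counts: dict[str, Counter] = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts: dict[str, Counter], token: str) -> str:
    # Greedy decoding: pick the most common continuation of `token`.
    return counts[token].most_common(1)[0][0]

corpus = "the cat sat on the mat the cat ran".split()
model = train_bigram(corpus)
print(predict_next(model, "the"))  # "cat" follows "the" twice, "mat" once
```

Real LLMs replace the count table with a neural network over a long context window, but the output interface is the same: a distribution over the next token, sampled or maximized one step at a time.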
"To some extent, we have begun to exhaust the available tokens. A rough estimate is that there may be 100 trillion tokens that humans can use to build LLMs. There is still a large supply of video and audio, which will play a role to some extent; this also depends on NVIDIA or other companies producing chips that can process these modalities," Alex Smola said. For the foreseeable future, the core of LLMs may be sequence modeling. We can now see data and hardware converging, and probabilistic models are also evolving toward similar structures; the next few years will show how far this exploration can go.
Combining these technical advances and looking ahead, Antoine Blondeau believes AGI may be achieved within 10 to 20 years, given the current pace of evolution. Simon See believes that perhaps 80% of the path to AGI can be covered within 10 years, but that the last 20% will be very challenging and take longer.
(This article comes from China Business Network)