2024-09-17
Mingmin from Aofei Temple
QbitAI | Public account QbitAI
Less than a week after its release, the moat of OpenAI's strongest model, o1, seems to have disappeared.
Someone discovered that a paper published by Google DeepMind in August describes a principle and working method almost identical to o1's.
The study shows that increasing test-time computation can be more effective than scaling up model parameters.
Using the compute-optimal test-time scaling strategy proposed in the paper, a smaller base model can outperform a model 14 times larger.
Netizens commented:
"This is almost exactly the principle behind o1."
As everyone knows, Altman likes to stay ahead of Google. Is that why the o1 preview was released first?
Others lamented:
It is just as Google itself once said: no one has a moat, and no one ever will.
Meanwhile, OpenAI has just made o1-mini 7 times faster and raised its usage limit to 50 messages a day; o1-preview was raised to 50 messages a week.
The Google DeepMind paper is titled "Scaling LLM Test-Time Compute Optimally Can Be More Effective than Scaling Model Parameters".
The research team started from an analogy with human thinking: people spend more time deliberating when faced with complex problems, so can the same hold for LLMs?
In other words, when faced with a complex task, can an LLM make more effective use of additional computation at test time to improve its accuracy?
Previous studies had shown that this direction is feasible, but the gains were relatively limited.
This study therefore set out to quantify how much model performance can be improved with relatively little additional inference computation.
They designed a set of experiments, evaluating PaLM 2-S* on the MATH dataset.
Two main methods were analyzed:
(1) Iterative self-revision: the model answers a question multiple times, revising after each attempt to produce a better answer.
(2) Search: the model generates multiple candidate answers in parallel, and a verifier (a process reward model, PRM) scores them to select the best one.
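The control flow of these two strategies can be sketched as follows. This is a toy illustration with random stubs standing in for the model and the verifier; in the paper those are PaLM 2-S* and a trained PRM, and the function names here are hypothetical.

```python
import random

def generate_answer(question, previous_attempt=None):
    """Sample an answer; conditioning on a prior attempt models revision.
    (Toy stub: an 'answer' is just a quality score in [0, 1].)"""
    base = 0.5 if previous_attempt is None else min(1.0, previous_attempt + 0.1)
    return max(0.0, min(1.0, base + random.uniform(-0.2, 0.2)))

def verifier_score(question, answer):
    """A real PRM would score each reasoning step; this toy scores the
    final answer directly."""
    return answer

def sequential_revisions(question, budget):
    """(1) Iterative self-revision: spend the whole budget on successive
    revisions of a single answer chain."""
    attempt = generate_answer(question)
    for _ in range(budget - 1):
        attempt = generate_answer(question, previous_attempt=attempt)
    return attempt

def parallel_search(question, budget):
    """(2) Search: sample independent candidates in parallel and keep
    the one the verifier scores highest."""
    candidates = [generate_answer(question) for _ in range(budget)]
    return max(candidates, key=lambda a: verifier_score(question, a))
```

The key contrast: self-revision spends compute depth-wise (each sample sees the previous attempt), while search spends it breadth-wise (samples are independent and a verifier arbitrates).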
With the self-revision method, the gap between the standard best-of-N strategy and the compute-optimal scaling strategy gradually widens as test-time computation increases.
With the search method, the compute-optimal scaling strategy shows a clear advantage early on, and in some cases it matches the performance of best-of-N while using only about 1/4 of the computation.
In a FLOPs-matched evaluation against pre-training compute, the team compared PaLM 2-S* (using the compute-optimal strategy) with a pre-trained model 14 times larger (using no additional inference).
The results show that with the self-revision method, when inference tokens are far fewer than pre-training tokens, spending compute at test time beats spending it on pre-training. But as that ratio grows, or on harder problems, pre-training still wins.
In other words, in both cases, whether test-time compute scaling pays off depends critically on the difficulty of the prompt.
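That difficulty dependence is what makes the strategy "compute-optimal": the budget is allocated per question rather than uniformly. A minimal sketch of the idea, where the difficulty bins and thresholds are illustrative assumptions rather than the paper's exact numbers:

```python
def compute_optimal_allocation(estimated_difficulty, budget):
    """Choose a test-time strategy for one question, given an estimated
    difficulty in [0, 1] and a fixed sample budget.

    The thresholds below are made up for illustration; the principle they
    encode follows the paper: easy prompts benefit most from sequentially
    revising a nearly-correct first answer, hard prompts from searching
    over many independent candidates with a verifier.
    """
    if estimated_difficulty < 0.3:
        # Easy: the first answer is roughly right, so refine it in place.
        return {"strategy": "sequential_revisions", "revisions": budget}
    if estimated_difficulty < 0.7:
        # Medium: split the budget across a few chains that each revise.
        chains = max(1, budget // 4)
        return {"strategy": "mixed", "chains": chains,
                "revisions_per_chain": budget // chains}
    # Hard: cast a wide net and let the verifier pick the best candidate.
    return {"strategy": "parallel_search", "samples": budget}
```

A fixed best-of-N baseline would ignore difficulty entirely and always take the third branch, which is exactly where the compute-optimal strategy gains its claimed up-to-4x savings.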
The study also compared different PRM search methods; the results show that lookahead search (far right) requires the most computation.
With small compute budgets, the compute-optimal strategy can save up to 4x the resources.
This study reaches almost the same conclusions as OpenAI's o1 model.
The o1 model learns to refine its chain of thought, try different strategies, and recognize its mistakes; with more reinforcement learning (train-time compute) and more time to think (test-time compute), o1's performance keeps improving.
The difference is that OpenAI shipped a model first, while Google ran its experiments on PaLM 2 and has released no updates on Gemini 2.
This finding inevitably recalls the view put forward in a leaked Google internal memo last year:
"We have no moat, and neither does OpenAI. Open-source models can beat ChatGPT."
Nowadays, every company is doing research at a breakneck pace, and no one can guarantee staying ahead.
The only moat may be hardware.
(Is that why Musk is building a computing center?)
Some say that right now Nvidia directly controls who has more computing power. But what if Google or Microsoft develops a custom chip that works better?
It is worth noting that OpenAI's first in-house chip was recently revealed: it will reportedly use TSMC's most advanced A16 angstrom-class process and is designed specifically for Sora video applications.
Clearly, on the large-model battlefield, competing on the model alone is no longer enough.
Reference links:
https://www.reddit.com/r/singularity/comments/1fhx8ny/deepmind_understands_strawberry_there_is_no_moat/