
Li Yanhong's internal speech exposed: the three major cognitive misunderstandings about large models, and why the gap between models will widen

2024-09-16


"there are quite a lot of misunderstandings about big models in the outside world," according to media reports recently, an internal speech by li yanhong was exposed. in a recent communication with employees, li yanhong talked about three cognitive misunderstandings about big models, covering big model competition, open source model efficiency, and intelligent agents.trendand other hot topics.



Li Yanhong mentioned that the gap between large models may widen in the future. He said the ceiling for large models is very high and we are still far from the ideal state, so models must be iterated, updated, and upgraded continuously and rapidly; this requires sustained investment over several years, or even more than a decade, to keep meeting user needs while reducing costs and improving efficiency.



The following is the content of the internal speech.

Q: Some people believe that there are no longer any barriers between the capabilities of large models. Do you agree?

Robin Li: I disagree with this statement. I think there are quite a lot of misunderstandings about large models. Every time a new model is released, its makers want to say how good it is, comparing it with GPT-4o on test sets or rankings and claiming the score is almost the same, or even higher on individual items. But this does not prove that the gap between these newly released models and the most advanced OpenAI models is small.



The gap between models is multi-dimensional. One dimension is capability: gaps in basic abilities such as understanding, generation, logical reasoning, and memory. The other dimension is cost: to obtain that capability, or to answer those questions, how much do you have to pay? Some models have very slow inference; even if they can achieve the same result, the experience is still not as good as the most advanced models. There is also overfitting to test sets. Every model that wants to prove its ability competes on the rankings, and competing on rankings means guessing what others will test and what tricks will answer those questions correctly. So on the rankings or test sets your abilities look very close, but in actual application there is still a clear gap.



The hype from some self-media accounts, combined with the incentive to promote each new model when it is released, has given people the impression that the difference in capabilities between models is small, but that is not the case. In actual work I do not allow our technical staff to chase rankings. The real measure of the Wenxin model's ability is whether it can meet user needs in specific application scenarios and whether it can generate value. That is what we really care about.



We need to see that, on the one hand, there is still a fairly obvious gap in model capabilities, and on the other hand, the ceiling is very high: what you have achieved today is still far from what you actually want to do and from the ideal state, so the model needs to be iterated, updated, and upgraded continuously and quickly. Even if you think the gap is not that big today, will it have widened in a year? Is there anyone who can keep investing in this direction for several years, or even more than ten years, so that the model increasingly meets user needs and scenarios, and the need to improve efficiency or reduce costs? The gap between different models is not getting smaller; it is getting bigger. It is just that when people do not know the real needs, they may think that doing well on test-set questions is enough.

As for the so-called lead, some say that being 12 or 18 months ahead is not that important. I do not think so. Each of our companies is in a fully competitive market environment; no matter what direction you go in, there are many competitors. If you can always guarantee staying 12 to 18 months ahead of your competitors, you will be invincible. Do not think that 12 to 18 months is a short time. Even if you can guarantee staying only 6 months ahead of your competitors, that is a win: your market share may be 70%, while a competitor may have only 20% or even 10%.



Q: Some say the open-source model is narrowing the gap with the closed-source model. Will this destroy the business model of closed-source large-model companies?

Robin Li: This question is highly related to the previous one. I just said that besides capability or effect, a model also needs to be efficient, and the open-source model is not efficient. To be precise, the closed-source model should be called a commercial model. In the commercial model, countless users or customers share the same resources: the R&D costs, the machines used for inference, and the GPUs. With an open-source model, you have to deploy a set of things yourself, and what is the GPU utilization after deployment? For Wenxin large model 3.5 and 4.0, utilization is above 90%. How many people are using an open-source model you deploy yourself? We have told the public that the Wenxin model is called more than 600 million times a day and generates more than one trillion tokens every day. Which open-source model can say how many times it is called a day and how many tokens it generates? If no one uses it, how is the cost shared? How can its inference cost compare with the commercial model?
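The cost-sharing argument here is essentially amortization arithmetic: a largely fixed GPU bill divided by however many calls actually get served. The short Python sketch below illustrates that arithmetic; every figure in it is a hypothetical assumption for illustration, not a number from the speech.

```python
# Illustrative amortization arithmetic only; all figures are hypothetical.

def cost_per_call(gpu_monthly_cost: float,
                  capacity_calls_per_month: float,
                  utilization: float) -> float:
    """Amortized GPU cost per served call.

    gpu_monthly_cost:         fixed monthly cost of the GPUs serving the model
    capacity_calls_per_month: calls the hardware could serve at full load
    utilization:              fraction of that capacity actually used (0..1)
    """
    calls_served = capacity_calls_per_month * utilization
    return gpu_monthly_cost / calls_served

# Hypothetical shared commercial endpoint: many tenants keep utilization high.
shared = cost_per_call(gpu_monthly_cost=1_000_000,
                       capacity_calls_per_month=600_000_000,
                       utilization=0.9)

# Hypothetical single-tenant self-hosted deployment: same cost structure per
# unit of capacity, but the hardware sits mostly idle.
self_hosted = cost_per_call(gpu_monthly_cost=10_000,
                            capacity_calls_per_month=6_000_000,
                            utilization=0.1)

print(f"shared endpoint: ~${shared:.6f} per call")
print(f"self-hosted:     ~${self_hosted:.6f} per call")  # ~9x higher here
```

Under these assumed numbers the self-hosted deployment pays roughly nine times more per call, purely because the same class of hardware is utilized at 10% instead of 90%.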



Before the era of large models, people were used to the idea that open source meant free and low cost. Back then, every commercial product on the market had to be paid for: when you bought a computer, Microsoft might charge you a certain amount of money for Windows, but you did not have to spend that money if you ran Linux. Since Linux is open source, every programmer can see the code; if something is wrong, anyone can fix it and check the change in, everyone progresses together, and you keep building on the shoulders of giants. But these things do not hold in the era of large models. In the era of large models, people constantly talk about how expensive GPUs are; computing power is a key factor in determining the success or failure of large models. Does the open-source model provide you with computing power? If it does not, how can you use it efficiently? The open-source model cannot solve this problem.



In the past, when you bought a computer, you had already paid for the computing power, but that is not the case for large-model inference, which is actually very expensive. Therefore, the value of open-source large models lies in teaching and scientific research: if you want to understand how a large model works, you are at a clear disadvantage without access to the source code. But in the real business field, when you are pursuing efficiency, effect, and the lowest cost, open-source models have no advantage.



Q: How will AI applications evolve? Why the emphasis on intelligent agents?

Robin Li: The development of large models has to go through these stages. At the beginning it assists people, and a human still makes the final call: we judge its output, and it is allowed to go out only when everything looks good. This is the copilot stage. The next step is the agent stage. There are various definitions of an agent, but the most important one is that it has a certain degree of autonomy and the ability to use tools, reflect, and self-evolve. With further automation comes the so-called AI worker, which can do all kinds of mental and physical labor like a human and complete all kinds of work independently. There has to be such a progression.



the judgment that "intelligent agents are the most important development direction for large models" is actually a non-consensus. at the baidu create conference, we launched three products, agentbuilder, appbuilder, and modelbuilder. agentbuilder and appbuilder are both about intelligent agents, one has a lower threshold, and the other has more powerful functions. after we finished explaining, some people finally began to understand that this thing is indeed interesting, can generate value, and can already produce something that everyone feels usable with a relatively low threshold. since then, the popularity of intelligent agents has gradually increased, and many people have begun to be optimistic about the development direction of intelligent agents.however, to date, there is still no consensus on intelligent agents, and there are not many companies like baidu that regard intelligent agents as the most important strategy and development direction of large models.



Why do we emphasize intelligent agents so much? Because the threshold for building intelligent agents is indeed very low. Last year we said we would roll out applications and that everyone should build applications, but in fact many people still did not know how: they did not know whether a direction was achievable, or which capabilities to use to generate value in a given scenario. There were countless uncertainties, and people did not know how to turn models into applications. The intelligent agent, however, provides a very direct, very efficient, and very simple way: it is quite convenient to build an intelligent agent on top of the model, as the sketch below illustrates. That is why tens of thousands of new intelligent agents are created on the Wenxin platform every week.
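As a rough illustration of what "building an agent on top of the model" can mean in practice (autonomy, tool use, and feeding observations back for reflection), here is a minimal agent loop in Python. The `call_model` stub, the tool names, and the `USE_TOOL`/`FINAL` prompt convention are all hypothetical placeholders; they do not represent Baidu's AgentBuilder or AppBuilder interfaces.

```python
# A minimal tool-using agent loop. call_model() is a stub standing in for a
# real LLM API; the tools and prompt conventions are illustrative assumptions.
from typing import Callable, Dict

# Hypothetical tools the agent may invoke.
TOOLS: Dict[str, Callable[[str], str]] = {
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),  # demo only
    "search": lambda query: f"(stubbed search results for: {query})",
}

def call_model(prompt: str) -> str:
    """Placeholder for a real model call: first asks for a tool, then answers."""
    if "TOOL_RESULT" not in prompt:
        return "USE_TOOL calculator 2+2*10"
    return "FINAL The answer is 22."

def run_agent(task: str, max_steps: int = 5) -> str:
    """Loop: ask the model, execute requested tools, feed results back."""
    prompt = f"Task: {task}"
    for _ in range(max_steps):
        reply = call_model(prompt)
        if reply.startswith("FINAL"):
            return reply[len("FINAL"):].strip()
        if reply.startswith("USE_TOOL"):
            _, tool_name, arg = reply.split(" ", 2)
            result = TOOLS[tool_name](arg)
            # Append the observation so the model can reflect and continue.
            prompt += f"\nTOOL_RESULT {tool_name}: {result}"
    return "Gave up after max_steps."

if __name__ == "__main__":
    print(run_agent("What is 2 + 2 * 10?"))
```

The point of the sketch is only that the agent-specific code is a short loop; the heavy lifting sits in the underlying model and whatever tools the platform exposes.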



We have seen the trend toward intelligent agents and have good prerequisites. In addition to the powerful capabilities of the model itself, we also have good distribution channels: our apps, especially Baidu Search, have hundreds of millions of users. While searching, users actively express their needs to us, which tells us which intelligent agent can better answer their questions and meet their needs. This is a natural matching process, so we are best placed to help these developers distribute their intelligent agents.
