
A conversation with Volcano Engine's Tan Dai: Today the industry charges by token, but that will not be the only model in the future

2024-08-06


Text|Deng Yongyi

Editor|Su Jianxun

Cover source|Provided by the company

Emergence is a key phenomenon in the generative AI wave: when the model scales up to a critical point, AI will show human-like intelligence, able to understand, learn, and even create. "Emergence" also happens in the real world - silicon-based civilization is on the verge of taking off, and entrepreneurs and creators in the field of AI are using their wisdom and brains to light up the long journey to achieve AGI. At the time of the transition between old and new productivity, "Emergence of Intelligence" launches a new column "36 Emergences", in which we will record new thinking at this stage through dialogues with key figures in the industry.

In 2024, Volcano Engine has shed its previously low-key style and is moving into the large model market at a rapid pace.

In May, Volcano Engine held a high-profile press conference that pushed model prices down to the "floor price". Doubao's main model, Doubao Pro-32k, was cut to 0.0008 yuan per thousand tokens, 99.3% below the industry average. The whole industry was in an uproar, and other vendors followed suit.

But two months later, Volcano Engine, having "flipped the table" on price, no longer wanted to talk about it. "Now, using 1 billion tokens costs only 1,000 yuan, and there is not much point in cutting prices further. What matters is how much model capability improves at the same price," Tan Dai, president of Volcano Engine, told "Intelligence Emergence".

To some extent, the stormy "big promotion week" in May, when Alibaba, Tencent, iFlytek and other vendors joined Volcano Engine in announcing price cuts for large models, marked the large model field's entry into a new stage. The shared reasoning is this: although large models are already dazzling, the new AI pie they create is still too small. Computing power is too expensive, so users have little incentive to innovate, and vendors might as well give up some margin.

The effect was immediate. Average daily token usage of the Doubao large model has exceeded 500 billion, up from 120 billion before the price cut.

Compared with terms like price war and revenue, Tan Dai is more concerned about how many users he has brought in and how much they have done with the big model. "We don't look at short-term revenue, but rather how many customers we have established deeper cooperation with through this matter and how many problems we have helped them solve. The results will come naturally," he said.

People tend to predict the future based on the past. Large models and cloud computing are widely seen as similar businesses: heavy early investment in technology and engineering R&D, followed by a strong Matthew effect later on. The premise, however, is that the business built on them must grow very large before real economies of scale appear.

This year's large models are developing along this path: vendors are cutting prices round after round, making large model computing power more affordable and turning it into infrastructure like water, electricity and coal.

Tan Dai believes that large models will be adopted faster than cloud computing was. Products such as ChatGPT have already shown the world that AI is the future, whereas cloud computing had to go through a long period of evangelizing. But price cuts are only the first step toward letting everyone innovate without worry.

An explosion of AI applications requires continued investment in the model's technical challenges, such as controlling hallucinations and handling long texts. "I don't think developers are anxious. It's too early to talk about Killer Apps now." He said that large models are still in the "brick phone" era: "It will take many years for the iPhone to appear."

In 2021, Volcano officially entered the cloud market. It has only been three years since its launch, and it is still a newcomer in the cloud computing market. Therefore, AI is not only a ticket to a new era for Volcano Engine, but also an opportunity to overtake others - this is also the reason why Volcano has invested heavily in AI. In 2023, 70% of large model companies in China used the computing power services of Volcano Engine; and this year, MaaS (Model as a Service) will be the next battlefield.

The following is the transcript of the conversation between Intelligence Emergence and Tan Dai, which has been sorted and edited:

After "flipping the table" on price

The Emergence of Intelligence:The most discussed thing about Volcano Engine recently is that you were the first of the big companies to start the price-cutting trend. People say you “flipped the table”. What’s the thinking behind this?

Tan Dai:Our core consideration is to build an application ecosystem.

The Emergence of Intelligence:Was it a difficult decision for you to lower the price? How long did you think about it?

Tan Dai:It's not difficult at all. When we officially launched it in August last year, the cost was definitely very high. Later, we continued to optimize the engineering and it was widely used internally. After about a year of engineering optimization, we began to consider price reduction.

The Emergence of Intelligence:Why is this decision not difficult?

Tan Dai:We are thinking, what is the most important thing about this matter? We want to make the business ecosystem prosperous.

There are several thresholds behind this. The first is model capability, which Doubao has already reached. At our May 15 conference, everyone saw that when we introduced the large model, we did not mention our own evaluation set or evaluation results.

That is because everyone can try it for themselves. There are many third-party evaluations, including Zhiyuan, OpenCompass and others, and they all say that Doubao works very well.

The second is cost, which has been too high, so we first need to lower the price, and it must be a sustainable price. We are a To B business, and pricing that cannot support gross margin or profit in the long term is unsustainable.

The third is practical deployment, including plug-ins, cases and so on. This requires many things to come together. Reducing costs through engineering optimization is exactly what Volcano Engine, as a cloud vendor, does.

The Emergence of Intelligence:How did Volcano reduce the price to such a low level?

Tan Dai:We keep optimizing our model structure and inference engineering to reduce costs, and we pass the savings on to the industry.

First, for a large model offered as a service, the larger the scale, the lower the cost. At larger scale, different kinds of load can be handled together, with staggered and mixed scheduling. Cloud computing reduces costs by the same principle.

Second, when the scale is large enough, a little bit of optimization will bring enough benefits, and then there will be enough budget to build a good technical team.

The Emergence of Intelligence:This is a mutually reinforcing process. First, you have to make it cheaper, so that the scale can be larger, and then you can optimize it and get greater benefits.

Tan Dai:This is also thanks to the growing number of calls from the Doubao app. As you can see from earlier third-party data, the Doubao app ranks first among AI products. Within Douyin, more than 50 scenarios and business lines use it extensively, and there are also many external customers in invited testing, which together support the scale.

After the price reduction was announced on May 15, our call scale increased even faster, and we saw more areas for optimization.

The Emergence of Intelligence:What is the effect after the price reduction is officially announced?

Tan Dai:First of all, nobody has to worry about cost when innovating with AI anymore, and the scale is growing very fast. Many startups call more than 1 billion tokens every day. How much do 1 billion tokens cost now? Just 1,000 yuan. Since the release on May 15, average daily token usage per customer has increased 20 to 30 times.

And there are many usage scenarios that we have not thought of. For example, the original daily call volume of Doubao was more than 120 billion tokens. After the price reduction, it has now exceeded 500 billion tokens.

Secondly, when we announced the price reduction, some people said that price was not important, but gradually many manufacturers began to follow suit.
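The figures quoted in this answer can be checked with a few lines of arithmetic. The sketch below is only a back-of-envelope illustration in Python. It assumes the listed Doubao Pro-32k price of 0.0008 yuan per 1,000 tokens applies uniformly to every token, which the interview does not state, and the function name `cost_in_yuan` is hypothetical.

```python
# Back-of-envelope check of the figures quoted in the interview.
# Assumption (not from the source): one flat price applies to every token.

PRICE_PER_1K_TOKENS_YUAN = 0.0008  # listed Doubao Pro-32k price


def cost_in_yuan(tokens: float, price_per_1k: float = PRICE_PER_1K_TOKENS_YUAN) -> float:
    """Return the cost in yuan of `tokens` tokens at a flat per-1,000-token price."""
    return tokens / 1_000 * price_per_1k


# Cost of 1 billion tokens at the listed price.
cost_1b = cost_in_yuan(1_000_000_000)

# Growth in average daily usage: 120 billion -> 500 billion tokens.
usage_growth = 500e9 / 120e9

print(f"1 billion tokens: about {cost_1b:.0f} yuan")
print(f"Daily usage growth: about {usage_growth:.1f}x")
```

Under that assumption, 1 billion tokens come to roughly 800 yuan, the same order of magnitude as the "1,000 yuan" Tan Dai quotes, and average daily usage has grown a little more than fourfold.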

The Emergence of Intelligence:Your main model Doubao Pro 128k is priced at 0.005 yuan/1,000 tokens, which is 95.8% lower than the industry price, and the 32k model is 0.0008 yuan/1,000 tokens, which is 99.3% lower. This price can be said to be a "floor price". How is it determined?

Tan Dai:We set our goal first: to release the dividends while keeping the price sustainable. We won’t lose money, but we don’t need to make too much either. I initially thought it would be a 90% drop compared with the industry average, and I did not expect it to end up at 99%.
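As a rough consistency check, the two discounts quoted in the question imply an industry baseline price. The Python sketch below simply inverts each percentage. The helper name `implied_baseline` is hypothetical, and the result is an inference from the quoted numbers, not a figure given in the interview.

```python
# Infer the industry baseline implied by each quoted price and discount.
# Both the prices and the discounts are taken from the question above.


def implied_baseline(price: float, discount: float) -> float:
    """Return the price before a fractional discount (0.993 means 99.3% lower)."""
    return price / (1.0 - discount)


baseline_32k = implied_baseline(0.0008, 0.993)   # Doubao Pro-32k
baseline_128k = implied_baseline(0.005, 0.958)   # Doubao Pro 128k

print(f"Implied baseline, 32k model:  {baseline_32k:.3f} yuan per 1,000 tokens")
print(f"Implied baseline, 128k model: {baseline_128k:.3f} yuan per 1,000 tokens")
```

Both results land between 0.11 and 0.12 yuan per 1,000 tokens, so the two quoted discounts are consistent with a single industry reference price.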

The Emergence of Intelligence:Is there room for price reduction in the future?

Tan Dai:From this perspective, price is no longer a bottleneck. After it has dropped so low, no matter how much it drops, the benefits it brings to users are actually not that great. What is more important now is to improve the model capabilities at the same price, which is a more meaningful thing.

The Emergence of Intelligence:Isn’t this a bit like the previous wave of AI innovation in CV (computer vision)? Going from 70% to 90% accuracy matters a great deal, but in the end going from 95% to 98% does not mean much, since customers can accept either.

Tan Dai:It may not reach 98%, but we should look at it from the other side. In the past, bad cases (results that did not meet expectations) were 5%, and now they are 2%, which still more than doubles the effect.
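The two ways of reading the same improvement can be put side by side. This is a minimal Python sketch using only the percentages from the exchange above. The helper name `ratio` is hypothetical.

```python
# Compare a 95% -> 98% accuracy gain with the matching 5% -> 2% bad-case drop.
# The percentages come from the exchange above.


def ratio(numerator: float, denominator: float) -> float:
    """Return how many times larger `numerator` is than `denominator`."""
    return numerator / denominator


accuracy_gain = ratio(0.98, 0.95)        # accuracy after vs. before
bad_case_reduction = ratio(0.05, 0.02)   # bad cases before vs. after

print(f"Accuracy improves by a factor of {accuracy_gain:.2f}")
print(f"Bad cases shrink by a factor of {bad_case_reduction:.1f}")
```

Measured as accuracy, the gain is only about 3%. Measured as bad cases, the rate falls by a factor of 2.5, which is the side of the comparison Tan Dai emphasizes.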

The Emergence of Intelligence:There are also many comments in the industry saying that the current price war in China is not good for the industry because training large models is expensive, so no one can make money.

Tan Dai:I don't agree with this view. From an industry perspective, this has made China's AI flourish, which is a good thing.

The same is true for cloud computing. Cloud entered the price war earlier, which is a good thing. It makes digital transformation of enterprises easier and reduces costs. We cannot just look at it from the perspective of one company.

The Emergence of Intelligence:In the past, in the context of the Internet, the term "price war" had too much of a negative connotation.

Tan Dai:I think there is a difference. In the past, the To C Internet business model was different: as the saying goes, "the wool comes from the pig", meaning someone other than the user ends up paying. But in a To B price war, the company's revenue comes from the price itself. If it can keep offering that price, everyone benefits in the end.

The Emergence of Intelligence:This is definitely good news for developers. But what about you internally? Some sales colleagues at cloud vendors told us that they have no motivation to sell AI because AI cannot be sold at a high price. How do you think about this issue?

Tan Dai:That's a good question. I guess we have nothing to lose.

First of all, Volcano is a cloud platform, and customers don’t just want big models. They actually want an overall solution that includes big models, cloud, and data products.

The unit revenue of the large model itself has decreased, but through the large model you can help customers solve more problems, and you will have more opportunities to do more business with them in the future. If the large model works well, customers may even tell us that their existing IT architecture cannot cope and needs to be rebuilt, and then they can rebuild it on Volcano Engine.

The Emergence of Intelligence:How do you understand the reconstruction of IT architecture? Where are the spaces and opportunities?

Tan Dai:In the past, many things in IT expenditures were not solved through the cloud, but through the addition of software functions - all of which were done by manpower, and many things could not be converted into computing power.

But with the big model, many scenarios, whether Copilot or Autopilot, can be served by the big model, and the underlying layer of the big model is the cloud. The market that the cloud could not reach before has now become a market that it can reach.

The Emergence of Intelligence:How much do customers recognize AI now? When cloud computing first came out, many corporate users equated cloud computing with the concept of "advanced" and then went for digitalization. Has the big model achieved this effect now?

Tan Dai:It is not the case that after enterprises adopt AI, all IT environments will be immediately replaced by AI. This is definitely not possible. What they need to do is to first identify certain scenarios where AI can be used to improve efficiency.

For example, we have seen some customers whose upstream and downstream services collapse once the model calls increase in volume. They will then transform these architectures based on cloud native to support this load.

The Emergence of Intelligence:To what extent will these cases be reflected in income?

Tan Dai:It's not all about revenue, and we don't look at the short term right now. We look at how many customers we have established deeper cooperation with through this matter, how many problems we have helped them solve, and results will naturally emerge in the end.

The Emergence of Intelligence:If you don’t consider income in the short term, what indicator do you value most now?

Tan Dai:The number of tokens used by enterprises, but I think this model will not be the only one in the future. I have a friend who used to write novels but later "gave up prose and turned to poetry". I asked why he made this choice. He said that poetry is paid by the line, while novels are paid by the word: for the same 100 words, the price is different.

The business model of the big model will definitely change, and the final economic model is a more end-to-end model. For example, if there is an agent, you pay it based on how many problems it helps you solve. The same 100 words may be produced by different people, which is a more advanced business model.

The failure of AI applications is essentially due to model capability issues

The Emergence of Intelligence:Starting this year, the industry is discussing an old term: Killer App. Everyone is wondering why there are no killer apps yet. As an important infrastructure provider, what do you think of the current stage of AI applications?

Tan Dai:I think there is already a killer app: ChatGPT. Whether in number of users or revenue, it has grown faster than any other Internet product, including TikTok and Douyin, which is already a very strong signal.

When people talk about Killer Apps now, I think they are still looking at it from a To C perspective. Large models will certainly move very fast in consumer scenarios, because there is so much room for experimentation. Chat and emotional companionship apps, for example, are growing very quickly.

Secondly, we don’t just look at Killer Apps, because there are many productivity scenarios on the enterprise side, such as AI customer service. In these scenarios, the concept of Killer Apps does not apply, and we do not discuss them with indicators such as DAU.

Enterprise services don’t talk about Killer Apps. ERP (enterprise resource planning software) is effectively a Killer App, since every enterprise must have it, but no one uses that term for it.

The Emergence of Intelligence:From a global perspective, except for ChatGPT, which has reached the level of Killer App, other products are far behind. For example, emotional companionship and making friends are still mainly targeted at a relatively small group of people, and the degree of homogeneity is still high.

Tan Dai:I think it depends on timing. For example, the Killer App of the PC Internet era was the search engine, but search engines only appeared later. Before that, there were only portals and e-commerce websites, and search engines emerged once such websites had multiplied.

Even when mobile Internet was just emerging, apps like TikTok and Meituan didn't appear until several years later.

The Emergence of Intelligence:You come from cloud computing. How do you view the popularity of this round of AI technology compared to cloud computing? Will it follow the same path as cloud computing?

Tan Dai:Cloud computing is a complicated thing to understand, even if you are a technical person. It is not something that can be easily set up by one person. I worked at Alibaba from 2010 to 2011. At that time, AWS was probably the only company in the world that had figured out what cloud was. Google had not figured it out either.

But AI is different. As an individual, you can easily experience its functions and quickly know whether it is good or bad. If you want to know what AI is, just download Doubao. From this perspective, it is not like cloud computing, which needed a great deal of complicated evangelizing.

The Emergence of Intelligence:Will such a popularization path affect the decision-making and purchasing logic of To B business?

Tan Dai:Every business has inertia, and To B business is slower to change. But at least once people can tell for themselves whether a product is good, the customer experience changes.

One big change brought by AI is that it makes To B products visible and tangible, and the time needed for a POC (proof of concept) becomes much shorter. In the past, decision-making and use were separated: decision-makers made the choice, and then users found the product poor.

But AI has closed these gaps. In the past, customers had to sit through slide decks, site visits and interviews, and in the end decision-makers could only watch demos, so the experience was incomplete. Now when I talk to customers, they tell us about the Doubao they use every day. If they want something adjusted in a demo, it can be changed in the back end and takes effect immediately. Pre-sales staff can even do it themselves, without going back to R&D for changes.

This is also why our To C and To B are both the same brand, Doubao.

The Emergence of Intelligence:As far as large domestic models are concerned, do you think there is a lot of differentiation now?

Tan Dai:There is a huge difference in price.

The Emergence of Intelligence:Didn’t everyone drop a lot?

Tan Dai:Domestic vendors have not really followed suit. Our strongest model has been cut to 0.0008 yuan/1,000 tokens, but most vendors have not cut the price of their strongest model. Instead they have lowered the price of their second-strongest model, or made a small model free. If you look at the main models of some of our competitors, their prices are at least dozens of times Doubao's.

Even measured against open source models, the cost is higher than using Doubao. If you use the open source Llama, you have to do the engineering optimization yourself, and you have no scale advantage. For the same results, doing the engineering optimization yourself is several times more expensive than using the model directly on the cloud.

The Emergence of Intelligence:Besides price, what are the other differences?

Tan Dai:The model is still evolving rapidly, and many capabilities have not yet been developed. There will still be many differences in the future, and it will get better and better. But looking at the world, there may be only three or four companies that do well, at least not as many as the ten in China.

The Emergence of Intelligence:What industries are Volcano's customers mainly concentrated in now?

Tan Dai:We have experience in all industries. Among the more successful cases are mobile phone makers, including Samsung, Xiaomi, OPPO, vivo and Honor. We have also done a lot with automakers, all of them large companies, and we have some cases in finance and banking.

But the approach angles are different. For example, for mobile phones and cars, we are working on relatively complete human-computer interaction scenarios. But many large state-owned enterprises or banks may try a small point first.

So the advantage of AI is that you can start little by little, without having to touch the core system right away.

The Emergence of Intelligence:Are the customer profiles of various large model platforms very different? Many of the large model developers are cloud vendors, and the customer groups of cloud vendors also vary by industry.

Tan Dai:There are differences when the large model business is small, but there is no difference when the scale is large. There are definitely more small and medium-sized customers now, but if we look at the industry distribution, there may not necessarily be a particularly big difference.

The Emergence of Intelligence:Between the C-end and the B-end, where do you think the big model will take off first?

Tan Dai:The two are about the same in volume for us now, but the Matthew effect on the C-end is very strong, and the top few customers may account for a large share of DAU. That has always been the logic of the C-end, even before AI, whereas B-end development is a very long-term process.

The Emergence of Intelligence:What big models can do now is relatively simple. For example, everyone feels that the definition of Agent is not even aligned. Why can the execution of big models only stay in such simple scenarios?

Tan Dai:The model capabilities are not strong enough.

The Emergence of Intelligence:What are the key areas to break through?

Tan Dai:A lot of them. The model's capabilities and intelligence both need to be stronger. The college entrance examination took place a while ago, and Doubao's score "passed" the first-tier cutoff for liberal arts, but it would not have gotten into Tsinghua University or Peking University, and it did not even reach the first-tier cutoff for science. That makes it very obvious that its level is not yet good enough. But that is no problem; we have enough confidence.

Second, there are still many complex problems that have not been solved, such as long-term memory. This definitely requires some innovations in model structure, and multimodality needs to be improved. Costs must also be controlled. After adding these capabilities, the cost cannot increase too much.

I think the big model is still in its early stages. The mobile communications revolution has been going on for 30 or 40 years, starting in the 1970s and 1980s. The development history of AI is even longer than that, and we have only been doing it for two years. When we were still using mobile phones in the 1990s, could we have imagined the iPhone today? These are all changes over decades.

The Emergence of Intelligence:Does the current business volume of Volcano Model meet your expectations in terms of both customer volume and revenue?

Tan Dai:I think it is OK and has met our expectations. We hope to see a prosperous ecosystem, and now we have achieved the effect we want. And not only are we growing, we also see that our competitors are growing.

The Emergence of Intelligence:Do you have an estimated amount in mind?

Tan Dai:For example, there is a goal for the total number of tokens, both in terms of total volume and stratification, such as the number of users with more than a certain number of tokens to reach a certain level. We hope that customers will form a spindle-shaped or funnel-shaped distribution. If the total number of tokens is high, but there are only ten customers with more than 100 million tokens, that is not healthy either.

The Emergence of Intelligence:What is the shape now?

Tan Dai:It has not yet reached the shape of an inverted pyramid, and the waist area can be a little thicker.

The Emergence of Intelligence:Obviously, after the price of large models has dropped, AI has become more affordable and can do more things. Do you think there is a clear trend of "one-person companies" among early adopters?

Tan Dai:Nowadays, there are fewer "one-person companies" and more "ten-person companies". I have seen overseas companies where two or three people can do a lot of things.

We used to joke that all a startup needs is a programmer, but now even that is no longer necessary. Some users we interviewed said, “I can’t code, so I used to be unable to build even a demo for a proof of concept (POC), but now I can. This is a breakthrough from 0 to 1. In the future, if the large model gets better, maybe everything from 0 to 100 can be solved with AI.”

There was a discussion at OpenAI about when a unicorn company (valued at more than 1 billion US dollars) with only one person would appear, whether it would be five years or longer.

In this case, we should not only look at Killer App, because many ideas of startups are to solve very vertical problems.

Those who claim that “AI can help you grow your business” are liars

The Emergence of Intelligence:Are you focusing more on technology or customers now?

Tan Dai:About equally. In the short term the two are inseparable. AI is not as mature as cloud computing, and there needs to be a feedback loop among model improvement, product improvement and customer use. You can’t just sit in the office and watch.

The Emergence of Intelligence:What aspects do customers’ current needs or doubts focus on most?

Tan Dai:There are many unexpected ways.

For example, in education, some people used to think it was enough for the model to solve the problem for the student. But now in many scenarios, customers need it to play the role of a teacher and explain how to do the problem, not just give the answer. This requires more than just model capabilities.

The Emergence of Intelligence:Do you think this is a product-level issue?

Tan Dai:You may think it is a product or demand issue, but there are also technical issues behind it.

The Emergence of Intelligence:On the customer side, are there many "must have" scenarios now? For example, do companies today see AI as something dispensable, or do they consider it mainly for cost reduction and efficiency improvement?

Tan Dai:Everyone has reached a consensus that AI is the Next Big Thing, so they will definitely not miss it. Now we no longer need to educate companies to use AI, but you need to discuss with them which scenarios are suitable for using AI. Sometimes people underestimate the capabilities of AI, and sometimes they overestimate it.

The Emergence of Intelligence:Therefore, compared with the previous digital era, AI is equivalent to bringing a new upgrade to the concept of digitalization.

Tan Dai:Maybe. In the past, a very important step in digitization was to turn unstructured data into structured data and make people understand structured data. Now AI can solve all these problems, and the threshold for digitization has been lowered.

The Emergence of Intelligence:Do companies view AI and digitalization from the same perspective? In the past, when companies did digitalization, the bosses thought that it might improve efficiency a little, but many still regarded it as a cost. When companies asked about growth through digitalization, its contribution was actually very weak. Can AI change this situation?

Tan Dai:I think it is a decision of different dimensions. Technology is just about making the business better. On the one hand, the boss needs to know how to look at my business model, and on the other hand, how to improve my business efficiency through digitalization.

If a retailer asks me how to achieve business growth, the first thing I would definitely not tell him is to use AI. I would definitely say: start with Douyin e-commerce. Then we can tell you where we can help you improve efficiency through AI. If someone says "you can grow by using AI" right away, I think he is a liar.

The Emergence of Intelligence:I have a friend who used to work hard selling cloud services at Volcano Engine, but he left around 2022. After seeing large models, he felt they gave Volcano Engine a great opportunity. What do you think?

Tan Dai:We launched the cloud business at the end of 2021. Of course the first year was not easy. It was the hardest, and he only needed to hold on for one more year. In fact, our growth over the past few years has been quite fast, the fastest in the industry.

I think heroes are made by the times. If there are no new customers and customers have no new scenarios, what is the point of being the best? The previous golden age of cloud computing was due to the rise of mobile Internet. After that, the digitalization of various industries gradually matured, and cloud computing did not grow so fast. This is the natural law of industry development.

But the next era is AI. I think every ten years or more, there will be a new point, and we still need to seize this new point to maintain our technological leadership.

The Emergence of Intelligence:Do you think the developers you come into contact with now are anxious?

Tan Dai:I personally feel that no one is particularly anxious. Why? You see, the current model is pretty good, and the price is so low, so just try more. There is nothing to be anxious about. Maybe the investors are more anxious. (Laughs)

The Emergence of Intelligence:Some developers are anxious for this reason: the model is cheap now, and early on the product does well and can even make money, but once they start to scale up and pay for traffic, the ROI turns negative and few users are actually retained.

Tan Dai:I think that is a traffic problem, not an AI problem. Even without AI, if you switched to some other product, the spending on traffic still would not pay off. Take short dramas, which are very popular: some make money and some do not, which is also normal.

The Emergence of Intelligence:What is the most interesting AI application scenario you have seen recently?

Tan Dai:There are many. For example, some children use Doubao to learn English. Another corporate client of ours wanted to use the large model, so it held an AI hackathon inside the company. Employees developed more than 100 products based on the business needs they encountered, and maybe two or three of them were successful.

We are now working with some car manufacturers, and they even invite their own users to participate in the entire design. When users participate in the design, they may be able to better understand their own pain points.

The Emergence of Intelligence:How much has the ratio of training and inference computing power changed for the companies you serve?

Tan Dai:Training still accounts for more, but inference is growing very fast: dozens of times higher than last year, while training has less than doubled. We originally predicted that inference would exceed training in 2025, and that now looks likely to hold.

The Emergence of Intelligence:When do you think AI applications will have a big explosion?

Tan Dai:This year is not a big explosion, but it is definitely a small one, and I think there will be a big explosion next year. Even from a global perspective, we are still in the early stages of AI applications. When prices become more affordable and model quality is good enough, AI applications will flourish, the chemistry will keep building, and the ecosystem can take shape.

The Emergence of Intelligence:Since 2023, you have been emphasizing that you will not make basic large models. Will you do it in the future?

Tan Dai:The Doubao large model is built by a dedicated team at ByteDance. Volcano Engine does not need to build its own model; it only needs to do cloud and MaaS.

The people who work on cloud computing and big models are definitely completely different. Global giants do it separately. Amazon's model is Claude, and the cloud is done by AWS; Google Cloud and Gemini are not done by the same group of people.

We at Volcano Engine are focused on doing the cloud business well, and MaaS is a very important part of cloud. In doing MaaS, we will provide the best model to our customers, and this has never changed.
