
"Price Butcher" DeepSeek has once again started a price war for large models. Will anyone follow suit this time?

2024-08-05


Jiemian News reporter | Chen Zhenfang

Jiemian News editor | Song Jianan

The large-model "price butcher" DeepSeek has once again launched a big price cut.

Recently, the company announced that its API input fee has been cut to 0.1 yuan per million tokens, with the output fee at 2 yuan per million tokens, reducing the price of its large-model API by another order of magnitude.

As for the reason for the cut, DeepSeek explained that in large-model API usage scenarios, a considerable proportion of user input is repetitive: prompts often contain repeated references, and in multi-round conversations each round must resend the content of all previous rounds.

To address this, DeepSeek uses context hard-disk caching: content expected to be reused is cached on a distributed hard-disk array, and when input repeats, the repeated portion is read from the cache rather than recomputed. This is what enables the latest price cut.
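The billing effect described above can be illustrated with a toy sketch. This is not DeepSeek's implementation: it simply assumes each request resends the full conversation history as tokens, treats the previous request as the cached prefix, and uses hypothetical per-token prices (`PRICE_CACHE_HIT`, `PRICE_CACHE_MISS`) for the cached and uncached parts of the input.

```python
# Hypothetical prices, in yuan per million input tokens (for illustration only).
PRICE_CACHE_HIT = 0.1   # repeated prefix, served from the hard-disk cache
PRICE_CACHE_MISS = 1.0  # new content that must actually be computed

def common_prefix_len(a, b):
    """Length of the shared token prefix between two token sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def input_cost(requests):
    """Total input cost for a sequence of token-list requests.

    Each request's overlap with the previous request is billed at the
    cache-hit rate; only the remainder pays the full compute price.
    """
    cost = 0.0
    prev = []
    for toks in requests:
        hit = common_prefix_len(prev, toks)
        miss = len(toks) - hit
        cost += hit * PRICE_CACHE_HIT / 1e6 + miss * PRICE_CACHE_MISS / 1e6
        prev = toks
    return cost
```

In a multi-round conversation, each request is a strict extension of the last, so almost all input tokens after round one hit the cache — which is why repetitive workloads see the largest savings.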

DeepSeek pointed out that context hard-disk caching not only reduces service latency but also significantly lowers the final cost of use.

DeepSeek is also the world's first large-model vendor to use hard-disk caching widely in API services. This is enabled mainly by the MLA (Multi-head Latent Attention) structure proposed in DeepSeek-V2: while improving model quality, it greatly compresses the size of the context KV cache, sharply reducing the transmission bandwidth and storage capacity required, so that the cache can be kept on low-cost hard disks.

In addition, the DeepSeek API service is designed for a capacity of 1 trillion tokens per day, with no rate or concurrency limits for users.

This is not the first time the company has cut prices. Since May this year, the disruptor DeepSeek has taken the lead in launching an API price war.

As early as April 25, DeepSeek priced its API at 1 yuan per million input tokens and 2 yuan per million output tokens. On May 6, DeepSeek released its open-source MoE model, delivering stronger capabilities with fewer parameters, with the API priced at 1 yuan per million input tokens and 2 yuan per million output tokens, roughly one percent of the price of GPT-4 Turbo.

This price cut quickly triggered an industry-wide response, with Zhipu AI, Volcano Engine, Baidu, Tencent, Alibaba Cloud and others announcing cuts of their own.

Among them, Alibaba Cloud cut the price of Tongyi Qianwen's core model Qwen-Long by 97%, to just 0.0005 yuan per thousand tokens, while Baidu and Tencent successively announced that some of their large models would be free.

Overseas, after releasing GPT-4o, OpenAI announced that the model would be free to use and that API call prices would be halved.

It is worth noting that at a Volcano Engine event on May 15, its president Tan Dai announced that the Doubao general-purpose model pro-32k would be priced at just 0.0008 yuan per thousand tokens. Comparable models on the market generally cost 0.12 yuan per thousand tokens, 150 times Doubao's price; Doubao's pricing is thus 99.3% below the industry level, pushing large-model prices into the "li era" (priced in thousandths of a yuan).
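The figures above are internally consistent; the short check below reproduces them from the two prices quoted in the article:

```python
# Prices quoted in the article, in yuan per thousand tokens.
doubao_price = 0.0008  # Doubao general model pro-32k
market_price = 0.12    # typical price of comparable models

ratio = market_price / doubao_price           # how many times Doubao's price the market is
discount = 1 - doubao_price / market_price    # fraction cheaper than the market

print(f"market is {ratio:.0f}x Doubao's price")  # 150x
print(f"Doubao is {discount:.1%} cheaper")       # 99.3%
```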

Tan Dai pointed out that reducing costs is a key factor in driving large models quickly into the "value creation stage," and that price competition among large models will help companies accelerate business innovation at lower cost.

At that time, an insider of Volcano Engine told Jiemian News: "The real reason for the price reduction of Doubao large models is that the application of large models on the enterprise side has not yet developed and there are too few scenarios." He pointed out that although the industry is discussing the use of AI large models to reconstruct the business, the implementation of large model capabilities is rarely felt in daily work and life. "The price reduction is essentially to lower the threshold for use."

In terms of the size of the cuts, input prices have generally been reduced more than output prices. Most of the discounted products are lightweight model versions, suited mainly to small and medium-sized enterprises and individual developers with low-frequency, low-volume inference and simple tasks.

Overall, large models are still in the market-cultivation stage. For now, API price cuts are primarily a customer-acquisition strategy for large-model vendors: they bring more companies into their business scenarios, promote adoption across industries, and accelerate commercialization. The move helps attract developers and partners, quickly builds an ecosystem, and opens broader space for innovative applications across fields.

Lowering prices or making them free is to enable more companies and developers to quickly use large models. After all, getting more people involved is a prerequisite for the development of the industry.

However, it is clearly difficult to close the loop on large-model commercialization through API business alone. "No big-model company survives by selling APIs," a financial advisor (FA) focused on the large-model industry told Jiemian News.

Fu Sheng, chairman and CEO of Cheetah Mobile, likewise believes the deep price cuts essentially mean large-model startups must find a new business model. The companies cutting most aggressively are large cloud providers that use big models to acquire cloud customers. "The wool comes from the pig's back" (a Chinese idiom meaning the cost is recouped elsewhere), "so they can afford the price cuts." Large-model startups lack such an ecosystem and must find another business model.

Unlike in the first round of cuts, a number of large-model companies have not yet followed DeepSeek's latest move and have said little about it. Still, the renewed cut shows that an era of broadly accessible large models is approaching, and the vertical application ecosystem is expected to prosper further.