
Users Tired of Waiting for Sora Flock to Kuaishou's Keling

2024-08-01



Why is Kuaishou the first company to launch AI videos?

Text | Chen Meixi

Editor | Director

In the early days, when OpenAI set off the large-model storm that swept the world, Kuaishou was not a standout player. At that time, Baidu launched Wenxin Yiyan, Alibaba launched Tongyi Qianwen, and Tencent launched its Hunyuan large model. Each raced to catch up with the others, and both R&D and public release moved very fast.

Kuaishou was not on the initial list of contenders, and even now, few people can name Kuaishou’s large language model: Kuaiyi.

The situation changed on June 6 this year, when Kuaishou's video generation model Keling opened for public beta testing. That day was also Kuaishou's 13th anniversary. The media often compares Keling with OpenAI's Sora, but since its explosive debut in February, Sora has still not been opened to the public. As a result, curious Chinese users began to flock to Keling.

Ten days later, Keling "had received tens of thousands of applications," according to Wan Pengfei, who attended the Zhiyuan Conference that day. At the time of the speech, his title was head of Kuaishou's Visual Generation and Interaction Center, and he was also the person actually in charge of Keling. By July 19, the number of applicants had exceeded one million. It took Keling little more than a month to go from tens of thousands of applications to over a million. Among the large-model developments at major companies in the first half of the year, this was the most eye-catching.


A picture generated by Keling, with the prompt: bees in the flowers

A comeback story is always popular with the public. But a careful review of Keling and the Kuaishou large-model business behind it reveals that this is neither a crowd-pleasing tale of lying low and then suddenly bursting onto the scene, nor a heroic narrative of a boy genius leading a small team to strike back.

A video generated based on the image generated by Keling. The prompt is: bees flying around.

Why was Kuaishou the first company to launch AI video? Business needs brought strong motivation and high priority; the data and technology accumulated through its product form brought iteration speed; and the fit with its business scenarios brought real customers.

The emergence of Keling is not an isolated incident. In China, Kuaishou's biggest rival in AI video is likely to be ByteDance.


Where there is demand, there is motivation

Creators are one of the most important links in Kuaishou's content ecosystem. According to official data released by Kuaishou, in 2023, there were 138 million creators who released short videos on Kuaishou for the first time, and the videos released throughout the year received more than 1 trillion likes on the platform.

Having a large number of content creators means that Kuaishou needs to meet their needs for content tools; otherwise, creators chasing new productivity gains will soon turn to other tools or even other platforms. This is also the value of Kuaiying to Kuaishou, and of Jianying to ByteDance.

This is why Kuaishou, though not aggressive in its earlier investment in large language models, is striving to be at the forefront of the race for multimodal large models.

Before Keling, Kuaishou's self-developed large-model capabilities for text and images had already been integrated into Kuaishou. Internal test results showed that users generated an average of more than 500 million AI images per month in the comment section.


A picture generated by Keling, with the prompt: Aliens standing by the Yangtze River

The strong motivation that comes from demand was certainly a necessary condition for Keling to emerge, but Kuaishou is not the only company that meets it. If another important factor shaped Keling's emergence, it may be determination.

The determination first came from Kuaishou’s top management.

Previously, Kuaishou was always a little slow in its large-model development. When it comes to domestic large language models, people first think of Wenxin Yiyan and Tongyi Qianwen; when it comes to large language model applications, Kimi and Doubao are the most popular in the market. In text-to-image generation, the first to break out in the domestic market was SenseTime's RiRiXin 5.0. On the morning of the second day after its release, SenseTime's stock price rose by more than 30% and trading was temporarily halted.

In all of these earlier stories, Kuaishou seemed to sit on the margins. The video large model and its applications are a critical opportunity that it cannot afford to miss in this race, and also the most important part of its overall large-model strategy.

In a previous report by the tech media outlet Silicon Star, a technician from the Keling team said: "Gai Kun often said that all of the company's GPU cards are available to you and the company fully supports you." Gai Kun, also known as Yu Yue, is a senior vice president of Kuaishou who is in charge of the main site business as well as the social science line. After former CTO Chen Dingjia stepped down, Gai Kun became one of the top leaders of Kuaishou's technology line.

Wan Pengfei and his team may have had even greater determination. One telling detail is that after deciding to take on the Keling project, Wan Pengfei handed over his original work, which was handling business requests from various teams, to other leads at the same level within Zhang Di's team. Likewise, other members of the Keling team handed over their original responsibilities and devoted themselves to the research and development of Keling. Working overtime on weekends to keep up with the schedule became routine.

A video further generated from the image generated by Keling. The prompt is: Aliens walking in the water, two aliens high-fiving

"In fact, more than a month before the official release, Keling's test results were not very good," a large-model industry practitioner told Hedgehog Commune. "Lao Wan and his team made a last-ditch push. In the end, it really was a case of 'brute force works miracles'. Many people did not expect the final result to be this good."

So determination became the last variable.


Accumulation leads to speed

Kuaishou, like any leading short video platform, has built up experience relevant to AI video research and development in two areas: content and technology.

Videos are produced by users; the platform labels and understands them, and then selects the content that can be distributed. This is the built-in route of Kuaishou's business logic. The processed content becomes a kind of data. From descriptions of the content itself to its popularity after distribution, Kuaishou holds a huge amount of content data. In simple terms, it not only has the content, but also knows which pieces are the "good content" that users like to watch.

For training large generative video models, this is like having the ingredients prepped in advance.

Even the "chef" was already in place.

Most of the core R&D members of the Keling project team are Wan Pengfei’s old subordinates from the Y-tech period. At the beginning of the year, the team members gradually came into contact with relevant information and materials and began to work on Keling’s R&D.

Previously, Wan Pengfei's team was mainly responsible for taking on the UGC intelligent creation needs under the Kuaishou creative ecosystem, and worked closely with the main site production, Kuaishou, Yitian Camera and other business parties. The product forms included portrait beautification, audio and video special effects, live broadcast virtual images, etc.

In 2021, Wan Pengfei gave a public speech at the Global Artificial Intelligence Technology Conference as the "Head of AI Technology Platform of Y-tech Department". Among the cases he shared at that time was "live photo effects", a way of creating videos from images in the era before large models. That year, the template libraries of Kuaishou and Kuaiying launched the "animated old photo" special effect. After a user uploads a photo, the person in the photo can smile, blink, nod, and perform other actions, producing a video effect. It is reported that this effect has been used more than 3.44 million times on Kuaiying.


Kuaishou user @森渚和鹿 posted an animated old-photo video in 2021

In 2021, Wan Pengfei was already very confident about the development of generative technology, predicting that "generative models will become more and more powerful, the generated content will be more realistic, and the generation process will be more stable and controllable."

Three years later, Keling went viral with "reviving old photos" once again. Some users who were granted access used the "photo to video" function to turn photos of deceased relatives into videos. Compared with the "live photo effects" of three years earlier, users can now use open-ended prompts to make the people in the photos perform more complex actions, which is the change brought about by the new large-model technology.

Sora was released in February this year, and the Kuaishou Keling team began to form at around the same time, but the research, development and application of multimodal-driven video generation technology had always been within the scope of work of Wan Pengfei and his team.

The above-mentioned practitioner expressed a similar view to Hedgehog Commune: "In fact, the emergence of Sora is equivalent to allowing everyone to determine the technical route or plan at that point, but many applications of visual technology, including multimodal things, are what they have been doing all along."

This is where the technical accumulation that Keling needed came from. The chef already had the prepped ingredients, and once the whole world had seen the new recipe, Kuaishou became one of the teams that served the finished dish the fastest.

At the same time, the above-mentioned practitioner is waiting to see whether Keling can maintain its lead in technology and products over the long run. "The algorithms used by everyone are similar now, and each company may make some minor adjustments, but they are basically the same." In his opinion, given GPU cards, data, and similar algorithms, it is only a matter of time before each company produces products with similar results.

Speed, therefore, is only a temporary advantage for Kuaishou. It needs to turn that advantage into a stable customer base while the window of leadership lasts.


There are customers only when there are scenarios

On December 29, 2020, in an internal letter to all employees, Kuaishou officially released the "Kuaishou School" as its statement of corporate values for the first time, and stated that "customer obsession" is the core of those values.

Previously, whether it was Kuaishou or other Internet giants, the more commonly mentioned concept was "user". For this reason, Kuaishou specifically explained the reason for this vocabulary change. "The complexity of the company's business has increased, expanding from 'users' to 'customers'. Customers include producers and consumers, B-end customers and C-end users, external customers and internal customers. We need to strengthen our understanding and cognition of producers and B-end customers, and we also need to emphasize our service awareness for internal customers."

Looking back from 2024, that was indeed an important watershed in the change of Kuaishou's business structure. In 2020, the proportion of live broadcast revenue in Kuaishou's annual revenue dropped from 80.4% in the previous year to 56.5%, the proportion of online marketing service revenue increased from 19% to 37.2%, and the proportion of other service revenue, including e-commerce, quickly climbed from 1% in the previous year to 6.3%.

As Kuaishou itself has explained, producers and consumers, B-end customers and C-end users, external customers and internal customers are all important customers of Kuaishou and will also become the target customers of Kuaishou's big model.

Producers and consumers together form Kuaishou's most basic business chain: producers produce content and consumers consume it. The value of large-model products lies in lowering the production threshold while improving content quality.

The former is easy to understand. With what Keling can already do, entering text or pictures is enough to get a video. For people who lack the means to shoot and the skills to produce, the production threshold is greatly lowered.

The latter sounds counterintuitive at first. Given the current realism and output length of AI video products, how can the quality be better than real footage and professional production? However, on leading platforms such as Kuaishou and Douyin, most of the tens of millions of short videos produced every day come from ordinary users. AI technology that is "not sophisticated enough" in the eyes of professional content producers is enough to add material and richness to the casual videos of ordinary users.

B-end merchants may also become customers of Kuaishou's large-model capabilities. According to data released by Kuaishou at the World Artificial Intelligence Conference this year, its AI advertising revenue has exceeded 20 million per day. AI-generated advertising materials can reduce the cost of a single piece to 0.47 yuan while keeping the CTR at the baseline.


An image generated by Keling, with the prompt: Apple juice advertising material. There is a glass of apple juice on the white table with two red apples on the side.

For many large-model products, finding real-world use cases is a difficult problem throughout the product life cycle, but Kuaishou clearly has no shortage of them. As Zhang Di, vice president of Kuaishou and head of large models, said, Keling's popularity stems from "uncovering real value in real scenarios to meet users' real needs."

The challenge Kuaishou faces is how to turn users into high-frequency users of its large models who are willing to keep paying for them in the current scenarios, so that the large models are first commercialized within its own ecosystem.

On July 25, Keling fully opened registration. On the same day, Keling launched its membership system and entered the charging stage.

According to the information on Keling's official website, non-member users receive 66 inspiration points each day they log in. At the current list prices, that is enough to generate approximately 6 videos or 330 pictures for free.

There are two payment modes. One is the membership mode, in which users buy a membership tier on a monthly, quarterly, half-yearly or annual basis; the higher the tier, the more works can be generated. The other is the top-up mode, in which users pay directly to add credit. In other words, generating a video costs 1 yuan and generating a picture costs 2 cents.


A picture generated by Keling, with the prompt: sunrise, beautiful clouds and morning glow in the sky, and the sun is hidden in the clouds

There are two points worth noting about Keling’s pricing system.

First, Kuaishou does not offer a membership option with unlimited generation, which means that whether users choose the top-up mode or the membership mode, they are in effect "paying per generation." The only differences are the unit price of each generation and differentiated features such as watermark removal, video extension, and master-level camera movement.

Generating AI videos is expensive, and Kuaishou does not offer memberships with unlimited generation. This is clearly an attempt to keep costs from spiralling out of control, and to curb gray-market activity to a certain extent.

Second, the inspiration points that serve as Keling's "payment currency" are priced the same as the Kuaishou coins used to tip streamers within the Kuaishou ecosystem. 1 RMB buys 10 Kuaishou coins or 10 inspiration points. This pricing may be intended to leave open the possibility of connecting Keling to the payment system of the Kuaishou ecosystem in the future.
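The free quota quoted above can be checked against the stated prices with a little arithmetic. The sketch below is only an illustration built on the figures in this article (1 RMB for 10 inspiration points, about 1 yuan per video, about 2 cents per picture, 66 free points per day). It assumes that "2 cents" means 0.02 yuan and that partial generations are not allowed; the function names are hypothetical and are not part of any Keling interface.

```python
from fractions import Fraction

# Figures taken from the article; everything else here is illustrative.
POINTS_PER_YUAN = 10                  # 1 RMB buys 10 inspiration points
DAILY_FREE_POINTS = 66                # daily login grant for non-members
VIDEO_PRICE_YUAN = Fraction(1)        # about 1 yuan per generated video
IMAGE_PRICE_YUAN = Fraction(2, 100)   # assumption: "2 cents" = 0.02 yuan


def points_cost(price_yuan):
    """Convert a price in yuan into inspiration points (hypothetical helper)."""
    return price_yuan * POINTS_PER_YUAN


def free_generations(daily_points, price_yuan):
    """Whole generations affordable with the daily grant, rounding down."""
    return int(daily_points // points_cost(price_yuan))


videos = free_generations(DAILY_FREE_POINTS, VIDEO_PRICE_YUAN)
images = free_generations(DAILY_FREE_POINTS, IMAGE_PRICE_YUAN)
print(videos, images)
```

Under these assumptions a video costs 10 points and a picture 0.2 points, so the daily grant covers 6 whole videos (with 6 points left over) or 330 pictures, which matches the figures given on the official website. Exact fractions are used instead of floats so that the division by 0.2 does not pick up rounding error.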


Kuaishou is not the only company that has the needs, scenarios, customers, equipment, data, technology, and talent reserves.

The above-mentioned industry insider predicts that in the near future, ByteDance will be able to produce generative video products of the same level. Before that happens, Kuaishou needs to build up users and content within the window period so that AI content can operate effectively within the Kuaishou ecosystem, and ideally prove out its commercialization path and hold its lead for longer.

Conventional growth and operation methods have been put on the agenda. Keling’s official website quickly launched a 50% discount event for all members, and users can get 66 inspiration points for logging in every day to reduce the impact of the payment model on user growth and retention, so that all users can at least try it out without any barriers.

In addition, in its operations on Kuaishou, Keling has not overemphasized concepts such as generative video, diffusion model solutions, and distributed training clusters. Instead, it has used features such as "turning old photos into videos", "traveling through time and space to hug you", and "reviving photos from 40 years ago" to get users started first, lowering the cost of understanding the product.

For ordinary users, the new features are no harder to use, and reached by the same path, as the earlier Kuaiying special effects. They understand Keling as a more powerful special effect. Whether or not they have heard the term "large model" does not prevent them from becoming actual users of large-model products.

If Kuaishou and ByteDance keep their own users, this is precisely the ultimate advantage they have in making AI videos; if their users are lured away by disruptive new products, it is the ultimate crisis they face in the AI era.

Rather than contenders for the gateway to the AI video era, they are better described as defenders. New productivity will create new content forms and eventually create new platforms. Cheng Yixiao and Zhang Yiming are both all too familiar with this story.

They have to be in the front rank.

(Cover image generated by Keling.)

