
Kuaishou's "Kling" unexpectedly takes off, ByteDance scrambles to catch up, and competition in the AI text-to-video track intensifies

2024-07-31




In February 2024, the sudden appearance of Sora ruined the Spring Festival holiday for many large-model companies.

"The Spring Festival Gala was still being replayed, and we were already holding urgent discussions in a group chat," an employee of a leading AI company told "Shijie". Seeing how smooth Sora's output looked, even bloggers who sell courses flocked to it, rushing to post tutorials online to cash in.

As for who could "copy" a domestic version of Sora the fastest, most people had their eyes on Alibaba, Baidu, and the "Big Five". No one expected the winner to be Kuaishou, whose technical profile had been relatively laid-back.

On June 7, Kuaishou suddenly launched the text-to-video model "Kling", which supports generating videos up to 2 minutes long. In addition, unlike Sora, which has been announced but not yet released, Kling was opened for testing as soon as it was announced, and its generation quality is remarkable.

"Kling is the most discussed subject in the circle recently," a venture capital industry insider told "Shijie". According to official data, Kling received over 500,000 applications within one month of its launch, was opened to over 300,000 users, and generated over 7 million short videos.

Kling's unexpected popularity inevitably left ByteDance somewhat embarrassed. In May this year, ByteDance also opened testing of its own text-to-video model, "Jimeng", but it has yet to make much of an impression in the currently popular text-to-video track.

Suddenly left behind by a rival it once "ignored", ByteDance needs to focus on catching up. According to Titanium Media, ByteDance has recently designated large AI models as the group's "highest level P0". Douyin, Jianying and other teams are also working hard to develop AI video model applications, which are expected to be announced in the near future.

1. Do it smartly, quickly, roughly and fiercely

Many developers told "Shijie" that the launch and outstanding performance of Kling came as a surprise to the industry.

Recently, Shijie ran a test built around the keyword "black cat", entering the same prompt into Kling, Jimeng, and Zhipu AI's newly launched "Qingying": "The city streets are empty on a rainy day, and a cute black cat runs by. Its eyes are green, and it has a yellow collar and bell on its neck, and its body is covered with long black hair. The video was shot from the perspective of a camera, and the puddles on the ground reflect the silhouette of the black cat."

Among the three generated videos, the Kling version failed to show the cat running fast, but the video generally obeyed physical laws.

In contrast, the Jimeng version had no water on the ground, and the black cat did not move forward. The Qingying version did show water, but the black cat walked with a strange gait and its tail dropped frames.

▲ (The videos were generated by Kling, Jimeng and Qingying respectively)

According to Silicon Star, Kuaishou took three months to build Kling. The team is very small, just over 20 people, and is led by Wan Pengfei, the current head of Kuaishou's Visual Generation and Interaction Center. His research focuses mainly on image/video signal processing, computational photography, computer vision, reducing loss functions, and visual generation.

Kling grew out of "Puji", an inconspicuous project that Kuaishou restarted in October 2023. Puji is a software tool that uses AI to generate 2-second GIF stickers from static images. In early March of this year, Kuaishou held a small internal meeting at which Wan Pengfei's idea was endorsed by Kuaishou's senior vice president Gai Kun (Yu Yue), who quickly decided to use Puji as a pre-research product.

As quoted by Silicon Star: "When we were working on Kling, there was a consensus on execution, which was to be fast, rough, and fierce."

Less than a month after the Kling project started, it won the support of Kuaishou founder Cheng Yixiao, who treated it as a strategic project for the company. Gai Kun also often said that all of the company's GPU cards were available to the team and that the company fully supported it.

Ke Ran, an entrepreneur in the digital human sector, told Shijie: "Kling's success is largely due to the video data accumulated by Kuaishou. Looking at the domestic market, the only company that can compete with it in this regard is Douyin."

While Kling is enjoying its glory, ByteDance seems a little lonely.

"Jimeng" was officially launched on May 9, and on June 17 it appeared as the chief AI technical supporter of the AIGC short series "Sanxingdui: Future Revelation". Even so, Jimeng has made little noise, whether measured by its consumer-side performance or compared with the AIGC short series "Shanhai Qijing" that Kuaishou launched on July 13.

On July 17, word spread in the market that ByteDance would announce progress on its Sora-like text-to-video technology. Outside observers interpreted this as ByteDance's attempt to catch up and face off against Kling head-on.

However, ByteDance later told Shijie that the news was not accurate. Shijie observed that the July 17 event was more of a technology sharing meeting. It was mainly hosted by Feng Jiashi, head of the Doubao large model visual basic research team, and featured ByteDance research scientists and academic researchers giving technical talks entirely in English.

It seems that ByteDance’s “big move” may have to wait for some time.

2. ByteDance has not recovered yet

So why did ByteDance miss the feast in the recently booming text-to-video track? What has ByteDance been busy with recently?

To some extent, the reason may be that, compared with Kuaishou's concentrated bet on Kling, where raw strength can "defeat ten" opponents, ByteDance's large-model layout is more complicated. In the first half of this year, ByteDance's more important opponents were Tencent and Alibaba.

On large models, ByteDance's pace can be described as aggressive. After all, more than two months ago, it was ByteDance that fired the first shot in the industry's large-model price war.

On May 15, at the Volcano Engine "FORCE" conference, ByteDance launched an API service based on its self-developed Doubao model. At the same time, Tan Dai, president of Volcano Engine, revealed the latest price of "Doubao": 0.0008 yuan per 1,000 tokens, which he described as a "floor price" 99.3% lower than the industry level.

ByteDance's "attack" caught rivals off guard. According to information obtained by "Shijie" from various sources, the leading players were not prepared for it; although all parties felt helpless, they could only follow passively.

In the following days, Alibaba Cloud, Baidu's Wenxin large model, and Tencent Cloud successively announced significant price cuts for their large-model inference input tokens and APIs. As a result, consumer-side access to the top large models is now almost entirely free, and the industry has begun competing fiercely at the next level, the ecosystem.

According to the founder of a legal AI application company, Volcano Engine sales staff began to actively contact customers and promote products almost immediately after the API service was opened. This indirectly confirms market speculation that ByteDance has made large models its top strategic priority.



▲ (Tan Dai at the 2024 "FORCE" conference. Image source: Volcano Engine)

Recently, ByteDance’s “signature product” Doubao has grown significantly.

According to QuestMobile data, as of June 2024, Doubao, Tiangong, Kimi Smart Assistant, and Maobox have achieved impressive growth among domestic AIGC apps, with Doubao ranking first in traffic.



▲ (Photo source/QuestMobile)

Compared with Kuaishou, what ByteDance cares more about now may be competition across the full ecosystem, from foundation models to the AI application layer. In addition, Volcano Engine, which only officially entered cloud computing in 2021, is the "youngest" of the giant cloud vendors, and for more than three years it has been regarded as a challenger in the cloud market. How ByteDance coordinates foundation models, the application layer, and the cloud market is a question that spans all three.

Recently, according to Photon Planet, a large number of users of ByteDance's "Kouzi" platform have been asking how to connect the agents and bots they have created to WeChat official accounts or mini programs, and the discussion is very active.

In December last year, ByteDance launched the AI application development platform "Coze" overseas. In February this year, the domestic version, "Kouzi", was launched. A large number of merchants in the Douyin ecosystem also hope to make quick money from it.

Tencent only launched its AI agent creation and distribution platform, "Tencent Yuanqi", in May this year, by which time visits to Kouzi had already reached 2.33 million. As of now, Tencent Yuanqi has not yet been connected to the WeChat ecosystem of mini programs, official accounts, and customer service subscription accounts.

After all, AI is still in its early stages. ByteDance, like Tencent, still needs to spend a lot of time educating users. For ByteDance, competing with Tencent for distribution in the AI era, and taking the lead there, may be the bigger task.

3. There is still time to strike back

From an industry perspective, ByteDance has no shortage of content traffic, e-commerce traffic, or financial ammunition in today's Internet. Even if it is "one step behind" in text-to-video in the short term, it still has the potential to catch up in the long run.

ByteDance is also good at using aggressive market strategies to make up lost ground and "create miracles through hard work".

Recently, ByteDance has also been working on integrating multiple large models, taking aim at Alibaba. At the DingTalk Ecosystem Conference held on June 26, DingTalk President Ye Jun announced that, in addition to Alibaba's own Tongyi, six third-party large models would be integrated into DingTalk: MiniMax, Moonshot AI, Zhipu AI, OrionStar, 01.AI and Baichuan AI, covering almost all well-known large-model startups in China. The ambition to "build China's most open AI ecosystem" is self-evident.

Similar to DingTalk, ByteDance's Kouzi platform not only supports its own "Doubao", but also connects to major external models such as Tongyi Qianwen, Moonshot AI, and MiniMax. On June 14, Kouzi also launched the "Model Square" function, which lets users select two anonymous models and score them based on the quality of the content they generate.



In addition, ByteDance has reportedly been accelerating its exploration of "AI + hardware" and has not hesitated to bring in talent through acquisitions.

According to Tech Planet, ByteDance's PICO has been developing multiple wearable devices, including headphones and speakers, since the second half of last year, and these devices will also be equipped with AI. The ByteDance Doubao team has also explored combining software and hardware on the basis of large models, an approach that has gradually been applied to devices such as learning machines, robot dogs, and robots.

According to 36Kr, the person in charge of ByteDance's AI hardware "D line" is Li Haoqian, who is the founder of Oladance, an OWS (Open Wearable Stereo) headphone brand acquired by ByteDance in March this year. The person in charge of another AI hardware line, "O line", is also the founder of a company acquired by ByteDance, and reports to Hong Dingkun, ByteDance's vice president of technology.

As for text-to-video, the track has only recently become popular, so followers, including ByteDance, still have time.

Recently, a developer told "Shijie": "Right now, I am using Kling to compose shots and reduce the workload in my workflow. I have not yet reached the stage of using it entirely for creation, so there is no dependence on it yet."

In the eyes of another developer, a short-video AIGC blogger, Kling still has a lot of room for improvement: "Relying on Kling's text-to-video generation cannot guarantee the consistency of a virtual human IP. I usually use Kling's image-to-video function, which means giving Kling a picture, having it generate dynamic videos from different perspectives on that basis, and then splicing them together to simulate camera movement. In fact, human operation still accounts for the larger share of the work."

An R&D team member at a domestic AI dating simulation product said: "In the current large-model application market, everyone is crossing the river by feeling the stones. How to commercialize is a question that is still too distant and too vague. But what is certain is that the more people use and play with a product, the more its optimization and iteration can be ensured."

(Ke Ran is a pseudonym)

Author | Dong Wenshu

Editor | Li Yuan

Operations | Liu Shan