
Hands-on with China's large AI video models: which one can surpass Sora?

2024-08-10


Cover News reporter Xiong Yingying
At the beginning of this year, the American company OpenAI released the AI video generation model Sora. It came like a bolt from the blue and opened up new possibilities for applying artificial intelligence. At the time, many netizens lamented that the gap between China's AI technology and that of other countries was growing ever wider.
However, in just half a year, "domestic Soras" such as Keling, PixVerse V2, Qingying, and Vidu have launched one after another, and all are free for users to try.
Which domestic video model is the best? As the technology keeps advancing, who will take the lead in commercialization?
Four "domestic Soras" launched within a single month
A reporter's hands-on test
While the large-model price war among major domestic tech companies is in full swing, some companies are focusing on AI video generation. According to incomplete statistics, China now has more than 10 large AI video models, and in July this year alone, 4 "domestic Soras" went online.
On July 6, Kuaishou officially launched the web version of Keling AI, which offers text-to-video and image-to-video functions and can generate videos up to 10 seconds long. It also added new features such as camera control and custom first and last frames.
On July 24, Aishi Technology officially released PixVerse V2 worldwide. The model can generate several video clips at once, with a single clip running up to 8 seconds and multiple clips totaling up to 40 seconds.
Qingying, built by Zhipu AI, and Vidu, developed independently by Shengshu Technology, were released soon after. Qingying emphasizes fast generation within 30 seconds, while Vidu adds anime-style video clips alongside the usual realistic style.
The video generation models of all four companies are now open for testing. After a quick registration by phone number or email, the reporter tried each of them.
To test the "image-to-video" feature, the reporter uploaded the same picture of a rose that had not yet bloomed to all four model websites and entered the prompt "flower blooming". Qingying and Vidu both generated videos of the rose blooming. In the videos generated by PixVerse and Keling, the flower sways but never blooms. When the reporter changed the prompt to "a flower slowly blooming", Keling also generated a video of the rose blooming. The different models evidently differ in how well they process and understand language.
Screenshots of the videos generated by the four domestic large models
In terms of generation speed, Vidu was the fastest, producing a 3-second video in less than 1 minute. The other three models each finished within 5 minutes. Although Qingying advertises "30-second fast generation", its generation page showed an "estimated waiting time of 3 minutes", perhaps because too many people were trying it at once.
Judging from netizens' feedback on social platforms, all of the models have, to varying degrees, shown problems such as distorted characters and missing images.
“Many are watching, but few are taking action”
Investment in large-scale models tends to be cautious
When Sora appeared earlier this year, there was a great deal of pessimism online, with some people believing that China had been left far behind by the United States in AI. Yet only half a year later, a number of large AI video models positioned as rivals to Sora have emerged in China.
Tianyancha shows that Zhipu AI, founded in 2019, has completed its Series C financing and is currently valued at over 10 billion. Although Aishi Technology and Shengshu Technology were both founded only in 2023, they have already completed three and four rounds of financing respectively. Does this mean that investors are still enthusiastic about the large-model sector?
"Basically, there are still many people watching, but very few people taking action." Guo Tao, an angel investor and artificial intelligence expert, said that the most active investors are still a handful of Internet giants, which have invested widely across multiple large-model projects. First, large companies can find scenarios within their existing businesses in which to apply large video models. Second, where these models overlap with a company's own business, they can supplement its product line. Third, large companies can support these model developers to some extent with their existing ecosystem resources.
Zhipu AI Company has raised over 100 million yuan in multiple rounds of financing
Overall, domestic investment institutions remain conservative and cautious about large models for text-to-video and image-to-video generation. The main reason is that commercializing them faces many challenges.
In Guo Tao's opinion, the videos generated by both Sora and the domestic models still have defects. For example, AI-generated characters sometimes have an extra finger, or a basketball shot fails to go through the hoop. This shows that the models do not yet adequately understand the spatial relationships between objects and that the underlying algorithms need further improvement.
Beyond the technical issues still to be overcome, the biggest pain point in commercializing large AI video models is the lack of mature application scenarios.
"For example, Kuaishou itself has a platform and content, and relatively speaking has certain application scenarios, which may be useful to many self-media creators," Guo Tao said. For purely technical companies, however, it is genuinely difficult to find a strong, must-have scenario that users are willing to pay for.
Platforms accelerate commercial exploration
Short dramas are expected to be the first market to see real adoption
Even though there are challenges in commercialization, large model platforms at home and abroad are actively exploring and experimenting with commercialization.
The reporter noticed that domestic large AI video models have now begun charging consumers (the "C-end"). On July 24, the official WeChat account of Keling AI revealed that more than 1 million users had applied for access. On the same day, it launched a paid membership system with three tiers: gold, platinum, and diamond. Annual membership prices range from more than 500 yuan to more than 5,000 yuan.
PixVerse adopts a subscription payment model, including basic version, standard version and unlimited version, with unit prices ranging from 5 yuan to 60 yuan.
However, many industry insiders said that at present, the computing power cost and customer acquisition cost of large AI models are very high, users' payment habits have not yet been formed, and the market competition is extremely fierce. It is not easy to achieve profitability relying solely on C-end payments.
According to media reports, in June this year the world-renowned children's toy brand Toys "R" Us worked with OpenAI on a one-minute commercial, "The Origin of Toys 'R' Us", produced with Sora, which further demonstrated the feasibility of using Sora to generate commercial advertising.
In July, "The Mirror of Mountains and Seas: Cutting Through the Waves", described as the first domestic original AIGC fantasy short drama, was officially launched. The series has 5 episodes and a total running time of 15 minutes. The sharp-featured boy protagonist, the fantastical Kunpeng beast, and other elements of the drama were all generated by AI.
As more and more production companies and platforms begin to explore the integration of "AI + micro-short drama", the micro-short drama market may be the first place where large AI video models find commercial application.