China's self-developed video model launches globally

2024-08-02

Enter a text description or upload an image, and a realistic video is generated. Vidu (www.vidu.studio), a general-purpose video model developed independently in China, was recently launched worldwide. The model reportedly offers two core functions, text-to-video and image-to-video, with duration options of 4 seconds and 8 seconds and a resolution of up to 1080p. Generating a 4-second clip takes only 30 seconds.

A video frame generated by Vidu from a text description. (Photo provided by the interviewee)

Vidu was jointly developed by Tsinghua University and Beijing Shengshu Technology Co., Ltd., and was first unveiled at the 2024 Zhongguancun Forum Annual Conference in April this year. Zhu Jun, deputy dean of the Institute of Artificial Intelligence at Tsinghua University and chief scientist of Beijing Shengshu Technology, said Vidu is characterized by "long duration, high consistency, and high dynamics." It can generate high-definition videos from text and images while keeping the footage smooth and highly dynamic. Vidu currently supports generating videos of up to 32 seconds in a single pass.

"Vidu can simulate the real physical world and generate scenes with complex details that obey the laws of physics, such as plausible lighting and subtle facial expressions. It can also create surreal content with depth and complexity," Zhu Jun said. For science fiction, westerns, romance, animation and other film genres, he added, Vidu can generate clips in the corresponding style, as well as film-grade special effects such as smoke and glare.

A special-effects frame generated by Vidu from a text description. (Photo provided by the interviewee)

In terms of dynamics, Vidu can reportedly generate complex dynamic shots and supports large-scale, precise motion. Within a single video it can switch among shot sizes such as long shot, medium shot, close shot and close-up, and it can directly generate effects such as long takes, focus tracking and transitions.

The reporter learned from Beijing Shengshu Technology Co., Ltd. that, beyond the two basic functions of text-to-video and image-to-video, Vidu has launched two new features, "anime style" and "character consistency," to give users a more diverse and personalized video creation experience. In the image-to-video section, the "character consistency" feature lets users upload a portrait or a custom character image and then use a text description to have that character perform any action in any scene. The feature simplifies video production and gives creators more freedom.

Vidu reportedly requires no application for access; users can register directly with an email address and start using it. Its technological breakthrough stems from the R&D team's long-term work in machine learning and multimodal large models. The core technical architecture was proposed by the team in 2022 and has been developed independently since.

Source: Xinhua News Agency

Reporter: Wei Mengjia

Editor: Zhang Ziqing

Proofreading: Qin Daixin
