
Video generation leaps forward: HiDream.ai's HiDream Big Model 2.0 reaches minute-level video length

2024-08-12


HiDream.ai's HiDream Big Model 2.0 has recently made a major breakthrough in text-to-video generation, extending the length of generated video from the 15 seconds reached last year to the minute level. This is another technological leap, following the model breaking the 4-second duration limit in December last year.

The text-to-video feature of HiDream Big Model 2.0 has improved significantly in duration, visual naturalness, and the consistency of content and characters, thanks to the company's self-developed DiT (Diffusion Transformer) architecture. Compared with the traditional U-Net architecture, DiT is more flexible and can effectively improve the quality of generated images and videos. The DiT architecture is built on the Transformer. To push its performance further, HiDream.ai developed the Transformer network structure, the composition of the training data, and the training strategy entirely in-house, with particularly in-depth research and refinement of the training strategy.

The model uses an efficient spatiotemporal joint attention mechanism, which fits the spatial and temporal characteristics of video and also addresses the slow training speed of traditional attention mechanisms. To support training on longer footage, HiDream Big Model 2.0 can process video clips lasting several minutes or even more than ten minutes, which makes it possible to output minute-long videos directly. HiDream.ai has also developed its own captioning model for video description generation, which produces detailed and accurate descriptions of video content.
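The article does not disclose how HiDream.ai implements its attention mechanism or what makes it efficient. As a rough illustration of the general idea of joint spatiotemporal attention only, the sketch below flattens every patch of every frame into one token sequence so that each token attends across space and time at once. The function name `joint_attention` and the input layout are hypothetical, and learned query/key/value projections are omitted for brevity.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def joint_attention(video):
    """Self-attention over all space-time tokens of a tiny video.

    video[t][p] is the feature vector (list of floats) of patch p in frame t.
    Returns (output with the same layout as video, attention weight matrix).
    Assumption: identity projections stand in for learned Q/K/V matrices.
    """
    patches_per_frame = len(video[0])
    # Flatten time and space into a single token sequence.
    tokens = [vec for frame in video for vec in frame]
    dim = len(tokens[0])

    weights = []
    flat_out = []
    for query in tokens:
        # Scaled dot-product score against every token in every frame.
        scores = [
            sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of all token vectors.
        flat_out.append(
            [sum(wi * tok[j] for wi, tok in zip(w, tokens)) for j in range(dim)]
        )

    # Restore the (frame, patch) layout.
    out = [
        flat_out[i:i + patches_per_frame]
        for i in range(0, len(flat_out), patches_per_frame)
    ]
    return out, weights
```

Because every token attends to every other token, the cost grows quadratically with the number of frames times patches, which is one reason long-video models look for more efficient variants of this basic scheme.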

In terms of training strategy, HiDream Big Model 2.0 is trained jointly on image data and video clips of different lengths, with the sampling rate for each clip length adjusted dynamically so that the model can learn from long shots. In addition, the model is further refined through reinforcement learning on user feedback data to optimize its performance.
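The article gives no details of the sampling schedule. One plausible reading of "dynamically adjusting the sampling rate" is a curriculum that shifts sampling probability toward longer clips as training progresses. The sketch below illustrates that reading only; the bucket names, the weights in `BUCKETS`, and the linear schedule are all assumptions, not HiDream.ai's actual settings.

```python
import random

# Hypothetical duration buckets: (name, weight at start, weight at end).
BUCKETS = [
    ("image", 4.0, 1.0),
    ("short_clip", 4.0, 2.0),
    ("long_clip", 1.0, 4.0),
]


def bucket_weights(progress, buckets):
    """Return normalized sampling probabilities for each bucket.

    progress is the fraction of training completed, in [0, 1].
    Weights are linearly interpolated between start and end values.
    """
    raw = {
        name: (1.0 - progress) * start + progress * end
        for name, start, end in buckets
    }
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


def sample_bucket(progress, rng):
    """Pick the data bucket for the next training batch."""
    weights = bucket_weights(progress, BUCKETS)
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    for progress in (0.0, 0.5, 1.0):
        print(progress, bucket_weights(progress, BUCKETS), sample_bucket(progress, rng))
```

With these illustrative numbers, long clips start at roughly 11% of batches and end at roughly 57%, while images and short clips remain in the mix throughout.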

By extending generation from the original 15 seconds to the minute level, HiDream Big Model 2.0 has reached an industry-leading video length. Another highlight of this upgrade is that video length and size are now variable. Earlier video generation models usually had a preset, fixed length that users could not change. HiDream.ai hands that choice to users: they can specify the video length directly or let the system decide based on the content of the input prompt. When the content is complex, the system generates a longer video; when the content is simple, it generates a shorter one. This dynamic adjustment adapts to the user's creative needs. The size of the video can also be customized. This flexible design greatly improves the user experience.

HiDream Big Model 2.0 has also significantly improved visual quality, with more natural and fluid object motion, finer detail rendering, and support for 4K ultra-high-definition output. With this upgrade, HiDream Big Model 2.0 is moving quickly toward generating higher-quality multi-shot videos and toward the L3 stage. The upgraded text-to-video feature is reportedly due to launch soon, giving users access to richer, higher-quality video generation services.

Industry insiders said that as HiDream Big Model 2.0 continues to improve, it is expected to bring further far-reaching changes to video content creation, help users monetize their creative work more easily, and open up broader room for growth across the industry.

(Source: Financial Information)
