Doubao large model releases video generation model with accurate semantic understanding and high-fidelity image quality

2024-09-27

Tan Dai, President of Volcano Engine

"As of September, average daily token usage of the Doubao large model has exceeded 1.3 trillion, and overall token usage has grown more than tenfold in four months. On the multimodal side, the Doubao text-to-image model generates 50 million images per day on average. In addition, Doubao currently processes 850,000 hours of speech per day," said Tan Dai, President of Volcano Engine.

On September 24, the 2024 Volcano Engine AI Innovation Tour was held in Shenzhen, presenting the latest progress of the Doubao large model. The Doubao large model family welcomed new members: the newly released Doubao video generation model, Doubao music model, and Doubao simultaneous interpretation model. The Doubao general model pro and vertical models such as the text-to-image model and the speech synthesis model have also been substantially upgraded. A growing range of modalities and large-scale usage continue to highlight the Doubao large model's advantages of "stronger models, lower prices, and easier implementation". Among them, the latest version of the main model, "Doubao general model pro", leads domestically across multiple dimensions, and its performance continues to improve.

Volcano Engine officially releases the Doubao video generation model

Comprehensively accelerating AIGC application innovation

Precise semantic understanding

Multi-action, multi-subject interaction

The Doubao video generation model can follow complex prompts, handling sequential multi-shot action instructions and interactions among multiple subjects.

Powerful dynamics and striking camera movement

Say goodbye to PPT-style animation

The model lets a video switch smoothly between large subject movements and camera moves. It supports a multi-shot camera language that includes zooming, orbiting, panning, scaling, and target following, and it controls the viewing angle flexibly for a true-to-life experience.

Consistent multi-shot generation

Tell a complete story in 10 seconds

The model overcomes the technical challenge of staying consistent across shot changes: it can switch between multiple shots within a single prompt while keeping the subject, style, and atmosphere consistent.

High fidelity and high aesthetic quality

Multiple styles and aspect ratios

The model supports a variety of styles, including black and white, 3D animation, 2D animation, and Chinese painting, and six aspect ratios (1:1, 3:4, 4:3, 16:9, 9:16, and 21:9), making it suitable for a range of devices and for formats from cinema to vertical mobile screens.
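As a rough illustration of how the six listed ratios map onto different screens, the following Python sketch picks the closest listed ratio for a given display size. The helper names (`closest_ratio`, `orientation`) are hypothetical and are not part of any Doubao or Volcano Engine API; only the list of ratios comes from the announcement.

```python
from fractions import Fraction

# The six aspect ratios named in the announcement, written as width:height.
SUPPORTED_RATIOS = ["1:1", "3:4", "4:3", "16:9", "9:16", "21:9"]


def parse_ratio(label):
    """Convert a 'W:H' label into an exact width/height fraction."""
    width, height = label.split(":")
    return Fraction(int(width), int(height))


def closest_ratio(width, height):
    """Return the listed ratio nearest to a width x height display.

    Nearness here is the plain absolute difference of width/height values,
    a simple illustrative choice rather than a documented rule.
    """
    target = Fraction(width, height)
    return min(SUPPORTED_RATIOS, key=lambda label: abs(parse_ratio(label) - target))


def orientation(label):
    """Classify a ratio label as square, landscape, or portrait."""
    ratio = parse_ratio(label)
    if ratio == 1:
        return "square"
    return "landscape" if ratio > 1 else "portrait"


if __name__ == "__main__":
    for size in [(1080, 1920), (1920, 1080), (2560, 1080)]:
        label = closest_ratio(*size)
        print(size, label, orientation(label))
```

A vertical phone screen of 1080 x 1920 pixels matches 9:16 exactly, while an ultrawide 2560 x 1080 display falls nearest to 21:9.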

Behind these strong visual results is ByteDance's sustained investment in research and development of large video model technology.

Video generation capabilities bring innovation to many enterprise scenarios. In e-commerce marketing, for example, the Doubao video generation model can quickly turn products into dynamic, multi-angle 3D displays, and can also swap backgrounds and styles for occasions such as the Mid-Autumn Festival, the Qixi Festival (Chinese Valentine's Day), and the Spring Festival, generating versions in different sizes that can be listed quickly. In animation and education, the Doubao video generation model can significantly reduce animation production costs and vividly present the plots of fairy tales.

Other application scenarios include urban cultural tourism, music videos, micro-films, and short dramas. In all of these, the Doubao video generation model can help reduce costs, improve efficiency, and keep creative output compliant.

As the Doubao large model family gains more members and its capabilities continue to improve, it lays a solid foundation for multimodal and diversified applications of large models. Volcano Engine will continue to upgrade and iterate on model capabilities, explore their application in more scenarios, and keep supporting enterprises that use large models to become intelligent on the cloud.
