To solve the consistency problem of video generation models, Vidu, the Chinese version of "Sora", releases a "lock the subject with one picture" capability

2024-09-11

On September 11, Shengshu Technology held a media open day and released the "subject consistency" function, which keeps any subject consistent across generated output, making video generation more stable and controllable. The function is reportedly open to users free of charge.

Earlier, in late April, Vidu, an original video model jointly developed by Shengshu Technology and Tsinghua University, was released globally. It officially launched at the end of July and is now fully open for use.

Tang Jiayu, CEO of Shengshu Technology, told reporters, including those from Daily Economic News, at the open day that the "subject reference" function is intended to address the "uncontrollable" nature of video models. Current video models have two limitations: weak continuity and random output. Weak continuity means that the consistency of the subject, scene, style, and so on cannot be guaranteed from one generation to the next, which is particularly obvious in cases involving complex interactions. Random output means that results vary from run to run, so users must generate repeatedly by trial and error, and fine, accurate control over details such as camera movement and lighting effects is not possible.

Tang Jiayu, CEO of Shengshu Technology. Image source: photo by Li Shaoting, reporter of China Business Network

Previously, the industry tried a "generate images with AI first, then generate video from the images" approach: using AI drawing tools to produce storyboards, keeping the subject consistent at the image level, and then converting the images into video clips and editing them together.

With the "subject reference" function, users can upload a picture of any subject to lock its appearance, switch scenes freely through text descriptions, and output a video featuring the same subject. The function is not limited to a single type of object but is aimed at "any subject", including people, animals, and products, as well as anime characters and fictional subjects.

Daily Economic News
