2024-10-04
new wisdom report
I'll be joining Google DeepMind to work on video generation and world simulators! Can't wait to work with such a talented team.
The two years I spent at OpenAI creating Sora have been an amazing time. Thank you to all the passionate and kind people I worked with. Excited for the next stage!
Push one gourd under and another bobs up. Announcing a resignation on a release day seems to be turning into an OpenAI tradition.
Senior Google figures celebrated in the comments, including Jeff Dean, chief scientist of Google DeepMind and Google Research, and Logan Kilpatrick, product lead for Google AI Studio.
Denny Zhou, founder and lead of DeepMind's reasoning team, also joined in.
Alexis Conneau, the head of the "Her" project who resigned before the full release of GPT-4o, did not join Google, but he joined the joking online too: welcome to the club of former OpenAI employees.
It seems that Google's own video generation model, Veo, now stands a chance of surpassing Sora.
For now, Bill Peebles, Sora's other co-lead, is still working at OpenAI.
Although it was unveiled in February this year, Sora is still a "futures model": announced but not delivered, and open only to a small group of red-team testers and artists.
OpenAI has not given a clear timeline for when it will launch, unlike the "Her" project, which at least carries a "this fall" target.
With the CTO and now a project lead leaving one after another, Sora's future is once again uncertain.
Personal background
Tim Brooks co-led the Sora project at OpenAI, where his research focused on developing large-scale generative models that can simulate the real world.
He received his PhD from Berkeley AI Research, where his advisor was Alyosha Efros. During his PhD, he proposed a technique called InstructPix2Pix.
Before joining OpenAI, he worked at Google on the AI technology behind the Pixel phone camera, and at NVIDIA on video generation models.
He was also one of the main researchers on DALL·E 3.
Another part of his resume really stands out: his photography has won awards from National Geographic, Nature's Best Photography, and the National Wildlife Federation.
He has also performed at the Beacon Theatre on Broadway in New York and won awards for beatboxing at international a cappella competitions.
Commenters online said they envied him this kind of freedom.
Tim Brooks also writes in his bio, with a touch of humblebrag: "I am passionate about AI, and fortunately, this passion blends perfectly with my hobbies in photography, movies, and music."
He said that after joining DeepMind he will keep working on video generation and world simulators, and will continue to combine his passion for AI with his interests in photography and film.
From video generation to world simulation
In April this year, just two months after Sora was unveiled, co-leads Tim Brooks and Bill Peebles gave a keynote at an event organized by AGI House and laid out their view of video generation technology: "It will achieve AGI by simulating everything."
Text-to-video models such as Sora, with their demonstrated ability to generate complex scenes, are gradually showing a detailed understanding of human interaction and physical contact, which is an important step toward AGI.
To generate videos with realistic content and visuals, a model needs an internal representation of how objects and people move and interact in an environment. For that reason, they believe Sora will contribute to the development of artificial general intelligence.
In terms of methodology, both Tim Brooks and Bill Peebles placed particular emphasis on scalability. They argued that language models have been so successful because of their ability to scale, and cited the view from "The Bitter Lesson":
In the long run, methods that improve as they scale will ultimately win out as computing power increases.
By building a Transformer-based framework and comparing different Sora models, they showed how increasing training compute improves performance.
Going from the base model to one trained with 32 times the compute, the understanding of scenes and objects improves step by step.
We've always strived to keep our approach simple, although sometimes the reality is more challenging than it sounds.
Our main focus is to make something as simple as possible and then scale it massively.