news

sergey brin: google didn't dare to deploy the transformer, all the paper's authors left, and now i write code every day

2024-09-12


machine heart report

editor: jiaqi

google has long been the leader of silicon valley, with the world's largest search business. the lucrative advertising revenue from search allowed its two founders, sergey brin and larry page, to step back and enjoy life.

on september 15, 1997, sergey brin and larry page registered a website called "google".

it was not until chatgpt took the world by storm at the end of 2022 that google, once the leader of the ai wave, seemed to realize its position had flipped. over the past year, we have grown used to seeing this technology giant cast as a "follower".

since last year, media reports have said that sergey brin has returned to the front line and is writing code himself. former ceo eric schmidt even directly criticized google's lax culture of effectively "working one day a week" in a lecture at stanford university, warning that a company like that would keep losing, first to openai and then to startups.

schmidt's speech at stanford

at the same time, as google grows in size, some symptoms of "big company disease" are becoming more and more obvious. many google resignation "essays" show that the root cause of google's problems is not "technology" but "culture", such as employees' lack of sense of mission and the company's cumbersome systems and processes to avoid risks.

appsheet founder praveen seshadri announced his departure from google, saying in his blog that the company had lost its way and employees were trapped in the system.

what's wrong with google? the alphabet workers union said: "what really hinders google employees from being productive every day is understaffing, changing priorities, frequent layoffs, stagnant wages, and lack of follow-up from management on projects."

although google has caught up in the race that chatgpt set off, its style is quite different from that of openai, whose comment sections eagerly await gpt-5. gemini releases always seem to stumble somewhere: the first launch used a staged demo, and since then gemini has been criticized for generating racially biased portraits, suggesting that people eat a rock a day, and recommending glue to keep cheese on pizza.

last month, google released an enhanced version of gemini and launched a voice assistant called gemini live that competes with gpt-4o, but gemini live still made mistakes during the demonstration.

at the made by google event in august, the demonstrator's first two attempts to use gemini live's camera recognition feature failed; only on the third try, after switching to a different phone, did it succeed.

why would someone return to the forefront of technology when they are already financially independent? how does google view the frequent failures of gemini? what problems does google have in the competition among technology giants? what role will it play in this competition? at the all-in summit held yesterday, sergey brin, who has not appeared in front of the media for a long time, talked about his views in an interview.

brin's main points are:

he decided to return to the forefront of technology because the progress in the field of ai is so exciting, and as a computer scientist, he does not want to miss this wave.

ai technology is not just an extension of search; it will lead to broader changes.

compared with "expert models" that specialize in one field, brin is more optimistic about general models. google's imo silver-medal result grew out of earlier attempts to fold some of the knowledge and capabilities of its formal-proof models into a general language model.

demand for computing power is constant, but a surge from 100 megawatts to 1 gigawatt, 10 gigawatts, or even 100 gigawatts is, in his view, unlikely.

in the field of artificial intelligence applications, brin believes that biology has already achieved a good level of practical application of ai technology, while the field of robotics is still at the stage where people feel it is magical after watching the demonstration, and has not yet reached the level for daily use.

although ai occasionally makes big mistakes, it should still be released promptly. ai is not a technology to clutch tightly and hide away until it is perfect. worse than ai "making stupid mistakes" was google being too timid to deploy the transformer back then, after which all of the paper's authors left.

competition among technology giants in the field of ai is actually a good thing, but brin will still keep a close eye on the big model rankings.

the following is the full interview:

brin: i originally thought i was just here for a podcast; i didn't expect such a large audience. congratulations on a successful event. i'm a little shy.

host: thank you for taking the time to chat with me. today, ai is at a tipping point of changing the world. in 1998, you and larry page founded google, and i hear you have recently returned to work on ai there. many industry analysts and experts debate whether large language models and conversational ai tools are a threat to google search. so how much time do you spend in the office at google these days? what do you do?

brin: honestly, i go to work almost every day, but i missed a day today because i had to be on your show. as a computer scientist, i have never seen such exciting progress in ai as in the last few years. the progress in ai is really amazing!

back in the 1990s, when i was a graduate student, ai was barely a topic in the curriculum, and at best was just a footnote in the textbook. the textbook said that people had done all kinds of experiments, but ai really didn't work, and that working on ai was a "dead end." that's all you need to know about ai.

and then somehow, miraculously, these people working on neural networks started making progress with ai methods that had been discarded in the 60s and 70s: more computation, more data, smarter algorithms... it's amazing what's happened in the past decade. today's ai tools show new capabilities almost every month, and those capabilities are compounding very quickly. it's really amazing what computers can do now. so i decided to get back to the front line of technology, because as a computer scientist i didn't want to miss out on any of it.

host: do you think ai is an extension of search, or will it redefine the way people retrieve information?

brin: i think ai touches every aspect of our daily life, search being one of them. ai is affecting almost everything, including programming. i now see programming differently: writing code from scratch feels really hard compared to just directing an ai to write it, right?

host: what have you written using ai?

brin: i actually write a little bit of code myself, just for fun. i also sometimes have ai write code for me, which is fun. for example, i wanted to know how well google's ai model played sudoku, so i had the model write a bunch of code to automatically generate sudoku puzzles, and then fed those puzzles back to the ai for grading. the ai was perfectly capable of writing that code.

but when i mentioned this to the engineers, they debated back and forth for a few rounds about how to do it. i came back half an hour later and found the ai was already done. they were shocked; clearly they didn't use ai coding tools as often as i had assumed.

sudoku
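the kind of throwaway code brin describes is easy to picture. below is a minimal, hypothetical python sketch of the same task, not google's actual code: it builds a completed sudoku grid from a well-known cyclic pattern, relabels the digits at random, blanks out cells to form a puzzle, and includes a validity checker for completed grids. the function names are illustrative assumptions, and the generator makes no attempt to guarantee a unique solution.

```python
import random

def base_solution():
    # classic cyclic pattern that always yields a valid completed sudoku grid
    return [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)]
            for r in range(9)]

def is_valid_solution(grid):
    # every row, column, and 3x3 box must contain the digits 1..9 exactly once
    digits = set(range(1, 10))
    rows = all(set(row) == digits for row in grid)
    cols = all({grid[r][c] for r in range(9)} == digits for c in range(9))
    boxes = all(
        {grid[br + r][bc + c] for r in range(3) for c in range(3)} == digits
        for br in (0, 3, 6) for bc in (0, 3, 6)
    )
    return rows and cols and boxes

def make_puzzle(blanks=40, seed=0):
    # relabel the digits of the base solution, then blank out cells (0 = empty);
    # note: a real generator would also check that the puzzle's solution is unique
    rng = random.Random(seed)
    relabel = dict(zip(range(1, 10), rng.sample(range(1, 10), 9)))
    grid = [[relabel[v] for v in row] for row in base_solution()]
    cells = [(r, c) for r in range(9) for c in range(9)]
    for r, c in rng.sample(cells, blanks):
        grid[r][c] = 0
    return grid
```

puzzles produced this way could then be fed back to a model and its answers scored with `is_valid_solution`, roughly the grading loop the anecdote describes.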

host: that's hilarious. some models are good at solving sudoku puzzles, some at answering factual questions, and some specialize in designing houses. at the same time, many researchers are working on general-purpose large language models. which direction do you think will win out?

i don't know where this idea of a "god model" came from, but it's why investors are pouring money into ai: once the "god model" is built, you leapfrog everything; once you have agi, you can rule everything. or will there instead be many small models for specific applications, collaborating as intelligent agents? how do you think model development and application will evolve?

brin: if you look back 10 to 15 years ago, different ai techniques were used to solve completely different problems. for example, ai for playing chess was very different from image generation technology. they were very different.

host: just like the gnn model google released recently, which outperformed all physics-based prediction models. i'm not sure if you know it, but it was indeed released by google.

brin: that's awesome, but i don't know it (laughs awkwardly).

host: this model is a completely different architecture.

brin: from a historical perspective, there have indeed been many different systems in ai. take the recent international mathematical olympiad (imo) as an example: google's model won the silver medal, just one point away from gold (for details, see synced's earlier report: google ai wins the imo silver medal, the mathematical reasoning model alphaproof is released, and reinforcement learning is so back).

we actually have three ai models: one for theorem proving, one for geometry, and a general language model. however, a few months ago, we started trying to learn from our previous work and started to incorporate some of the knowledge and capabilities of the formal proof model into the general language model.

this is still work in progress, but i think the trend will be towards building a more unified model. i'm not sure it's the so-called "god model", but it's certain that we are moving towards some kind of shared architecture or even a shared model.

host: if this is the direction of the future, then in order to train and improve that super-large model, huge computing resources will be required.

brin: computing power is indispensable. i have read some articles that predict that the demand for computing power will surge from 100 megawatts to 1 gigawatt, 10 gigawatts, or even 100 gigawatts. i have reservations about this. in recent years, algorithm innovation and optimization have brought more significant performance improvements than increasing hardware computing power.

host: so is the current massive investment in computing power unreasonable? everyone talks about nvidia's revenue, profits, and market value. it underpins the growth of hyperscale computing and infrastructure, making it possible to build these huge models. does that trend really not make sense? maybe it does make sense; otherwise, why would nvidia make so much money?

brin: first of all, i am not an economist or market analyst; my views are only those of a computer scientist. we are building computing clusters as fast as possible because demand is huge. google cloud customers just want lots of tpus, gpus, and everything else. we have to turn customers away because we don't have enough chips ourselves, and we also rely on these resources internally to train and deploy our own models. so i think it's reasonable that the major companies are actively expanding computing power right now. i just think it's hard to extrapolate from the current situation to future demand growing from "100 mw to 1 gw, 10 gw, or even 100 gw".

host: but the business needs are there.

brin: i understand that customers have a wide range of needs. they want to perform reasoning tasks on various ai models and apply these models to endless new scenarios. their needs are currently unlimited.

host: in ai applications, whether robotics or biology, which areas do you think have achieved the most? are there use cases that make you think "wow, this is so useful"? and which areas are harder and may take longer than expected to materialize?

brin: my answer is biology. alphafold has been around for a while now, and when i talk to biologists, almost everyone is using it. the latest version, alphafold 3, represents a new kind of ai technology. as i said before, i believe the future trend is the unification of models.

when it comes to robots, i’m in a “wow phase”, like, “wow, robots can actually do housework!” but you have to understand that behind it may just be a slightly tweaked general language model, and although it’s amazing, in most cases, they haven’t reached the level of daily use.

host: do you see the prospects of robots?

brin: maybe... but i didn't see any specifics...

host: but didn’t google also have a robotics business? although it was later separated and sold.

brin: google was in the robots business.

host: maybe it's just the wrong timing.

brin: frankly, that was probably because we were ahead of our time. boston dynamics had so many hit demos, but i can't even remember everything google had; anyway, we had five or six products that were cool and impressive but also a bit embarrassing. looking at how capable general language models are now, and how multimodal technology lets robots understand scenes, it seems a bit silly in hindsight: there was no such ai technology back then, so it was like running on a treadmill, hard to move forward.

google once held a strong hand in robotics: andy rubin, the father of android; boston dynamics, the famous robot maker; atlas, the famous humanoid robot... yet within just five years, the program was disbanded and reorganized, then disbanded and reorganized again. executives left one after another, sales plans were halted, and several major subsidiaries were sold off...

host: you have invested a lot of time in core technology r&d. do you also put considerable energy into products? in a future where ai is everywhere, how will human-computer interaction evolve, and how will our daily lives change?

brin: this seems to be a topic for conversation with colleagues in the tea room.

host: would you mind sharing it with us?

brin: hold on, i'm struggling to recall an example that isn't embarrassing.

host: you can also tell the story of "you have a friend".

brin: it's really hard to say what the future will look like. ai technology is the foundation on which applications are built. for example, someone releases an amazing demo, but it takes time to go from demo to actual production. i don't know if you've tried the astra model: you can stream real-time video to it, and it can tell you what is happening in your environment.

host: you can use it, right?

brin: i will definitely get access. sometimes i may be the last one to get access. we’ve gotten to the point where people experience ai and say, “oh my god, this is amazing.” and then you think, “okay, it works 90% of the time.” but then you question, “if it’s wrong or slow 10% of the time, is that really good enough?” so we have to work on those details and make sure it’s fast and reliable and so on. when that happens, it’s really an amazing achievement.

host: i heard a story that i should have told you before i went on stage. before a press conference, a group of engineers showed you that ai could write code, and they said, "we haven't deployed it in gemini yet because we want to make sure it doesn't go wrong." google has some of this "hesitant" corporate culture. at the time, you said, "no, since it can write code, it should be launched." many people have told me this story. because they think, "it's extremely important to hear such remarks from you, the founder, that it shows that conservatism has not completely taken over google, and we look forward to seeing google continue to lead innovation." is this description accurate? did you really say that?

brin: i don't remember the details, but it does sound like something i would do, to be honest.

host: to me, that becomes a problem because google is so big that if they make a mistake, the losses would be huge.

brin: there are still things i'm afraid of. modern language modeling traces back to the transformer paper six or eight years ago, but all of that paper's authors have since left google. congratulations to them! at the time, we were too timid to deploy the transformer.

brin: and no matter how powerful ai is, they will still make mistakes sometimes and still say embarrassing things. but at the same time, ai can already help us do things we have never done before. for example, i program with my children to deal with some extremely complex problems.

just by asking the ai, they can start programming and pick up all kinds of complex apis and tools that would normally take a month to learn. that ability is almost magical. we need to be prepared for some mistakes and be willing to take risks, and i believe we've improved in that regard. of course, you may have seen many moments when ai makes a fool of itself, but...

host: that's acceptable. after all, you're already financially free and have a huge amount of stock. i mean, you're willing to accept these embarrassments because it's very important to do so at this stage.

brin: i'm not doing this for my stock, okay? but think about it, am i really okay with these mistakes? is this the magical thing we're presenting to the world? i think what we need to convey is, "look, this thing is magical." ai will occasionally make big mistakes, but i think we should release it in time and let people experiment with it and see what new applications they can find. ai is not the kind of technology that you hold tightly in your arms and hide until it's perfect.

host: ai is having such a profound impact on the world and creating such huge value that this is no longer a simple competition between google, meta, and amazon. everyone sees it as a business war, but is it possible that the pie ai creates is so big, and the areas you're exploring so broad, that it's about far more than who built the highest-scoring model or who has the best llm benchmarks? how do you view the broad prospects ai brings, and what role will google play?

brin: i think competition is very helpful in a way, because all the big tech companies are competing. and by the way, a few weeks ago google was number one on a certain leaderboard, and the last time i checked we were still beating the top models. it's just...

host: except on a few metrics! so you do care about model scores after all.

brin: i didn't say i don't care. when chatgpt came out, google really was behind, and now we've made great progress. i'm very happy with all the progress google has made, so yes, we'll definitely keep a close eye on the model leaderboards. i think it's a good thing that there are so many ai companies, whether openai, anthropic, or mistral; it means the ai field is expanding rapidly and is very vibrant.

to answer your question, i think ai has tremendous value for humanity. if you think back to my college days, there was no internet as we know it today, and it took a lot of effort to get basic information and communicate with people. before the popularization of mobile phones, we had already achieved a huge increase in capabilities worldwide, and today's ai technology is undoubtedly another major leap in capabilities. now, almost everyone can access ai in some way. i think this is very exciting and it's really great.