Transformer author warns: You can’t beat OpenAI just by selling models!

2024-08-24

Taole from Aofei Temple

Quantum Bit | Public Account QbitAI

Aidan Gomez, the youngest of the eight Transformer authors, lamented in his latest interview:

It’s really not profitable to just sell models!

Aidan Gomez is one of the authors of Google's Transformer, which has had a profound impact on the field of AI.

Today, Aidan Gomez is co-founder and CEO of Cohere, which is valued at $5.5 billion. (Cohere previously launched the Command R series of open-source large models.)

In this conversation with Harry Stebbings, founder of 20VC, Aidan Gomez talked about the development trends of AI.

Some of these topics have attracted the attention and discussion of netizens, such as:

  • Scale is not the only way to improve model performance

  • Selling models alone is not enough to compete with OpenAI

  • AI startups should not rely on cloud vendors

  • Optimistic about the field of robotics, predicting a major breakthrough within 5 years

  • Data quality is critical to the model

For more details, see the text version below.

In addition to computing power, data and model innovation can also improve AI performance

Q: Before I start, I would like to ask you a question. Did you like playing games when you were a child?

Aidan Gomez: I do like games, and I’ve loved technology since I was a kid.

Q: That is to say, you will never start a game from a very difficult first level that makes people feel "it's impossible to complete, I don't want to play it anymore."

Aidan Gomez: Yes, this is called “curriculum learning” in machine learning. You start by teaching the model to do something very simple, and then gradually increase the complexity, building on the existing knowledge.

Interestingly, curriculum learning is actually a failure in machine learning. We don’t really do curriculum learning, we just throw the hardest and easiest material at the model at the same time and let it figure it out on its own.

But for humans, this works really well and is an important part of how we learn. It's really interesting to see that it hasn't been successful in machine learning.
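
To make the contrast concrete, here is a minimal Python sketch of the two data orderings described above. The `Example` type and its `difficulty` score are hypothetical, introduced only for illustration; this is not a description of how Cohere or anyone else actually trains models.

```python
import random
from dataclasses import dataclass


@dataclass
class Example:
    text: str
    difficulty: float  # hypothetical score: higher means harder


def curriculum_order(examples):
    """Curriculum learning: present the easiest examples first."""
    return sorted(examples, key=lambda ex: ex.difficulty)


def mixed_order(examples, seed=0):
    """The practice described in the interview: easy and hard material mixed together."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    return shuffled


data = [
    Example("2 + 2", 0.1),
    Example("prove a theorem", 0.9),
    Example("solve x^2 = 4", 0.5),
]
print([ex.text for ex in curriculum_order(data)])
print([ex.text for ex in mixed_order(data)])
```

Both functions return the same examples; only the order in which a training loop would see them differs.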

Q: You just mentioned throwing everything at the model, and I want to dig into this question directly. A lot of people say that we just need more computing power and performance will improve. Do you think this is correct? Are there other factors that limit performance improvement?

Aidan Gomez: Indeed, if you add more computing power to the model, or make the model larger, it does get better. This is the most reliable way to improve model performance, but also the dumbest.

For those with enough capital, this is a very attractive strategy with very low risk. You know it will get better, just scale the model, spend more money, buy more computing power. I believe this, I just think it is extremely inefficient.

There is a better way. Look at the last year and a half, for example, from the release of ChatGPT to GPT-4 and up to now. If GPT-4 does have 1.7 trillion parameters as people say, it is a huge MoE.

We now have models that are much better than this one, and they only have 13 billion parameters. So the speed of this change, or the extent to which costs are falling so rapidly, is simply incredible and even a little surreal.

So yes, you can achieve the quality of your model by scaling it up, but you probably shouldn’t.

Q: Will this kind of progress continue? I mean, will we continue to see progress at this scale or will it hit a plateau at some point?

Aidan Gomez: Yes, but it does require exponential input. You need to keep doubling computing power to maintain linear growth in intelligence. In principle, that growth can continue for a very, very long time.

It's going to get smarter and smarter. But you run into economic constraints. Not a lot of people bought the original GPT-4, especially businesses, because it was very large, expensive and inefficient to run, and it wasn't smart enough to justify that cost.

Therefore, there is a lot of pressure in the market to make models smaller and more efficient, and to make models smarter through data, algorithms, and methods, rather than just relying on expansion of scale.
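
The "exponential input, linear growth" point in this answer can be illustrated with a toy model. The formula below (capability proportional to the base-2 logarithm of compute) is an assumption chosen purely for illustration, not a measured scaling law, and `gain_per_doubling` is a hypothetical parameter.

```python
import math


def toy_score(compute, gain_per_doubling=1.0):
    """Toy assumption: each doubling of compute adds a fixed amount of capability."""
    return gain_per_doubling * math.log2(compute)


def compute_needed(target_score, gain_per_doubling=1.0):
    """Invert the toy model: required compute grows exponentially with the target."""
    return 2 ** (target_score / gain_per_doubling)


for target in range(1, 6):
    # Each extra point of "score" costs twice as much compute as the last one.
    print(target, compute_needed(target))
```

Under this assumption, going from a score of 4 to 5 costs as much additional compute as everything spent to reach 4, which is the economic constraint the answer describes.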

Q: Will we live in a world of smaller, more efficient verticalized models designed for specific use cases, or will there be a few large, one-size-fits-all models? Or will there be a mix of both?

Aidan Gomez: One trend we’ve seen over the last few years is that people like to prototype with a general, intelligent model. They don’t want to prototype with a specific model and spend time fine-tuning the model to make it particularly good at the things they care about.

What they want is to grab a big expensive model, prototype with it, prove it can do the task, and then distill it into an efficient model that excels in a specific area. So this pattern has really emerged.

As a result, we will continue to live in a world where multiple models coexist, some verticalized and focused, others completely horizontal.

Q: For example, OpenAI is spending $3 billion right now. How can you stay in this race unless you are a company like Microsoft, Amazon, Google, or Facebook?

Aidan Gomez: If you're just doing scale projects, you really need to be one of these companies, or a subsidiary of one of these companies. But there are a lot of other things you can do.

If you don't rely entirely on scale as the only path forward, if you believe in data innovation or innovation in models and methods, there are many directions to explore.

Q: Can we explore in depth what is data innovation and innovation in models and methods?

Aidan Gomez: Almost all of the major advances we’ve seen in open source have come from improvements in data. That means obtaining higher-quality data from the Internet: improving web crawling algorithms, parsing web pages, extracting the important parts, and adjusting the weight given to specific parts of the Internet, because there is a lot of duplicate and junk content online.

Extracting the most valuable, knowledge-rich parts and emphasizing them to the model helps, as does the ability to generate synthetic data. Synthetic data lets us obtain large amounts of text or web content without human involvement, because the data is generated automatically by a model.

These innovations, particularly the ability to improve data quality, are driving much of the progress we’re seeing today.
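
As a rough illustration of the data cleaning described above, the following Python sketch removes duplicate pages and filters out repetitive junk. The `quality_score` heuristic and the `min_score` threshold are hypothetical stand-ins; real pipelines use far more sophisticated filters, and nothing here is taken from Cohere's actual tooling.

```python
import hashlib


def normalize(text):
    """Lowercase and collapse whitespace so trivial variants hash the same."""
    return " ".join(text.lower().split())


def deduplicate(docs):
    """Drop exact duplicates (after normalization), keeping the first occurrence."""
    seen, unique = set(), []
    for doc in docs:
        digest = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique


def quality_score(doc):
    """Hypothetical heuristic: share of distinct words; repetitive spam scores low."""
    words = normalize(doc).split()
    return len(set(words)) / len(words) if words else 0.0


def clean(docs, min_score=0.5):
    """Deduplicate first, then keep only documents above the quality threshold."""
    return [d for d in deduplicate(docs) if quality_score(d) >= min_score]


pages = [
    "Buy now buy now buy now buy now",
    "The Transformer uses attention.",
    "the  transformer uses ATTENTION.",
]
print(clean(pages))
```

The third page is dropped as a duplicate of the second, and the first is dropped because it is mostly repeated words.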

Q: Okay, this is data innovation, what about model innovation?

Aidan Gomez: This involves things like new reinforcement learning algorithms. You know, there's been a lot of buzz about Q* and the changes that it could bring. Ideas around search, like how to search for solutions.

The current state of things is that you ask a question and the model needs to give you the correct answer immediately. This is an extremely demanding requirement for the model, right?

You can't do that with humans, you can't ask a person a hard question and expect them to spit out an answer immediately. They need time to think and process.

Q: They sometimes need a little brainstorming time.

Aidan Gomez: Yes, it does. So a very obvious next step in the evolution of models is that you need to let them think and solve problems. You need to let them make mistakes, try something, fail, understand why it failed, and then backtrack and try again.

Currently, there is no concept of problem solving in the model.
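
The "try, fail, backtrack, try again" loop described above is the classic shape of a backtracking search. The Python sketch below shows it on a toy puzzle invented for illustration (reach a target number using only "+3" and "*2" moves); it is not how any production model reasons, and it assumes a positive starting number so that every move increases the value.

```python
def solve(current, target, steps_left, path=()):
    """Depth-first search with backtracking over a toy puzzle.

    Each move either adds 3 or doubles the current number.
    Returns a tuple of moves that reaches the target, or None if this branch fails.
    """
    if current == target:
        return path
    if steps_left == 0 or current > target:
        return None  # dead end: abandon this branch so the caller can backtrack
    moves = (("+3", lambda x: x + 3), ("*2", lambda x: x * 2))
    for name, move in moves:
        result = solve(move(current), target, steps_left - 1, path + (name,))
        if result is not None:
            return result  # this attempt worked, so stop searching
    return None  # every attempt from here failed


print(solve(1, 11, steps_left=4))
```

The search first commits to adding 3 repeatedly, overshoots, and only then backs up to try doubling, which is the kind of trial and error a single-shot answer does not allow.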

Q: You mentioned problem solving. Is this the same concept as reasoning?

Aidan Gomez: Yes.

Q: Why is reasoning so difficult? Why don’t we have the concept of reasoning yet?

Aidan Gomez: Reasoning itself is not difficult. The difficulty is that we don’t have much training data on the Internet that shows the reasoning process. Most of what is on the Internet is the output of the reasoning process.

When you write something online, you don’t show your thought process, you go straight to your conclusions, your ideas, which are the result of a lot of thought, experience, and discussion.

So we lack such training data. It is not freely available, and you have to build it yourself. What companies like Cohere, OpenAI, and Anthropic are doing is collecting data that shows the human reasoning process.

You can’t beat OpenAI just by selling models

Q: Speaking of which, I’m wondering how you feel about competing with OpenAI’s user-generated content initiatives?

Aidan Gomez: It’s hard, especially in the enterprise space, where we face a huge challenge: the privacy and confidentiality of customer data.

Customers view their data as intellectual property that contains a lot of confidential information, so they don't allow us to use it for training. I completely understand this position. For this reason, we have shifted our focus to synthetic data and invested a lot of resources in this area.

We also built a team of human annotators and partnered with Scale AI, but since we are not a consumer-facing company, we had to generate the data ourselves, even though it put a lot of pressure on us.

Fortunately, our focus is relatively narrow, concentrated in areas where businesses have clear needs, such as automating financial and HR functions. This allows us to delve deeper into and address those specific needs.

Q: So what will the synthetic data market look like? Will it be dominated by two or three vendors?

Aidan Gomez: I've heard that the current large-model API market is dominated by synthetic data generation. Most people use these large, expensive models to generate data, which they then use to fine-tune smaller, more efficient models.

So they're basically distilling larger models. I don't know if that is sustainable as a market. But I do think there will always be new tasks, new problems, or new data needs. Whether this data comes from models or humans, we must meet these needs.
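
The workflow described in this answer (use a large model's outputs as training data for a smaller one) can be sketched in miniature. In the Python example below, `expensive_teacher` is a hypothetical stand-in for a large model, and the "student" is just a two-parameter line fitted by least squares; real distillation involves neural networks and text, so treat this only as an illustration of the data flow.

```python
def expensive_teacher(x):
    """Hypothetical stand-in for a large, costly model."""
    return 3.0 * x + 2.0


def generate_synthetic_data(inputs):
    """Label inputs with the teacher; no human annotation is involved."""
    return [(x, expensive_teacher(x)) for x in inputs]


def fit_student(pairs):
    """Fit a tiny student (slope and intercept) by ordinary least squares."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = cov / var
    return slope, mean_y - slope * mean_x


synthetic = generate_synthetic_data([float(i) for i in range(10)])
slope, intercept = fit_student(synthetic)
print(slope, intercept)
```

Once fitted, the student can answer new inputs without calling the teacher again, which is where the cost saving comes from.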

Q: One thing that worries me, or makes me hesitate, is that you see OpenAI competing on price, and you see companies like Meta releasing models for free without clearly explaining the value of open source and open ecosystems.

Are we seeing a true decline in the value of these models? Is this a race to the bottom, or even a race to zero?

Aidan Gomez: If you're just selling models, it's going to be a very tough game for the next while. It's not going to be a small market.

Q: There will be a lot of people who just sell models, and there will be some who sell models and other things.

Aidan Gomez: I don't want to name names, but I can say that, for example, Cohere now only sells models. We have an API through which you can access our models.

This is going to change very soon. The product landscape is going to change and we are going to add new things to the existing products. If you only sell models, it will be very difficult because it will become a zero-profit business and the price competition is too fierce. A lot of people are giving away models for free.

Still, it will be a big business, with demand growing very quickly, but profits, at least at this stage, will be slim.

That's why there's a lot of excitement at the application level. The discussion in the market is correct that value is accruing below the model layer, at the chip level, because everyone is investing heavily in chips to build these models, and above it, at the application level, in products such as ChatGPT, which charges $20 per user per month.

This seems to be where the value is happening right now. The model layer is an attractive business in the long term, but in the short term, as it stands, it is a very low-margin, commoditized business.

AI startups should not become vassals of cloud vendors

Q: Many people now think that it is too late for startups to enter the field of AI models. However, as the cost barrier is reduced, does this make it easier for startups to enter this field?

Aidan Gomez: Indeed, every year the cost of building last year’s model drops by 10x or even 100x. Thanks to better data and cheaper computing resources, the barrier to entry for previous-generation models has been lowered.

The problem is, no one really cares about outdated models. Last year’s models are almost worthless compared to this year’s models. Every technological advancement makes old technology quickly obsolete, and the cost of AI development is rising dramatically.

It might have cost only $10 million to develop version 1, but it might have cost another $1 million to $2 million to make version 2 a slightly better model. Now, it might cost $3 billion to develop a new model, and $5 billion to update it.

This growth is no longer linear, but orders of magnitude. I am not sure that the development of a new generation of technology is always cheaper than the previous one. Take chips and other complex technologies as an example. Although the development cost is rising, we continue to develop it because it is worth it.

Q: So what you're saying is that people don't actually care whether those improvements last?

Aidan Gomez: Exactly. What I mean is, improving these models is becoming increasingly difficult and the resistance is growing. Another interesting observation is that as models become smarter, the ability of average people (including me) to tell the difference between them decreases.

Because we have limited expertise in medicine, mathematics, physics, etc., we can't really feel these changes. Models have done a very good job of basic knowledge, which is the level of knowledge we have.

Therefore, it is difficult to feel the difference between different generations of models when we interact with them, but in fact, these models have made huge progress in some specific capabilities or pure intelligence.

As for whether it is worthwhile to continue to invest a lot of money to advance technology, I think the answer is yes. Even if these technologies may not be important to ordinary consumers, they are very valuable to researchers in certain professional fields.

We help them make more progress by providing those tools. It's like asking if we should continue to invest in the next generation of technology, like creating a new material for a spacecraft to make it more efficient in getting to orbit.

While this may not matter to most people, it is very important to those who need it, and there is market demand, which is what keeps technological progress going.

Q: Let's go back to the cost issue. Obviously, the cost is high and will continue to increase in the future. You mentioned earlier the concept of "effective affiliates".

With many companies being acquired or merged, and cloud services receiving a lot of attention as a driver of continued growth, do you think that in the next three to five years, most small model providers will be acquired by large cloud service providers?

Aidan Gomez: I think this space is definitely going to go through a consolidation and it’s already starting to happen. Many model developers have been absorbed by large cloud service providers such as Amazon.

I believe this will happen more often in the future. But it should be noted that becoming an affiliate of a cloud service provider can be risky. It is not a good sign for business development.

Normally, to raise money, you need to convince investors who only care about return on capital. But when you raise money from cloud service providers, the situation is completely different.

Q: So do you think venture capitalists have made money from model investing in the past few years?

Aidan Gomez: For Cohere's investors, they will definitely make a lot of money.

I’m happy for the people who believed in us. Jordan Jacobs from Radical Ventures, our first investor, is still on our board and very actively involved in building the company. I even call him the fourth co-founder of Cohere.

Q: Recent media reports said the company’s valuation is slightly higher than US$5.5 billion. Does this put pressure on you?

Aidan Gomez: It is a pressure, but it is also a positive pressure. Eventually every company will face the consideration of revenue multiples, which will eventually converge with the multiples of the public market.

I think we are actually in a much better position than many of our peers because our valuations haven't grown as crazy as some of the other companies. We still have a lot of growth to do, but I'm confident in the market.

Currently margins are under some pressure due to price competition and the free model, but this will change over time. Cohere's product portfolio will also continue to evolve and develop.

Nothing can replace humans

Q: If you were an investor at 20VC now, where do you think the opportunities are?

Aidan Gomez: The product space and application space are still very attractive. These technologies will bring new products, and they will change social media. People like to communicate with these models and the usage time is amazing.

Q: Do you think this is a good thing? I don't want my children to live in a world where they are communicating with generative systems, imitating humans. I don't want them to get satisfaction from having a conversation with a model.

Aidan Gomez: You might be wrong. You might want your child to be able to interact with an extremely compassionate, extremely intelligent, knowledgeable, and safe agent.

It can teach them things, play with them, it won't lose its temper with them, it won't get mad at them, it won't bully them, it won't make them feel insecure.

Certainly, nothing can replace humans. There’s no world where we all suddenly start dating chatbots and the human birth rate drops.

I don't think that's going to happen, right? I want to have a baby, and I can't have one with a ChatBot.

A human companion is far more valuable to me than any chatbot. Just like in the workplace, I don't think we can completely replace humans. AI will enhance human capabilities and make humans more efficient, but this does not mean that there will be fewer jobs.

You can't replace humans. Think about sales. If a robot were selling to me, I wouldn't buy. It's that simple. I don't want to talk to a machine.

Sure, some simple purchases could probably be handled by a robot, but for those that are really important to me and my company, I want the other party to be a real person who can take charge.

I need someone who has the power to intervene if something goes wrong. So I really think that whether it's on the consumer side, where we become addicted to talking to chatbots, or on the work side, where jobs are going to disappear and there's mass unemployment, I don't see that happening.

Q: I agree with your point, but I do worry about low-end jobs. For example, a customer service team may lose 70% to 80% of its employees, and there will definitely be partial replacement.

Aidan Gomez: There will certainly be some partial replacement. But overall, this will be growth, not replacement. Certain roles are really susceptible to technology, and customer support is one of them.

But ultimately, there will still be people who need to do these jobs, just probably in smaller numbers than today. But customer support is a tough role, it's a very psychologically draining job. If you've ever listened to those call recordings, you know it's an emotionally draining job.

Q: Yes, it's a bit like content moderation on social media platforms, which is also psychologically traumatic in many ways.

Aidan Gomez: Every day you wake up, go to work, and you get yelled at and have to apologize all day long. So maybe we should let the models handle those conversations and let humans handle the customer support issues that really require human help, like, solving a problem, not an emotional complaint, but an opportunity to make this person's life better.

The next big breakthrough in AI will come in robotics

Q: What do you think AI cannot do today, but will become a reality and bring about huge changes in three years?

Aidan Gomez: The next big breakthrough in AI will come in the field of robotics. Costs need to come down, but they are already coming down. And then we need more powerful models.

Q: Why do you think there will be a major breakthrough in the field of robotics?

Aidan Gomez: Because a lot of the barriers have disappeared. Previously, robots’ reasoners and planners were very brittle, you had to program them for each task, and they were hard-coded into a specific environment.

So you had to have a kitchen that was laid out exactly the same, with the same dimensions and nothing different, and that is very fragile. But in research, by using foundation models, the language models, people are actually developing better planners that are able to reason about the world more naturally.

So there are already a lot of companies working on this, and it may not be long before someone cracks the problem of making a general-purpose humanoid robot, making it cheaper and more stable.

It's going to be a huge shift. I don't know if it's going to happen in the next five years or ten years, but it's definitely going to happen in that time frame.

Data quality cannot be ignored

Q: It's really interesting to chat with you today. I want to do a quick-fire round: I give a statement and you immediately give your thoughts.

What has changed your perspective the most in the past 12 months?

Aidan Gomez: The importance of data. I severely underestimated it. I used to think it was all about scaling, but a lot of things that happened inside Cohere completely changed my understanding of what was important in building this technology.

Data quality is critical. Even one wrong example out of billions of data points can have a significant impact on the model. It's a little unreal. The sensitivity of the model to the data is so high, and everyone underestimates it.

Q: How much funding has your company raised so far?

Aidan Gomez: Approximately US$1 billion.

Q: Which round of fundraising is easiest?

Aidan Gomez: Probably the first round. It was just a simple conversation, where they said, “Here’s a couple million dollars, give it a try.” So I think that was a pretty easy round.

Q: When you raise $500 million, the process is definitely more complicated. When you saw the $500 million arrive, was it a little bit unbelievable?

Aidan Gomez: A little bit. Like $25 million a year, I'm not sure of the exact number, but it's a lot of money. Cohere changed my view of economics and money, and now $500 million is not a big number to me.

Q: Does this worry you?

Aidan Gomez: No, it is part of our strategy. If we are willing to accept such conditions, we can accept them. But our strategy is to remain independent and develop independently.

Q: If you could choose any world-class board member, who would you choose?

Aidan Gomez: Mike Volpi and Jordan Jacobs, who are now on my board of directors.

Q: Why do you think Mike is a good board member?

Aidan Gomez: Mike is amazing and seems to have been through it all. I can ask him questions about almost anything and he has been through similar experiences and can give great advice.

Q: Geoff Hinton and Yann LeCun, who do you prefer?

Aidan Gomez: Definitely Geoff. He and I have a closer personal relationship.

Q: Do you think Yann is too optimistic?

Aidan Gomez: No, I agree more with Yann's view on AI. Geoff is more inclined to doomsday predictions, while Yann is more optimistic. Although Yann is a bit like Elon Musk's "reply guy" now, Geoff is indeed a smart and thoughtful person.

Q: Last question, what is a question you think you’ve never been asked but should be asked?

Aidan Gomez: People always ask me about the future of technology and potential risks, but rarely discuss the opportunities that technology brings.

Q: So where do you hope the future of technology will go?

Aidan Gomez: I think we should use technology to make the world more productive, to increase supply and make things more abundant and cheaper. Productivity may not sound very sexy, but if a 5% productivity increase were applied to the NHS, it would have a major impact on the state of the country, the budget and the lives of millions of people.

So I think our first priority should be to increase productivity and growth.

Video address:
https://www.youtube.com/watch?v=FUGosOgiTeI