"The most powerful and low-cost": OpenAI releases GPT-4o mini to join the small-model competition

2024-07-19

OpenAI launched GPT-4o mini on Thursday, Eastern Time, joining the race to build small, capable AI models. The company describes the new model as "the most powerful and low-cost model" and plans to add image, video, and audio support to it in the future.

More than 60% cheaper than GPT-3.5 Turbo, with better chat performance than competitors

The company said GPT-4o mini is available to ChatGPT Free, Plus, and Team users starting Thursday, and will be available to ChatGPT Enterprise users next week. In ChatGPT, GPT-4o mini replaces the older GPT-3.5 Turbo model. OpenAI said GPT-4o mini costs 15 cents per million input tokens and 60 cents per million output tokens, which is more than 60% cheaper than GPT-3.5 Turbo.
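To make the quoted prices concrete, the following minimal Python sketch estimates the cost of a workload from its token counts. The two rates come from the article; the function name and the sample token counts are hypothetical, and actual billing may differ from this simple arithmetic.

```python
from decimal import Decimal

# Prices quoted in the article, in US dollars per one million tokens.
INPUT_PRICE_PER_MILLION = Decimal("0.15")
OUTPUT_PRICE_PER_MILLION = Decimal("0.60")
ONE_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated cost in US dollars (hypothetical helper)."""
    input_cost = Decimal(input_tokens) * INPUT_PRICE_PER_MILLION / ONE_MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_PRICE_PER_MILLION / ONE_MILLION
    return input_cost + output_cost


# Illustrative workload: 2 million input tokens and 500,000 output tokens.
print(estimate_cost(2_000_000, 500_000))
```

At these rates, the sample workload costs 30 cents for input plus 30 cents for output, or 60 cents in total.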

The company also said that the new model currently outperforms GPT-4 on chat preferences and scored 82% on the Massive Multitask Language Understanding (MMLU) test. According to media reports, MMLU is a benchmark of textual intelligence and reasoning used to evaluate the capabilities of language models. A higher MMLU score indicates that a model can better understand and use language across a variety of fields, which makes it more useful in real-world applications.

According to OpenAI, GPT-4o mini's 82% is higher than the scores of two other low-cost competitors: Google's Gemini Flash scored 77.9% and Anthropic's Claude Haiku scored 73.8%.

Among larger models, GPT-3.5 scored 70% on this test, GPT-4o scored 88.7%, and Google claims its Gemini Ultra achieved the highest score ever at 90%.

Analysts note that smaller language models require less computing power to run, making them a more affordable way for companies with limited resources to deploy generative AI.

In addition, the new lightweight model supports text and vision in the API, and OpenAI says it will soon handle all multimodal inputs and outputs, including video and audio. With these features, it could act as a more capable virtual assistant that, for example, understands your travel itinerary and makes recommendations. For now, however, the model is mainly suited to simple tasks.

Competition among "small but capable" AI models is fierce, and OpenAI is the last to enter the game

Media reports show that Microsoft-backed OpenAI has a valuation of more than $80 billion. Although it still occupies a leading position in the generative AI market, the company is facing increasing competitive pressure. OpenAI also needs to find a way to make money because the company spends a lot of money on processors and infrastructure to build and train its models.

However, many companies cannot afford large, expensive models, so lightweight and inexpensive models may prove more popular. Until now, many developers have chosen Claude 3 Haiku or Gemini 1.5 Flash rather than pay the high computing costs of running the most powerful models. For example, a smaller model may be best for automating high-volume, basic tasks, while a larger model may handle more complex work. Some developers may want to use both kinds of model in one application.
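The idea of using a small and a large model in one application can be sketched as a simple router. This is an illustrative sketch only: the `Task` class, the `needs_complex_reasoning` flag, and the prompt-length threshold are hypothetical choices that do not come from OpenAI, and the model identifiers are used here merely as labels.

```python
from dataclasses import dataclass

# Model identifiers used as labels; the routing rule below is an assumption.
SMALL_MODEL = "gpt-4o-mini"
LARGE_MODEL = "gpt-4o"


@dataclass
class Task:
    prompt: str
    needs_complex_reasoning: bool = False  # hypothetical flag set by the caller


def choose_model(task: Task, max_simple_prompt_chars: int = 2000) -> str:
    """Send basic, high-volume tasks to the small model and the rest to the large one."""
    if task.needs_complex_reasoning or len(task.prompt) > max_simple_prompt_chars:
        return LARGE_MODEL
    return SMALL_MODEL


print(choose_model(Task("Extract the total amount from this receipt text.")))
print(choose_model(Task("Plan a multi-step data migration.", needs_complex_reasoning=True)))
```

In practice, a developer would tune the routing rule to the workload, for example by measuring how often the small model's answers are good enough for each task type.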

In an interview with the media, Olivier Godement, head of API products at OpenAI, explained why the company had not launched a small, capable model earlier. He said it was purely a matter of "priority": OpenAI had focused on creating bigger and better models, such as GPT-4, which require a lot of manpower and computing resources. Over time, OpenAI noticed that developers were increasingly eager to use smaller models, so the company decided that now was the time to invest resources in developing GPT-4o mini.

"Our mission is to make cutting-edge technology, build the most powerful and useful applications, and we certainly hope to continue to make cutting-edge models and promote technological progress," Godement said in the interview. "But we also want to have the best small model, and I think it will be very popular."

"I think GPT-4o mini truly embodies OpenAI's mission to make AI more widely accessible. If we want AI to benefit every corner of the world, every industry, and every application, we must make AI more affordable," Godement told the media.

GPT-4o mini can help employees concentrate

Godement said some developers have been experimenting with the model over the past week.

OpenAI had the fintech startup Ramp test the model. Ramp used GPT-4o mini to build a tool that extracts expense data from receipts: users upload photos of their receipts and the model sorts the data for them. Email client Superhuman also tested GPT-4o mini and used it to create a feature that automatically suggests email replies.

Initially, GPT-4o mini will be able to process and generate text and images. Once the final version is completed, OpenAI says it will be able to handle other types of content.

OpenAI also said that GPT-4o mini is the company’s first AI model to use its new safety strategy, called “instruction hierarchy.” The goal of this approach is to make AI systems prioritize certain instructions, such as those from companies, so that it is harder for people to make the tools do things they shouldn’t.

Analysts believe that GPT-4o mini is part of OpenAI's commitment to "multimodality," that is, offering a wide range of AI-generated media (such as text, images, audio, and video) in one tool: ChatGPT.

Last year, OpenAI COO Brad Lightcap told the media:

“The world is multimodal. If you think about the way we as humans process and engage with the world, we see things, we hear things, we speak — the world is more than just text. So having just text and code as a single modality, a single interface for us, is always going to feel incomplete because the power of these models and what they can do is so much more than that.”