
GPT-4o mini tops the charts, and fine-tuning is free for 2 months: 2 million free training tokens every day

2024-07-26



New Intelligence Report

Editor: Peach

[New Intelligence Introduction] Just as the Llama 3.1 405B behemoth was open sourced, OpenAI stole the spotlight again: from now until September 23, developers get 2 million free training tokens per day to fine-tune GPT-4o mini.

On the same day that Llama 3.1 was open sourced, OpenAI made an announcement of its own.


GPT-4o mini can be fine-tuned for free, with 2 million training tokens available every day, for a limited time of 2 months (until September 23).


Developers who received the email urged each other to grab the bargain as soon as possible.


Meanwhile, GPT-4o mini's results also landed on the LMSYS Chatbot Arena leaderboard.

On the overall leaderboard, GPT-4o mini tied for first place with GPT-4o.


Altman himself said excitedly, "I have never been so excited about any evaluation. GPT-4o mini's performance is so close to GPT-4o's, at 1/20 the price!"


He also announced that fine-tuning for GPT-4o mini is now live.


Few expected OpenAI to make such a powerful model free for everyone to fine-tune.

Some netizens joked that the announcement looked like the most sophisticated phishing email ever.


2 million tokens per day: GPT-4o mini fine-tuning for free

In the email, OpenAI announced that it has officially launched fine-tuning for GPT-4o mini, to make its latest small model perform better on specific use cases.

From July 23rd to September 23rd, developers can use 2 million training tokens for free every day.


Usage beyond the free quota is billed at $3 per million training tokens.

After the two-month free period ends, fine-tuning training will likewise be charged at $3 per million tokens.
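Taking the numbers above at face value, the daily fine-tuning bill during the promotion can be sketched as follows. `daily_finetune_cost` is a hypothetical helper written for this article, not part of any OpenAI SDK; the constants come straight from the announcement.

```python
# Promotional pricing per the email: the first 2M training tokens each day
# are free; anything beyond is billed at $3 per million tokens.
FREE_DAILY_TOKENS = 2_000_000
PRICE_PER_MILLION_USD = 3.0

def daily_finetune_cost(training_tokens: int) -> float:
    """Return the USD cost for one day's fine-tuning token usage."""
    billable = max(0, training_tokens - FREE_DAILY_TOKENS)
    return billable / 1_000_000 * PRICE_PER_MILLION_USD

print(daily_finetune_cost(1_500_000))  # within the free quota -> 0.0
print(daily_finetune_cost(5_000_000))  # 3M billable tokens -> 9.0
```

So a modest daily training run stays entirely free, while a 5M-token day would cost about $9 during the promotion.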


In addition, the email laid out OpenAI's reasons why it is worth switching from GPT-3.5 Turbo to GPT-4o mini:

- More affordable: GPT-4o mini’s input token fee is 90% lower than GPT-3.5 Turbo, and its output token fee is 80% lower. Even after the free period, the training cost of GPT-4o mini is half that of GPT-3.5 Turbo.


- Longer context: GPT-4o mini's training context length is 65k tokens, 4 times that of GPT-3.5 Turbo, and its inference context length is 128k tokens, 8 times that of GPT-3.5 Turbo.

- Smarter and more capable: GPT-4o mini is smarter than GPT-3.5 Turbo and supports visual capabilities (although fine-tuning is currently limited to text).


Finally, the email noted that GPT-4o mini fine-tuning is open to enterprise customers as well as Tier 4 and Tier 5 developers, with access gradually expanding to all usage tiers.


For those who want to get their hands dirty, OpenAI has released a fine-tuning guide, which can be found here:

https://platform.openai.com/docs/guides/fine-tuning/fine-tuning-examples
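As the linked guide describes, fine-tuning jobs consume a JSONL file in which each line is a JSON object with a `messages` list of chat turns. Below is a minimal sketch of preparing and sanity-checking such a file; the filename and example content are invented for illustration.

```python
import json

# One training example per line: a "messages" list ending with the
# assistant reply the model should learn to produce.
examples = [
    {"messages": [
        {"role": "system", "content": "You write headlines in a dry, witty style."},
        {"role": "user", "content": "Summarize: central banks hold rates steady."},
        {"role": "assistant", "content": "Still waters run deep"},
    ]},
]

with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# Sanity check: every line parses, and each example ends with an
# assistant message.
with open("train.jsonl") as f:
    rows = [json.loads(line) for line in f]
assert all(r["messages"][-1]["role"] == "assistant" for r in rows)
print(f"{len(rows)} training example(s) ready")
```

Once a file like this is uploaded with purpose `fine-tune`, a job can be created against the GPT-4o mini model via the fine-tuning endpoint; see OpenAI's guide above for the exact API calls.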


Some netizens are less enthusiastic, suspecting the real goal is to collect user data to train and improve OpenAI's models.


"AKA, give me your private data and I'll charge you very little money."


Early user experiments

Netizens who gained access couldn't wait to start testing.

One developer fine-tuned GPT-4o mini on a dataset of headlines in the style of The Economist.


He then compared headline generation across GPT-4o, the original GPT-4o mini, and the fine-tuned model.


Small model dominates the leaderboard, comparable to GPT-4o

A week after GPT-4o mini's release, its Chatbot Arena results finally came out.

With more than 4K user votes collected, GPT-4o mini climbed straight to the top of the leaderboard, tied for first place with GPT-4o.

Most importantly, it's 20 times cheaper!


This is good news for many developers, as they can build more powerful applications at a lower cost.


In the mathematics category, GPT-4o mini's performance slipped to 9th place.


On hard prompts, GPT-4o mini still held steady, second only to GPT-4o and Claude 3.5 Sonnet.


In the coding category, GPT-4o mini also showed strong capabilities.


Many people questioned why GPT-4o mini ranks so high in the Arena.


The official explanation is:

- Chatbot Arena is evaluated on human preferences across different domains. We encourage everyone not only to look at the overall leaderboard but also to check the rankings in individual categories (such as mathematics and coding).

- Arena evaluation is done in real time. You are encouraged to compare models in Arena and verify your assumptions in real time.

- Transparency is our core value; all code and analysis are open source (http://github.com/lm-sys/FastChat). We regularly release 20% of the data and keep the rest to avoid overfitting and maintain the integrity of the benchmark.

- We will release a random 20% of the GPT-4o mini battle data according to our policy, and everyone can check the answers for themselves.

Other netizens, however, argued that GPT-4o mini's victory is really evidence that average human judges aren't that discerning.

And that, for the first time in history, AI has become smart enough to fool us. Kind of crazy, and kind of historic.



References:

https://x.com/moyix/status/1815840634013639086

https://x.com/HamelHusain/status/1815848198927434019

https://x.com/sama/status/1815877987696533897

https://x.com/0xSMW/status/1815869241205350641