
How does o1 “reason” like a human? Q&A with the OpenAI research team: model details, future plans, and tips for maximizing o1 performance

2024-09-15


On September 13, OpenAI announced the official launch of the o1 series of models, marking a new era in AI and the arrival of large models that can perform general, complex reasoning tasks. After the release, the OpenAI research team held an AMA (Ask Me Anything) event on the X social platform, revealing many details of the o1 series.

Image source: X social platform
  • "Alien"-level AI assistant

OpenAI said the o1 series includes two versions: o1-preview, an early iteration of the full model, and o1-mini, a faster, lightweight version. Before giving a final answer, o1 generates a long, hidden chain of thought, demonstrating human-like reasoning ability. Researchers compare o1 to an "alien" with superpowers.


o1’s reasoning performance comes from reinforcement learning. Although there are currently no plans to show these chain-of-thought tokens to API or ChatGPT users, the researchers revealed that instructions embedded in prompts can influence the way o1 thinks. GPT-4o cannot reach o1’s chain-of-thought performance through prompting alone.


The o1 series models use the same tokenizer as GPT-4o, so input token counts are consistent between the two. However, o1 can handle longer, more open-ended tasks, reducing the need for input chunking. In the future, o1 will also support larger input context windows.

o1 also demonstrated impressive reasoning and generalization abilities, such as cracking codes, pondering philosophical questions, and assessing its own abilities through self-tests.

The research team also revealed that o1-preview performs comparably to, or slightly better than, GPT-4o on some personalized writing tasks.

  • Is the mini version even more powerful?

Compared with o1-preview, o1-mini is optimized for size and speed.


Although it may be limited in some areas, such as world knowledge, o1-mini shows its strengths in STEM (science, technology, engineering, mathematics) and code-related tasks. In addition, o1-mini can explore more chains of thought than o1-preview.

  • o1 will soon support tool integration and multimodal understanding

Although o1-preview does not currently use tools, OpenAI plans to add capabilities such as function calling, a code interpreter, and web browsing. Tool support, structured outputs, and system prompts will also be released in future updates.


In addition, the OpenAI developer team stated that users will eventually be able to control o1’s thinking time and token limits, and promised to actively work toward delivering this feature.

OpenAI is also working on streaming support and reasoning-progress feedback in the API. In addition, o1 already has built-in multimodal capabilities and is expected to reach state-of-the-art performance on multimodal benchmarks such as MMMU.

  • o1-mini has a limit of 50 prompts per week

o1-mini is currently available to ChatGPT Plus users, with a limit of 50 prompts per week. All prompts count toward the same quota. OpenAI promises to gradually expand API access tiers and raise rate limits, and to offer volume pricing discounts once the restrictions are relaxed.

The pricing of the o1 models is expected to follow the trend of a price reduction every 1-2 years. In addition, personalized fine-tuning support is already on the product roadmap, but the specific release schedule is still unclear.

  • Tips for maximizing o1 performance

o1-mini is currently trained on data up to October 2023, and future iterations will augment its world knowledge with newer datasets.

To make full use of o1’s reasoning strengths, the team recommends that users write informative prompts, provide concrete examples that cover edge cases, and clearly specify the required reasoning steps and style. Note, however, that irrelevant context may interfere with the model’s reasoning process.
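As a rough illustration of that advice, the sketch below assembles a prompt from a task description, worked examples (including edge cases), explicit reasoning steps, and a style note, and leaves out everything else. The `PromptSpec` class, the `build_prompt` function, and the section labels are hypothetical names chosen for this example; they are not part of any OpenAI API or an official prompt format.

```python
from dataclasses import dataclass, field


@dataclass
class PromptSpec:
    """Hypothetical container for the pieces the team recommends including."""
    task: str
    examples: list[tuple[str, str]] = field(default_factory=list)  # (input, expected output)
    reasoning_steps: list[str] = field(default_factory=list)
    style: str = ""


def build_prompt(spec: PromptSpec) -> str:
    """Assemble a plain-text prompt, skipping any section that is empty."""
    parts = [f"Task: {spec.task.strip()}"]
    if spec.examples:
        lines = [f"{i}. Input: {inp} -> Expected: {out}"
                 for i, (inp, out) in enumerate(spec.examples, start=1)]
        parts.append("Examples (including edge cases):\n" + "\n".join(lines))
    if spec.reasoning_steps:
        lines = [f"{i}. {step}"
                 for i, step in enumerate(spec.reasoning_steps, start=1)]
        parts.append("Reasoning steps to follow:\n" + "\n".join(lines))
    if spec.style:
        parts.append(f"Answer style: {spec.style.strip()}")
    # Only sections relevant to the task are included; unrelated context is left out.
    return "\n\n".join(parts)


spec = PromptSpec(
    task="Decide whether a year is a leap year in the Gregorian calendar.",
    examples=[("2024", "leap"), ("1900", "not leap"), ("2000", "leap")],
    reasoning_steps=["Check divisibility by 4.",
                     "Check the century rules (100 and 400)."],
    style="One word, then a one-sentence justification.",
)
print(build_prompt(spec))
```

The examples deliberately include the century years 1900 and 2000, which are the edge cases a simple "divisible by 4" rule gets wrong.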

Compiled by Daily Economic News from public information

