o1 (generative pre-trained transformer)

o1 is a generative pre-trained transformer released by OpenAI in September 2024. It spends time "thinking" before it answers, which makes it more effective at complex reasoning tasks, particularly in science and programming.[1]

o1
Developer(s): OpenAI
Initial release: September 12, 2024
Type: Generative pre-trained transformer
Website: openai.com/o1/

History

Background

According to leaked information, o1 was formerly known within OpenAI as "Q*", and later as "Strawberry".[2] The codename "Q*" first surfaced in November 2023, around the time of Sam Altman's ousting and subsequent reinstatement, with rumors suggesting that this experimental model had shown promising results on mathematical benchmarks.[3] In July 2024, Reuters reported that OpenAI was developing a generative pre-trained transformer known as "Strawberry".[2]

Release

"o1-preview" and "o1-mini" were released on September 12, 2024, for ChatGPT Plus and Team users.[1] GitHub started testing the integration of o1-preview in its Copilot service the same day.[4]

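OpenAI noted that o1 is the first of a series of "reasoning" models, and that it planned to make o1-mini available to all free ChatGPT users. In the API, o1-preview is several times more expensive than GPT-4o: $15 per million input tokens and $60 per million output tokens, compared with $5 and $15 for GPT-4o.[5]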

Capabilities

According to OpenAI, o1 was trained with a new optimization algorithm on a new dataset tailored specifically to it. The training uses reinforcement learning.[5]

o1 spends additional time thinking before generating an answer, which makes it more effective at complex reasoning tasks, particularly in science and programming.[1] Unlike the earlier GPT-4o model, o1 has been trained to generate long "chains of thought" before returning a final answer.[6]

o1-mini is faster and 80% cheaper than o1-preview. It is particularly suitable for programming and STEM-related tasks, but does not have the same "broad world knowledge" as o1-preview.[7]
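
The "80% cheaper" figure can be read as o1-mini costing one fifth of the o1-preview price. The following is a minimal sketch of that arithmetic; the function name `discounted_price` and the base price of 10.0 are hypothetical placeholders chosen for illustration, not figures published by OpenAI.

```python
def discounted_price(base_price: float, percent_cheaper: float) -> float:
    """Return the price after a relative reduction of `percent_cheaper` percent."""
    if not 0 <= percent_cheaper <= 100:
        raise ValueError("percent_cheaper must be between 0 and 100")
    return base_price * (1 - percent_cheaper / 100)


# Hypothetical o1-preview price per million tokens (placeholder value only).
PREVIEW_PRICE = 10.0

# "80% cheaper" means paying the remaining 20% of the reference price.
mini_price = discounted_price(PREVIEW_PRICE, 80)
print(f"o1-mini price under this assumption: {mini_price:.2f}")
```

Under this reading, any o1-preview price maps to an o1-mini price five times smaller, regardless of the actual rates.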

OpenAI noted that o1's reasoning capabilities make it better at adhering to safety rules provided in the prompt's context window. OpenAI also reported that during a test, one instance of o1-preview exploited a misconfiguration to succeed at a task that a bug should have made infeasible.[8][9]

References

  1. Metz, Cade (September 12, 2024). "OpenAI Unveils New ChatGPT That Can Reason Through Math and Science". The New York Times. Retrieved September 12, 2024.
  2. Tong, Anna; Paul, Katie (July 15, 2024). "Exclusive: OpenAI working on new reasoning technology under code name 'Strawberry'". Reuters. Retrieved September 12, 2024.
  3. "OpenAI researchers warned board of AI breakthrough ahead of CEO ouster, sources say". Reuters. November 23, 2023.
  4. Peters, Jay (September 12, 2024). "GitHub has started testing OpenAI's o1-preview in GitHub Copilot". The Verge. Retrieved September 12, 2024.
  5. Robison, Kylie (September 12, 2024). "OpenAI releases o1, its first model with 'reasoning' abilities". The Verge. Retrieved September 15, 2024.
  6. "Learning to Reason with LLMs". OpenAI. Archived from the original on September 12, 2024. Retrieved September 13, 2024.
  7. "OpenAI o1-mini". OpenAI. September 12, 2024.
  8. Coombes, Lloyd (September 13, 2024). "OpenAI's new ChatGPT o1 model 'cheated' on an impossible test — here's what happened". Tom's Guide. Retrieved September 15, 2024.
  9. "OpenAI o1 System Card" (PDF). OpenAI. September 12, 2024. pp. 16–17.