A request that this article title be changed to OpenAI o1 is under discussion. Please do not move this article until the discussion is closed. |
o1 is a generative pre-trained transformer released by OpenAI in September 2024. o1 spends time thinking before it answers, making it more efficient in complex reasoning tasks, science and programming.[1]
Developer(s) | OpenAI |
---|---|
Initial release | September 12, 2024 |
Type | Generative pre-trained transformer |
Website | openai |
History
editBackground
editAccording to leaked information, o1 was formerly known within OpenAI as "Q*", and later as "Strawberry".[2] The codename "Q*" first surfaced in November 2023, around the time of Sam Altman's ousting and subsequent reinstatement, with rumors suggesting that this experimental model had shown promising results on mathematical benchmarks.[3] In July 2024, Reuters reported that OpenAI was developing a generative pre-trained transformer known as "Strawberry".[2]
Release
edit"o1-preview" and "o1-mini" were released on September 12, 2024, for ChatGPT Plus and Team users.[1] GitHub started testing the integration of o1-preview in its Copilot service the same day.[4]
OpenAI noted that o1 is the first of a series of "reasoning" models, and that it was planning to add access to o1-mini to all ChatGPT free users. o1-preview is several times more expensive than GPT-4o.[5]
Capabilities
editAccording to OpenAI, o1 is trained on a new, specifically tailored training dataset and with a new optimization algorithm. The training leverages reinforcement learning.[5]
o1 spends additional time thinking before generating an answer, which makes it more effective for complex reasoning tasks, particularly in science and programming.[1] Compared to the previous GPT-4o model, the o1 model has been trained to generate long "chains of thought" before returning a final answer.[6]
o1-mini is faster and 80% cheaper than o1-preview. It is particularly suitable for programming and STEM-related tasks, but does not have the same "broad world knowledge" as o1-preview.[7]
OpenAI noted that o1's reasoning capabilities makes it better at applying the safety rules provided in the prompt's context window. OpenAI also reported that during a test, one instance of o1-preview exploited a misconfiguration to succeed at a task that should have been infeasible due to a bug.[8][9]
References
edit- ^ a b c Metz, Cade (September 12, 2024). "OpenAI Unveils New ChatGPT That Can Reason Through Math and Science". The New York Times. Retrieved September 12, 2024.
- ^ a b Tong, Anna; Paul, Katie (July 15, 2024). "Exclusive: OpenAI working on new reasoning technology under code name 'Strawberry'". Reuters. Retrieved September 12, 2024.
- ^ "OpenAI researchers warned board of AI breakthrough ahead of CEO ouster, sources say". Reuters. November 23, 2023.
- ^ Peters, Jay (September 12, 2024). "GitHub has started testing OpenAI's o1-preview in GitHub Copilot". The Verge. Retrieved September 12, 2024.
- ^ a b Robison, Kylie (2024-09-12). "OpenAI releases o1, its first model with 'reasoning' abilities". The Verge. Retrieved 2024-09-15.
- ^ "Learning to Reason with LLMs". OpenAI. Archived from the original on September 12, 2024. Retrieved September 13, 2024.
- ^ "OpenAI o1-mini". OpenAI. September 12, 2024.
- ^ Coombes, Lloyd (2024-09-13). "OpenAI's new ChatGPT o1 model 'cheated' on an impossible test — here's what happened". Tom's Guide. Retrieved 2024-09-15.
- ^ "OpenAI o1 System Card" (PDF). OpenAI. September 12, 2024. pp. 16–17.