o1 (generative pre-trained transformer)

o1
Developer(s)	OpenAI
Initial release	September 12, 2024; 3 days ago
Type	Generative pre-trained transformer
Website	openai.com/o1/

o1 is a generative pre-trained transformer released by OpenAI in September 2024. o1 spends time thinking before it answers, making it more efficient in complex reasoning tasks, science and programming.^[1]

History

Background

According to leaked information, o1 was formerly known within OpenAI as "Q*", and later as "Strawberry".^[2] The codename "Q*" first surfaced in November 2023, around the time of Sam Altman's ousting and subsequent reinstatement, with rumors suggesting that this experimental model had shown promising results on mathematical benchmarks.^[3] In July 2024, Reuters reported that OpenAI was developing a generative pre-trained transformer known as "Strawberry".^[2]

Release

"o1-preview" and "o1-mini" were released on September 12, 2024, for ChatGPT Plus and Team users.^[1] GitHub started testing the integration of o1-preview in its Copilot service the same day.^[4]

OpenAI noted that o1 is the first of a series of "reasoning" models, and that it was planning to add access to o1-mini to all ChatGPT free users. o1-preview is several times more expensive than GPT-4o.^[5]

Capabilities

According to OpenAI, o1 is trained on a new, specifically tailored training dataset and with a new optimization algorithm. The training leverages reinforcement learning.^[5]

o1 spends additional time thinking before generating an answer, which makes it more effective for complex reasoning tasks, particularly in science and programming.^[1] Compared to the previous GPT-4o model, the o1 model has been trained to generate long "chains of thought" before returning a final answer.^[6]

o1-mini is faster and 80% cheaper than o1-preview. It is particularly suitable for programming and STEM-related tasks, but does not have the same "broad world knowledge" as o1-preview.^[7]

OpenAI noted that o1's reasoning capabilities makes it better at applying the safety rules provided in the prompt's context window. OpenAI also reported that during a test, one instance of o1-preview exploited a misconfiguration to succeed at a task that should have been infeasible due to a bug.^[8]^[9]

References

^ ^a ^b ^c Metz, Cade (September 12, 2024). "OpenAI Unveils New ChatGPT That Can Reason Through Math and Science". The New York Times. Retrieved September 12, 2024.
^ ^a ^b Tong, Anna; Paul, Katie (July 15, 2024). "Exclusive: OpenAI working on new reasoning technology under code name 'Strawberry'". Reuters. Retrieved September 12, 2024.
^ "OpenAI researchers warned board of AI breakthrough ahead of CEO ouster, sources say". Reuters. November 23, 2023.
^ Peters, Jay (September 12, 2024). "GitHub has started testing OpenAI's o1-preview in GitHub Copilot". The Verge. Retrieved September 12, 2024.
^ ^a ^b Robison, Kylie (2024-09-12). "OpenAI releases o1, its first model with 'reasoning' abilities". The Verge. Retrieved 2024-09-15.
^ "Learning to Reason with LLMs". OpenAI. Archived from the original on September 12, 2024. Retrieved September 13, 2024.
^ "OpenAI o1-mini". OpenAI. September 12, 2024.
^ Coombes, Lloyd (2024-09-13). "OpenAI's new ChatGPT o1 model 'cheated' on an impossible test — here's what happened". Tom's Guide. Retrieved 2024-09-15.
^ "OpenAI o1 System Card" (PDF). OpenAI. September 12, 2024. pp. 16–17.

[NYTimesInfo-1] Metz, Cade (September 12, 2024). "OpenAI Unveils New ChatGPT That Can Reason Through Math and Science". The New York Times. Retrieved September 12, 2024.

[:0-2] Tong, Anna; Paul, Katie (July 15, 2024). "Exclusive: OpenAI working on new reasoning technology under code name 'Strawberry'". Reuters. Retrieved September 12, 2024.

[3] "OpenAI researchers warned board of AI breakthrough ahead of CEO ouster, sources say". Reuters. November 23, 2023.

[4] Peters, Jay (September 12, 2024). "GitHub has started testing OpenAI's o1-preview in GitHub Copilot". The Verge. Retrieved September 12, 2024.

[:1-5] Robison, Kylie (2024-09-12). "OpenAI releases o1, its first model with 'reasoning' abilities". The Verge. Retrieved 2024-09-15.

[6] "Learning to Reason with LLMs". OpenAI. Archived from the original on September 12, 2024. Retrieved September 13, 2024.

[7] "OpenAI o1-mini". OpenAI. September 12, 2024.

[8] Coombes, Lloyd (2024-09-13). "OpenAI's new ChatGPT o1 model 'cheated' on an impossible test — here's what happened". Tom's Guide. Retrieved 2024-09-15.

[9] "OpenAI o1 System Card" (PDF). OpenAI. September 12, 2024. pp. 16–17.

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]