In a significant move that could reshape the landscape of artificial intelligence development, OpenAI has announced an 80% price reduction for its flagship reasoning large language model (LLM), o3. The cut applies to both input and output tokens, making the model far more accessible to developers eager to harness advanced reasoning capabilities in their projects.

To clarify, tokens are the numeric IDs that large language models use to represent pieces of content: words, word fragments, mathematical notation, and even code snippets. These tokens serve as the fundamental building blocks of the model’s understanding and response generation. OpenAI, like many other LLM providers, offers its models through application programming interfaces (APIs) that developers can use to build applications or integrate external tools, typically charging by the number of tokens processed.
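As a toy illustration only (real LLM tokenizers such as OpenAI's use byte-pair encoding over learned vocabularies of roughly 100,000 entries, not a word-level dictionary), the mapping from text to token IDs can be sketched like this:

```python
# Toy sketch of tokenization. The vocabulary below is invented for
# illustration; production tokenizers split text into subword pieces
# learned from data, not whole whitespace-separated words.
toy_vocab = {"the": 1, "price": 2, "of": 3, "o3": 4, "dropped": 5}

def toy_encode(text):
    """Map whitespace-separated words to integer token IDs."""
    return [toy_vocab[word] for word in text.lower().split()]

print(toy_encode("The price of o3 dropped"))  # [1, 2, 3, 4, 5]
```

API billing is then a function of how many such IDs a request consumes and produces.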

The new pricing structure positions OpenAI’s o3 as a competitive option in the burgeoning AI marketplace, where cost and performance are increasingly important. This change is particularly noteworthy as it directly challenges rival models, such as Gemini 2.5 Pro from Google DeepMind, Claude Opus 4 from Anthropic, and the DeepSeek reasoning suite.

The announcement was made by Sam Altman, CEO of OpenAI, in a recent post on social media platform X. In his message, he expressed excitement about the new pricing, stating: “we dropped the price of o3 by 80%!! excited to see what people will do with it now. think you’ll also be happy with o3-pro pricing for the performance :)” This enthusiasm is shared by many in the developer community, as evidenced by posts celebrating the price drop.

Under the new pricing model, developers will now pay $2 for every million input tokens and $8 for every million output tokens. Additionally, input that is already stored in the prompt cache is billed at a further discount of $0.50 per million tokens, making repeated tasks more economical. To put this in perspective, the previous rates were $10 for input and $40 for output, representing a major shift in affordability.
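These per-token rates make request costs straightforward to estimate. A minimal sketch in Python, hard-coding the new rates (including the $0.50 cached-input rate listed in the comparison table):

```python
def o3_cost(input_tokens, output_tokens, cached_tokens=0):
    """Estimate an o3 API request's cost in dollars under the new rates."""
    INPUT_RATE = 2.00 / 1_000_000    # $2 per 1M uncached input tokens
    CACHED_RATE = 0.50 / 1_000_000   # $0.50 per 1M cached input tokens
    OUTPUT_RATE = 8.00 / 1_000_000   # $8 per 1M output tokens
    return ((input_tokens - cached_tokens) * INPUT_RATE
            + cached_tokens * CACHED_RATE
            + output_tokens * OUTPUT_RATE)

# A request with 50k input tokens (10k of them cached) and 5k output tokens:
print(f"${o3_cost(50_000, 5_000, cached_tokens=10_000):.3f}")  # $0.125
```

Under the old $10/$40 rates (ignoring caching), the same request would have cost roughly five times as much.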

Ray Fernando, a developer and early adopter of the model, encapsulated the excitement in a succinct post saying, “LFG!”—an abbreviation for “let’s f***ing go!”—which reflects the growing enthusiasm for utilizing AI without the burden of high access costs.

This pricing strategy emerges at a time when AI providers are fiercely competing not only on performance metrics but also on cost-effectiveness. A look at the pricing of leading AI reasoning models underscores the significance of OpenAI's decision:

  • Gemini 2.5 Pro: Developed by Google DeepMind, this model charges between $1.25 and $2.50 per million input tokens, depending on prompt size, and between $10 and $15 per million output tokens. It also offers integration with Google Search, though this comes at additional cost.
  • Claude Opus 4: Marketed by Anthropic, this model is priced at $15 per million input tokens and a steep $75 per million output tokens, making it one of the most expensive options available, though it does offer discounts for batch processing.
  • DeepSeek Models: These models, including DeepSeek-Reasoner and DeepSeek-Chat, are priced significantly lower, with input tokens costing between $0.07 and $0.55 per million depending on the model and on whether the request hits the prompt cache.

A comparative table illustrates these pricing figures:

| Model | Input Cost (per 1M tokens) | Cached Input | Output Cost (per 1M tokens) | Discount Notes |
|---|---|---|---|---|
| OpenAI o3 | $2.00 (down from $10.00) | $0.50 | $8.00 (down from $40.00) | Flex Processing: $5 / $20 |
| Gemini 2.5 Pro | $1.25 – $2.50 | $0.31 – $0.625 | $10.00 – $15.00 | Higher rate for prompts >200k tokens |
| Claude Opus 4 | $15.00 | $1.50 (read) / $18.75 (write) | $75.00 | 50% off with batch processing |
| DeepSeek-Chat | $0.27 (cache miss) | $0.07 (cache hit) | $1.10 | 50% off during off-peak hours |
| DeepSeek-Reasoner | $0.55 (cache miss) | $0.14 (cache hit) | $2.19 | 75% off during off-peak hours |
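To get a concrete sense of how these rates compare, the same workload can be priced across models. A rough sketch using the standard (non-cached, non-discounted) rates from the table, taking the lower figure where a model lists a range:

```python
# Per-million-token (input, output) rates in dollars, from the table above.
RATES = {
    "OpenAI o3": (2.00, 8.00),
    "Gemini 2.5 Pro": (1.25, 10.00),       # lower end of its range
    "Claude Opus 4": (15.00, 75.00),
    "DeepSeek-Reasoner": (0.55, 2.19),     # cache-miss input rate
}

def workload_cost(input_tokens, output_tokens):
    """Dollar cost of one workload at each model's standard rates."""
    return {
        model: (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000
        for model, (in_rate, out_rate) in RATES.items()
    }

# Example: a month of usage totaling 10M input and 2M output tokens.
for model, cost in workload_cost(10_000_000, 2_000_000).items():
    print(f"{model}: ${cost:.2f}")
```

On this hypothetical workload, o3 now lands in the same range as Gemini 2.5 Pro rather than an order of magnitude above the field.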

Moreover, a third-party research group, Artificial Analysis, conducted benchmarking tests on the new o3 model and found that it cost $390 to complete a full suite of tests, significantly lower than the $971 for Gemini 2.5 Pro and comparable to $342 for Claude 4 Sonnet. This highlights how OpenAI is effectively narrowing the cost versus intelligence gap for developers.

OpenAI’s pricing changes not only lower the financial barrier to high-performance models but are also accompanied by a Flex processing option, priced at $5 per million input tokens and $20 per million output tokens. This flexibility enables developers to optimize their compute costs based on the type of workload.

Currently, the o3 model can be accessed through the OpenAI API and Playground, opening doors for startups, research teams, and individual developers who previously faced prohibitive costs for utilizing advanced AI technologies. By lowering the price of its most sophisticated reasoning model, OpenAI is signaling a positive trend in the generative AI sector: premium performance is becoming more accessible, empowering developers with a range of economically scalable options.