On Tuesday, OpenAI announced that o3-pro, a new version of its most capable simulated reasoning model, is now available to ChatGPT Pro and Team users, replacing o1-pro in the model picker. The company also reduced API pricing for o3-pro by 87 percent compared to o1-pro while cutting o3 prices by 80 percent. While “reasoning” is useful for some analytical tasks, new studies have posed fundamental questions about what the word actually means when applied to these AI systems.
We’ll take a deeper look at “reasoning” in a minute, but first, let’s examine what’s new. While OpenAI originally launched o3 (non-pro) in April, the o3-pro model focuses on mathematics, science, and coding while adding new capabilities like web search, file analysis, image analysis, and Python execution. Since these tool integrations slow response times (longer than the already slow o1-pro), OpenAI recommends using the model for complex problems where accuracy matters more than speed. However, simulated reasoning models don’t necessarily confabulate less than “non-reasoning” AI models (they still introduce factual errors), which is a significant caveat when seeking accurate results.
Beyond the reported performance improvements, OpenAI announced a substantial price reduction for developers. O3-pro costs $20 per million input tokens and $80 per million output tokens in the API, making it 87 percent cheaper than o1-pro. The company also reduced the price of the standard o3 model by 80 percent.
These reductions address one of the main concerns with reasoning models: their high cost compared to standard models. The original o1 cost $15 per million input tokens and $60 per million output tokens, while o3-mini cost $1.10 per million input tokens and $4.40 per million output tokens.
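To make the per-million-token prices concrete, here is a minimal Python sketch of the arithmetic. The price table uses only the figures quoted above; the function names and the example token counts are hypothetical and chosen purely for illustration. Because the 87 percent figure is rounded, the pre-cut price it implies is only an approximation, not a quoted rate.

```python
# Per-million-token API prices in USD, as quoted in the article:
# (input price, output price).
PRICES = {
    "o3-pro": (20.00, 80.00),
    "o1": (15.00, 60.00),
    "o3-mini": (1.10, 4.40),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the quoted rates."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def implied_previous_price(new_price: float, percent_cut: float) -> float:
    """Back out the approximate pre-cut price from a rounded percentage cut."""
    return new_price / (1 - percent_cut / 100)


if __name__ == "__main__":
    # Hypothetical request size: 10,000 input tokens and 2,000 output tokens.
    for model in PRICES:
        print(f"{model}: ${request_cost(model, 10_000, 2_000):.4f}")

    # Approximate o1-pro input price implied by "87 percent cheaper".
    print(f"implied earlier input price: ~${implied_previous_price(20.00, 87):.0f}")
```

Reasoning models tend to spend many output tokens on their chain of thought, so the output rate usually dominates the bill for this kind of request.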
Why use o3-pro?
Unlike general-purpose models like GPT-4o that prioritize speed, broad knowledge, and making users feel good about themselves, o3-pro uses a chain-of-thought simulated reasoning process to devote more output tokens toward working through complex problems, making it generally better for technical challenges that require deeper analysis. But it’s still not perfect.