Benchmarks
Top models across a combined benchmark plus Artificial Analysis, LMArena, LiveBench, FrontierCode, Epoch AI, ARC Prize, EQ-Bench, Design Arena, and NanoGPT benchmark categories.
Combined
Equal-weight blend of the Artificial Analysis Intelligence Index, LMArena Overall, LiveBench Overall, and NanoGPT Usage Share. Each source is min-max normalized to 0-100 across its current leaderboard and weighted at 25%. A model that is missing from a source, or whose entry is unavailable, scores 0 for that source.
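The blend described above can be sketched in a few lines of Python. This is an illustration only, not the site's actual implementation: the source keys, the function names, and the handling of a leaderboard where every score is identical are all assumptions.

```python
from typing import Dict

# Hypothetical keys for the four sources named in the description.
SOURCES = ("artificial_analysis", "lmarena", "livebench", "nanogpt_usage")


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale one leaderboard's raw scores to the range 0-100."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # Assumption: if every model has the same raw score, give all of them 100.
        return {model: 100.0 for model in scores}
    return {model: 100.0 * (value - lo) / (hi - lo) for model, value in scores.items()}


def combined_scores(leaderboards: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Blend the normalized sources with equal weights (25% each for four sources).

    A model absent from a source contributes 0 for that source.
    """
    weight = 1.0 / len(SOURCES)
    normalized = {src: min_max_normalize(leaderboards.get(src, {})) for src in SOURCES}
    models = set().union(*(table.keys() for table in normalized.values()))
    return {
        model: sum(weight * normalized[src].get(model, 0.0) for src in SOURCES)
        for model in models
    }


# Made-up raw scores for three hypothetical models.
example = {
    "artificial_analysis": {"model-a": 60.0, "model-b": 40.0, "model-c": 50.0},
    "lmarena": {"model-a": 1400.0, "model-b": 1300.0},
    "livebench": {"model-a": 70.0, "model-b": 80.0, "model-c": 75.0},
    "nanogpt_usage": {"model-a": 0.10, "model-b": 0.30},
}
result = combined_scores(example)
```

In this made-up example, model-c appears in only two of the four sources, so half of its possible weight is filled with zeros. That is why a model missing from several leaderboards cannot reach a high combined score under this scheme.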
Top 20 price vs performance
X-axis: $/M blended tokens
Best value frontier: the models for which no cheaper model has a better benchmark result.
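Read literally, the best value frontier keeps every model for which no strictly cheaper model scores higher. A minimal sketch of that rule follows. The `Model` type, its field names, and the sample prices and scores are hypothetical, and the page may compute its frontier differently (for example in how it treats ties).

```python
from typing import List, NamedTuple


class Model(NamedTuple):
    name: str     # hypothetical label
    price: float  # $/M blended tokens
    score: float  # benchmark result, higher is better


def best_value_frontier(models: List[Model]) -> List[Model]:
    """Keep each model that is not beaten by a strictly cheaper model."""
    return [
        m for m in models
        if not any(o.price < m.price and o.score > m.score for o in models)
    ]


# Made-up data for illustration only.
sample = [
    Model("a", 1.0, 20.0),
    Model("b", 2.0, 15.0),   # dropped: "a" is cheaper and scores higher
    Model("c", 3.0, 40.0),
    Model("d", 10.0, 60.0),
    Model("e", 12.0, 55.0),  # dropped: "d" is cheaper and scores higher
]
frontier = best_value_frontier(sample)
```

The pairwise comparison is quadratic in the number of models, which is fine for a top-20 chart. Sorting by price and tracking the running best score would do the same job in a single pass.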
1. by OpenAI: 63.7%
2. by Anthropic: 59.4%
3. by Gemini: 56.2%
4. by Anthropic: 52.4%
5. by OpenAI: 48.7%
6. by Anthropic: 46.4%
7. by Gemini: 29.6%
8. by Qwen: 28.4%
9. 25.0%
10. 17.4%
Weighted blend of latest source snapshots
NanoGPT Composite