Benchmarks
Top models across a combined benchmark plus Artificial Analysis, LMArena, LiveBench, FrontierCode, Epoch AI, ARC Prize, EQ-Bench, Design Arena, and NanoGPT benchmark categories.
Combined
Equal-weight blend of four sources: the Artificial Analysis Intelligence Index, LMArena Overall, LiveBench Overall, and NanoGPT Usage Share. Each source is min-max normalized to 0-100 across its current leaderboard and weighted at 25%. A model that is missing or unavailable in a source gets 0 for that source.
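The blend described above can be sketched roughly as follows. This is a minimal illustration, not the site's actual implementation: the function names and the input shape (one dict of raw scores per source) are assumptions, and so is the handling of a leaderboard where every model has the same raw score.

```python
from typing import Dict

WEIGHT = 0.25  # four sources, equal weight, as stated in the description


def min_max_normalize(board: Dict[str, float]) -> Dict[str, float]:
    """Rescale one source's raw scores to 0-100 across its own leaderboard."""
    lo, hi = min(board.values()), max(board.values())
    if hi == lo:
        # Assumption: a degenerate leaderboard maps every model to 100.
        return {model: 100.0 for model in board}
    return {model: 100.0 * (raw - lo) / (hi - lo) for model, raw in board.items()}


def combined_scores(leaderboards: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Blend normalized sources; a model absent from a source contributes 0 there."""
    normalized = {
        source: min_max_normalize(board)
        for source, board in leaderboards.items()
        if board
    }
    models = sorted({m for board in leaderboards.values() for m in board})
    return {
        model: sum(
            WEIGHT * normalized.get(source, {}).get(model, 0.0)
            for source in leaderboards
        )
        for model in models
    }
```

Because missing entries count as 0 rather than being skipped, a model listed on only one of the four sources can score at most 25 under this scheme, which is worth keeping in mind when reading the combined percentages.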
Top 20 price vs performance
X-axis: $/M blended tokens
Best value frontier: No cheaper model has a better benchmark result.
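The frontier rule stated above ("no cheaper model has a better benchmark result") can be checked directly. The sketch below is illustrative only: the `Model` type, its field names, and the treatment of ties (equal price or equal score does not knock a model off the frontier) are assumptions, not details taken from the page.

```python
from typing import List, NamedTuple


class Model(NamedTuple):
    # Hypothetical record type for illustration.
    name: str
    price: float  # $/M blended tokens
    score: float  # benchmark result, higher is better


def best_value_frontier(models: List[Model]) -> List[Model]:
    """Keep each model for which no strictly cheaper model scores strictly higher."""
    return [
        m
        for m in models
        if not any(o.price < m.price and o.score > m.score for o in models)
    ]
```

This compares every pair of models, which is quadratic in the number of models but perfectly adequate for a top-20 chart.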
1. by Anthropic: 84.5%
2. by Anthropic: 63.4%
3. by OpenAI: 55.9%
4. by Anthropic: 50.2%
5. by Zhipu: 49.5%
6. by Gemini: 44.4%
7. by OpenAI: 44.0%
8. by Anthropic: 36.6%
9. by Gemini: 28.5%
10. by Anthropic: 27.5%
Weighted blend of latest source snapshots
NanoGPT Composite