Benchmarks

Top models across a combined benchmark plus Artificial Analysis, LMArena, LiveBench, FrontierCode, Epoch AI, ARC Prize, EQ-Bench, Design Arena, and NanoGPT benchmark categories.

Combined

Equal-weight blend of the Artificial Analysis Intelligence Index, LMArena Overall, LiveBench Overall, and NanoGPT Usage Share. Each source is min-max normalized to 0-100 across its current leaderboard and weighted at 25%. A model that is missing from a source, or whose entry is unavailable, scores 0 for that source.
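The blend described above can be sketched in a few lines of Python. This is a minimal illustration, not the site's actual implementation. The function names, the source keys, the model names, and all numbers are hypothetical. The handling of a leaderboard where every model has the same score is also an assumption, since the description does not cover that case.

```python
from typing import Dict


def min_max_normalize(board: Dict[str, float]) -> Dict[str, float]:
    """Rescale one leaderboard so its lowest score maps to 0 and its highest to 100."""
    lo, hi = min(board.values()), max(board.values())
    if hi == lo:
        # Assumption: a leaderboard with identical scores gives every model 100.
        return {model: 100.0 for model in board}
    return {model: 100.0 * (score - lo) / (hi - lo) for model, score in board.items()}


def combined_scores(sources: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Equal-weight blend of normalized sources; a missing entry contributes 0."""
    weight = 1.0 / len(sources)  # 0.25 when there are four sources
    normalized = {name: min_max_normalize(board) for name, board in sources.items() if board}
    models = set()
    for board in sources.values():
        models.update(board)
    return {
        model: sum(weight * normalized.get(name, {}).get(model, 0.0) for name in sources)
        for model in models
    }


# Hypothetical leaderboards; the numbers are made up for illustration only.
example_sources = {
    "aa_index": {"model_a": 60.0, "model_b": 50.0, "model_c": 40.0},
    "lmarena": {"model_a": 1500.0, "model_b": 1400.0},  # model_c is missing here
    "livebench": {"model_a": 80.0, "model_b": 70.0, "model_c": 60.0},
    "usage_share": {"model_a": 10.0, "model_b": 30.0, "model_c": 20.0},
}

if __name__ == "__main__":
    for model, score in sorted(combined_scores(example_sources).items()):
        print(f"{model}: {score:.1f}%")
```

Because each source is normalized across its own leaderboard, the combined percentage is a relative position, not an absolute accuracy. A model that leads three of the four sources but is last in the fourth tops out at 75%.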

Top 20 price vs performance

X-axis: price in $ per million blended tokens

Best value frontier: models for which no cheaper model has a better benchmark result.
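The frontier rule can be read as a simple dominance check: a model is dropped only if some strictly cheaper model scores strictly higher. The sketch below applies that literal reading. The `Model` class, the function name, and the example prices and scores are all hypothetical. Ties are kept on the frontier here, which is an assumption about how the page handles them.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Model:
    name: str
    price: float  # $ per million blended tokens
    score: float  # benchmark result in percent


def best_value_frontier(models: List[Model]) -> List[str]:
    """Return names of models that no strictly cheaper model outscores."""
    frontier = []
    for candidate in models:
        beaten = any(
            other.price < candidate.price and other.score > candidate.score
            for other in models
        )
        if not beaten:
            frontier.append(candidate.name)
    return frontier


# Hypothetical prices and scores, for illustration only.
example_models = [
    Model("model_a", price=10.0, score=80.0),
    Model("model_b", price=5.0, score=60.0),
    Model("model_c", price=8.0, score=55.0),  # model_b is cheaper and scores higher
    Model("model_d", price=1.0, score=40.0),
]

if __name__ == "__main__":
    print(best_value_frontier(example_models))
```

The pairwise comparison is quadratic in the number of models, which is fine for a top-20 chart. For larger sets, sorting by price and tracking the best score seen so far gives the same result in one pass.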

1. Claude Fable 5, by Anthropic: 84.5% (Best value)
2. Claude Opus 4.8, by Anthropic: 63.4% (Best value)
3. GPT 5.5, by OpenAI: 55.9%
4. Claude 4.7 Opus, by Anthropic: 50.2%
5. GLM 5.2, by Zhipu: 49.5% (Best value)
6. 44.4%
7. GPT 5.4, by OpenAI: 44.0%
8. Claude 4.6 Opus, by Anthropic: 36.6%
9. 28.5%
10. Claude Sonnet 5, by Anthropic: 27.5%

Weighted blend of latest source snapshots

NanoGPT Composite