Research-focused model that performs deep, multi-step reasoning on complex research tasks. Best suited to comprehensive analysis and investigation.
Added Mar 15, 2026
Context Window
200K tokens
Max Output
100K tokens
Input Price (Auto)
$2.20 per 1M tokens
Output Price (Auto)
$8.80 per 1M tokens
Cache Read (Auto)
$1.10 per 1M tokens
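The per-token prices above can be turned into a rough per-request cost estimate. The following is a minimal sketch, not a billing reference: the function name `estimate_cost_usd` and its parameters are hypothetical, and it assumes that cached tokens are a subset of the input tokens and are billed at the cache-read rate instead of the input rate. Actual billing rules under auto routing may differ.

```python
# Prices in USD per 1M tokens, copied from the listing above.
INPUT_PRICE_PER_M = 2.20
OUTPUT_PRICE_PER_M = 8.80
CACHE_READ_PRICE_PER_M = 1.10


def estimate_cost_usd(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the cost of one request in USD.

    Assumption (not stated in the listing): cached_tokens is the portion of
    input_tokens read from cache, billed at the cache-read rate instead of
    the regular input rate.
    """
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")

    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * INPUT_PRICE_PER_M
        + cached_tokens * CACHE_READ_PRICE_PER_M
        + output_tokens * OUTPUT_PRICE_PER_M
    )
    return total / 1_000_000


if __name__ == "__main__":
    # 50,000 input tokens (20,000 of them cached) and 4,000 output tokens.
    print(f"${estimate_cost_usd(50_000, 4_000, cached_tokens=20_000):.4f}")  # prints $0.1232
```

In this example the 30,000 uncached input tokens cost $0.0660, the 20,000 cached tokens cost $0.0220, and the 4,000 output tokens cost $0.0352.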
Capabilities
Performance metrics and benchmarks
Sourced from Artificial Analysis.
Intelligence Index
33.1
Auto routing is available for this model. Explicit provider selection is not available.
Coding Index
25.6
GPQA Diamond
Graduate-level scientific reasoning
78.4%
Better than 79% of models compared
HLE
Humanity's Last Exam
17.5%
Better than 86% of models compared
IFBench
Instruction-following benchmark
68.7%
Better than 87% of models compared
T²-Bench Telecom
Conversational AI agents in dual-control scenarios
55.6%
Better than 63% of models compared
AA-LCR
Long context reasoning evaluation
55.0%
Better than 73% of models compared
SciCode
Python programming for scientific computing
46.5%
Better than 94% of models compared
Terminal-Bench Hard
Agentic coding and terminal use
15.2%
Better than 59% of models compared
AIME 2025
American Invitational Mathematics Examination 2025
90.7%
Better than 91% of models compared
AIME
American Invitational Mathematics Examination
94.0%
Better than 98% of models compared
MMLU-Pro
Professional and academic subject knowledge
83.2%
Better than 84% of models compared
Last updated May 15, 2026
LiveCodeBench
Contamination-free coding benchmark
85.9%
Better than 96% of models compared
Math-500
Diverse mathematical problem solving benchmark
98.9%
Better than 97% of models compared