Small model optimized for advanced natural language processing tasks like dialogue generation, reasoning, and summarization
Context Window
131.1K tokens
Max Output
8.2K tokens
Input Price (Auto)
$0.031/1M
Output Price (Auto)
$0.049/1M
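As a rough illustration of what the listed rates mean per request, here is a minimal Python sketch. It assumes billing is strictly linear in token counts at the "Auto" prices shown above, with no minimum charge, caching discount, or provider surcharge. The helper name `estimate_cost_usd` is hypothetical and not part of any API.

```python
# Listed "Auto" prices from this card, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.031
OUTPUT_PRICE_PER_M = 0.049


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: linear cost estimate from token counts.

    Assumes cost scales linearly with tokens at the listed rates;
    actual billing may differ.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Example: a 100,000-token prompt with a 2,000-token reply.
cost = estimate_cost_usd(100_000, 2_000)
print(f"${cost:.6f}")  # prints $0.003198
```

Under these assumptions, even a prompt that fills most of the context window costs well under one cent, and the bill is dominated by input tokens unless replies are long.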
Capabilities
Performance metrics and benchmarks
Sourced from Artificial Analysis.
Intelligence Index
9.7
Auto routing is available for this model. Explicit provider selection is not available.
GPQA Diamond
Graduate-level scientific reasoning
25.5%
Better than 4% of models compared
HLE
Humanity's Last Exam
5.2%
Better than 47% of models compared
IFBench
Instruction-following benchmark
26.2%
Better than 11% of models compared
τ²-Bench Telecom
Conversational AI agents in dual-control scenarios
21.1%
Better than 24% of models compared
AA-LCR
Long context reasoning evaluation
2.0%
Better than 16% of models compared
SciCode
Python programming for scientific computing
5.2%
Better than 6% of models compared
LiveCodeBench
Contamination-free coding benchmark
8.3%
Better than 6% of models compared
AIME 2025
American Invitational Mathematics Examination 2025
3.3%
Better than 5% of models compared
AIME
American Invitational Mathematics Examination
6.7%
Better than 24% of models compared
MMLU-Pro
Professional and academic subject knowledge
34.7%
Better than 5% of models compared
Last updated May 15, 2026
MATH-500
Diverse mathematical problem solving benchmark
48.9%
Better than 13% of models compared