Strong multilingual support and solid performance in mathematics and coding; also suited to roleplay and chatbot use.
Added Jul 3, 2025
Context Window
131.1K
Max Output
8.2K
Input Price (Auto)
$0.37/1M
Output Price (Auto)
$0.43/1M
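As a rough illustration of how the per-million-token prices above translate into a per-request cost, here is a minimal sketch. The function name `estimate_cost_usd` and the example token counts are hypothetical, and the rates are simply the Auto prices listed above; the price actually billed may differ by provider.

```python
# Listed Auto prices, in USD per 1,000,000 tokens (taken from the listing above).
INPUT_PRICE_PER_M = 0.37
OUTPUT_PRICE_PER_M = 0.43


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts.

    Hypothetical helper: assumes billing is strictly linear in tokens,
    with no minimum charge, caching discount, or rounding.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens * INPUT_PRICE_PER_M / 1_000_000
    output_cost = output_tokens * OUTPUT_PRICE_PER_M / 1_000_000
    return input_cost + output_cost


if __name__ == "__main__":
    # Example: a 100,000-token prompt with a 2,000-token reply.
    cost = estimate_cost_usd(100_000, 2_000)
    print(f"Estimated cost: ${cost:.5f}")
```

Because the output rate is only slightly higher than the input rate, long prompts tend to dominate the total for this model.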
Performance metrics and benchmarks
Sourced from Artificial Analysis.
Intelligence Index
16.3
GPQA Diamond
Graduate-level scientific reasoning
58.7%
Better than 45% of models compared
HLE
Humanity's Last Exam
4.5%
Better than 29% of models compared
CritPt
Research-level physics reasoning
0.0%
Better than 36% of models compared
SciCode
Python programming for scientific computing
33.7%
Better than 61% of models compared
LiveCodeBench
Contamination-free coding benchmark
35.9%
Better than 43% of models compared
AIME
American Invitational Mathematics Examination
23.3%
Better than 50% of models compared
Math-500
Diverse mathematical problem solving benchmark
83.5%
Better than 50% of models compared
MMLU-Pro
Professional and academic subject knowledge
76.2%
Better than 53% of models compared
AA-Omniscience Accuracy
Proportion of correctly answered questions
17.6%
Better than 46% of models compared
Last updated May 15, 2026
AA-Omniscience Hallucination Rate
Rate of incorrect answers among non-correct responses
84.7%
Better than 41% of models compared
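The hallucination rate is defined relative to non-correct responses, not to all questions, so it can be combined with the accuracy figure to estimate an overall breakdown. The sketch below assumes every response falls into exactly one of three buckets (correct, incorrect, or abstained); that three-way split and the function name `implied_breakdown` are assumptions made for illustration and are not confirmed by the listing.

```python
def implied_breakdown(accuracy: float, hallucination_rate: float) -> dict:
    """Split all responses into correct / incorrect / abstained shares.

    Assumes (hypothetically) that each response is exactly one of these
    three, and that hallucination_rate = incorrect / (incorrect + abstained).
    Both inputs are fractions in [0, 1].
    """
    if not (0.0 <= accuracy <= 1.0 and 0.0 <= hallucination_rate <= 1.0):
        raise ValueError("inputs must be fractions between 0 and 1")
    non_correct = 1.0 - accuracy
    incorrect = hallucination_rate * non_correct
    abstained = non_correct - incorrect
    return {"correct": accuracy, "incorrect": incorrect, "abstained": abstained}


if __name__ == "__main__":
    # Figures listed above: 17.6% accuracy, 84.7% hallucination rate.
    shares = implied_breakdown(0.176, 0.847)
    for name, share in shares.items():
        print(f"{name}: {share:.1%}")
```

Under that assumption, the listed figures would imply that roughly 70% of all questions received an incorrect answer and about 13% were left unanswered.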