Gemini 3.1 Pro preview is built for tasks where simple answers are not enough. It delivers stronger core reasoning for complex coding, math, and long-context workflows, along with multimodal support and a reported 77.1% verified score on ARC-AGI-2. NOTE: requests whose inputs exceed 200k tokens are billed at 2x the input rate and 1.5x the output rate.
Added Feb 19, 2026
Context Window: 1.0M tokens
Max Output: 65.5K tokens
Input Price (Auto): $2.00/1M tokens
Output Price (Auto): $12.00/1M tokens
Cache Read (Auto): $0.20/1M tokens
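As a quick illustration of how the listed rates combine with the long-context surcharge, here is a minimal cost-estimate sketch. It assumes the surcharge (2x input, 1.5x output) applies to the entire request once the input exceeds 200k tokens; the function and constant names are illustrative, not part of any official SDK.

```python
# Hypothetical per-request cost estimate for Gemini 3.1 Pro preview,
# using the listed auto-routing rates. Assumption: once input exceeds
# 200k tokens, the 2x input / 1.5x output multipliers apply to the
# whole request, not just the tokens above the threshold.

INPUT_RATE = 2.00 / 1_000_000    # $ per input token
OUTPUT_RATE = 12.00 / 1_000_000  # $ per output token
LONG_CONTEXT_THRESHOLD = 200_000  # tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in dollars for one request."""
    input_mult, output_mult = 1.0, 1.0
    if input_tokens > LONG_CONTEXT_THRESHOLD:
        input_mult, output_mult = 2.0, 1.5
    return (input_tokens * INPUT_RATE * input_mult
            + output_tokens * OUTPUT_RATE * output_mult)

# A 300k-token prompt with a 10k-token response:
#   input:  300_000 * $2/1M  * 2.0 = $1.20
#   output:  10_000 * $12/1M * 1.5 = $0.18
print(f"${estimate_cost(300_000, 10_000):.2f}")  # $1.38
```

Cached input ($0.20/1M) is omitted for simplicity; a fuller estimate would price cached and uncached input tokens separately.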
Capabilities
Performance metrics and benchmarks
Sourced from Artificial Analysis.
Intelligence Index: 57.2
Auto routing is available for this model. Explicit provider selection is not available.
Coding Index: 55.5
Agentic Index: 59.1
GPQA Diamond (graduate-level scientific reasoning): 94.1% (better than 99% of models compared)
HLE (Humanity's Last Exam): 44.7% (better than 99% of models compared)
IFBench (instruction-following benchmark): 77.1% (better than 97% of models compared)
τ²-Bench Telecom (conversational AI agents in dual-control scenarios): 95.6% (better than 96% of models compared)
AA-LCR (long-context reasoning evaluation): 72.7% (better than 97% of models compared)
GDPval-AA (economically valuable tasks): 40.7% (better than 86% of models compared)
CritPt (research-level physics reasoning): 17.7% (better than 98% of models compared)
SciCode (Python programming for scientific computing): 58.9% (better than 99% of models compared)
Terminal-Bench Hard (agentic coding and terminal use): 53.8% (better than 98% of models compared)
AA-Omniscience Accuracy (proportion of correctly answered questions): 55.2% (better than 99% of models compared)
AA-Omniscience Hallucination Rate (rate of incorrect answers among non-correct responses): 49.9% (better than 89% of models compared)
Last updated May 11, 2026