Llama 4 Scout, a model with 17 billion active parameters and 16 experts, is the best multimodal model in the world in its class and is more powerful than all previous-generation Llama models, while fitting on a single H100 GPU. Additionally, Llama 4 Scout offers an industry-leading context window of 10M tokens and delivers better results than Gemma 3, Gemini 2.0 Flash-Lite, and Mistral 3.1 across a broad range of widely reported benchmarks.
Added Sep 5, 2025
Context Window
328.0K
Max Output
65.5K
Input Price (Auto)
$0.085/1M
Output Price (Auto)
$0.46/1M
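As a rough illustration of how the per-token prices above translate into a per-request cost, here is a minimal sketch. It assumes simple linear billing on the listed auto-routing prices, with no caching discounts, minimum charges, or provider-specific surcharges. The helper name `estimate_cost_usd` and the token counts in the example are hypothetical and are not part of any provider API.

```python
# Prices copied from the listing above, in USD per 1M tokens (auto routing).
INPUT_PRICE_PER_M = 0.085
OUTPUT_PRICE_PER_M = 0.46


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: linear cost estimate, assuming no discounts or fees."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Example: a 100,000-token prompt with a 2,000-token completion.
cost = estimate_cost_usd(100_000, 2_000)
print(f"${cost:.5f}")  # prints "$0.00942"
```

Under these assumptions, output tokens cost a little over five times as much as input tokens, so long completions tend to dominate the bill more quickly than long prompts.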
Performance metrics and benchmarks
Sourced from Artificial Analysis.
Intelligence Index
13.5
Auto routing is available for this model. Explicit provider selection is not available.
Coding Index
6.7
GPQA Diamond
Graduate-level scientific reasoning
58.7%
Better than 45% of models compared
HLE
Humanity's Last Exam
4.3%
Better than 23% of models compared
IFBench
Instruction-following benchmark
39.5%
Better than 45% of models compared
T²-Bench Telecom
Conversational AI agents in dual-control scenarios
15.5%
Better than 16% of models compared
AA-LCR
Long context reasoning evaluation
25.8%
Better than 45% of models compared
SciCode
Python programming for scientific computing
17.0%
Better than 19% of models compared
Terminal-Bench Hard
Agentic coding and terminal use
1.5%
Better than 19% of models compared
AIME 2025
American Invitational Mathematics Examination 2025
14.0%
Better than 18% of models compared
AIME
American Invitational Mathematics Examination
28.3%
Better than 55% of models compared
MMLU-Pro
Professional and academic subject knowledge
75.2%
Better than 50% of models compared
Last updated May 15, 2026
LiveCodeBench
Contamination-free coding benchmark
29.9%
Better than 35% of models compared
Math-500
Diverse mathematical problem solving benchmark
84.4%
Better than 51% of models compared