Private AI
DeepSeek V4 Flash Thinking enables DeepSeek's reasoning mode on the efficiency-optimized Mixture-of-Experts model with a 1M-token context window, built for fast inference, high-throughput workloads, reasoning, coding, and agent workflows.
Added Apr 24, 2026
Context Window
1.0M tokens
Max Output
384.0K tokens
Input Price (Auto)
$0.10/1M tokens
Output Price (Auto)
$0.21/1M tokens
Cache Read (Auto)
$0.021/1M tokens
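As a rough illustration of how the listed Auto prices turn into a per-request cost, here is a minimal Python sketch. The helper name `estimate_cost` and the billing rule are assumptions, not something this page documents: the sketch assumes cached input tokens are billed at the cache-read rate instead of the input rate, and that all prices are per 1M tokens. Actual billing may differ by provider.

```python
# Listed "Auto" prices in USD per 1M tokens, copied from the figures above.
INPUT_PER_M = 0.10
OUTPUT_PER_M = 0.21
CACHE_READ_PER_M = 0.021


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Hypothetical helper: estimate the USD cost of a single request.

    Assumption: `cached_tokens` is the portion of `input_tokens` served
    from cache and billed at the cache-read rate instead of the input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")

    uncached_tokens = input_tokens - cached_tokens
    total_per_million = (
        uncached_tokens * INPUT_PER_M
        + cached_tokens * CACHE_READ_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total_per_million / 1_000_000


if __name__ == "__main__":
    # Example: 200K input tokens (150K of them cached) and 20K output tokens.
    cost = estimate_cost(input_tokens=200_000, output_tokens=20_000, cached_tokens=150_000)
    print(f"Estimated cost: ${cost:.5f}")
```

Under these assumptions, reasoning-mode requests are likely dominated by output tokens, since thinking tokens are typically billed as output; that detail is also an assumption here and worth confirming with the provider.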
Performance metrics and benchmarks
Sourced from Artificial Analysis.
Intelligence Index
46.5
Coding Index
38.7
GPQA Diamond
Graduate-level scientific reasoning
89.4%
Better than 95% of models compared
HLE
Humanity's Last Exam
32.1%
Better than 94% of models compared
IFBench
Instruction-following benchmark
79.2%
Better than 98% of models compared
T²-Bench Telecom
Conversational AI agents in dual-control scenarios
95.0%
Better than 93% of models compared
AA-LCR
Long context reasoning evaluation
63.0%
Better than 79% of models compared
SciCode
Python programming for scientific computing
44.9%
Better than 89% of models compared
Terminal-Bench Hard
Agentic coding and terminal use
35.6%
Better than 84% of models compared
Last updated May 31, 2026