Private AI
GLM-5.1 with extended thinking enabled. Ranks #1 among open-source models and #3 globally across SWE-Bench Pro, Terminal-Bench, and NL2Repo (as of April 2026). Excels at long-horizon tasks, running autonomously for up to 8 hours while refining its strategy over thousands of iterations. Served at FP8 precision.
Added Mar 27, 2026
Model weights
Context Window
200.0K
Max Output
131.1K
Avg output tokens (7d)
1.7K tokens
Input Price (Auto)
$0.79/1M
Output Price (Auto)
$2.73/1M
Cache Read (Auto)
$0.16/1M
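As a rough illustration of how the per-million-token prices above translate into a per-request cost, here is a minimal sketch. The function name `estimate_cost` and the assumption that cached tokens are a subset of input tokens billed at the cache-read rate instead of the input rate are illustrative only; actual billing rules may differ.

```python
# Prices in USD per 1M tokens, copied from the listing above (Auto routing).
INPUT_PER_M = 0.79
OUTPUT_PER_M = 2.73
CACHE_READ_PER_M = 0.16


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    Assumption (not confirmed by the listing): `cached_tokens` is the part of
    `input_tokens` served from cache, billed at the cache-read price instead
    of the regular input price.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")

    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * INPUT_PER_M
        + cached_tokens * CACHE_READ_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total / 1_000_000


if __name__ == "__main__":
    # A 10K-token prompt with an output near the listed 7-day average (1.7K tokens).
    print(f"${estimate_cost(10_000, 1_700):.6f}")
```

Under these assumptions, a request with a 10K-token uncached prompt and a 1.7K-token response costs a little over one cent.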
Capabilities
Performance metrics and benchmarks
Sourced from Artificial Analysis.
Intelligence Index
40.2
Choose explicit providers for this model. Auto routing remains available as the default option.
Coding Index
55.8
GPQA Diamond
Graduate-level scientific reasoning
86.8%
Better than 91% of models compared
HLE
Humanity's Last Exam
28.0%
Better than 91% of models compared
IFBench
Instruction-following benchmark
76.3%
Better than 95% of models compared
T²-Bench Telecom
Conversational AI agents in dual-control scenarios
97.7%
Better than 98% of models compared
AA-LCR
Long context reasoning evaluation
62.3%
Better than 79% of models compared
SciCode
Python programming for scientific computing
43.8%
Better than 87% of models compared
Terminal-Bench Hard
Agentic coding and terminal use
43.2%
Better than 92% of models compared
Last updated Jun 25, 2026
Artificial Analysis