Private AI
Nvidia's Nemotron 3 Ultra 550B A55B is a model from the Nemotron 3 family, built on a hybrid Mamba-Transformer mixture-of-experts (MoE) architecture. Context limits vary by provider; the longest current route supports up to 1M tokens of context. Thinking is enabled.
Added Jun 4, 2026
Model weights
Context Window
1.0M
Max Output
65.5K
Avg output tokens (7d)
2.0K tokens
Input Price (Auto)
$0.53/1M
Output Price (Auto)
$2.63/1M
Cache Read (Auto)
$0.16/1M
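The per-million-token prices above can be turned into a rough per-request estimate. The following is a minimal sketch, not the provider's billing logic: the function name `estimate_cost` is hypothetical, and it assumes that cached tokens are a subset of the input tokens and are billed at the cache-read rate instead of the input rate. Actual billing may differ by provider and route.

```python
# Auto-routing prices listed above, in USD per 1M tokens.
INPUT_PER_M = 0.53
OUTPUT_PER_M = 2.63
CACHE_READ_PER_M = 0.16


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    Assumption: cached_tokens is the portion of input_tokens served from
    cache and billed at the cache-read rate rather than the input rate.
    """
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")

    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * INPUT_PER_M
        + cached_tokens * CACHE_READ_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total / 1_000_000


if __name__ == "__main__":
    # 10,000 input tokens and 2,000 output tokens (the listed 7-day average output).
    print(f"no cache:   ${estimate_cost(10_000, 2_000):.5f}")
    print(f"8k cached:  ${estimate_cost(10_000, 2_000, cached_tokens=8_000):.5f}")
```

Under these assumptions, a request with 10,000 input tokens and 2,000 output tokens comes to about $0.0106, and about $0.0076 if 8,000 of the input tokens are read from cache.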
Capabilities
Performance metrics and benchmarks
Sourced from Artificial Analysis.
Intelligence Index
37.8
Coding Index
49.3
GPQA Diamond
Graduate-level scientific reasoning
86.7%
Better than 90% of models compared
HLE
Humanity's Last Exam
26.6%
Better than 90% of models compared
IFBench
Instruction-following benchmark
81.4%
Better than 99% of models compared
T²-Bench Telecom
Conversational AI agents in dual-control scenarios
83.3%
Better than 73% of models compared
AA-LCR
Long context reasoning evaluation
67.0%
Better than 89% of models compared
SciCode
Python programming for scientific computing
39.9%
Better than 77% of models compared
Terminal-Bench Hard
Agentic coding and terminal use
36.4%
Last updated Jun 24, 2026
Artificial Analysis
Better than 85% of models compared