Nvidia Nemotron 3 Ultra 550B Thinking

nvidia/nemotron-3-ultra-550b-a55b:thinking

Nemotron 3 Ultra 550B A55B is a model in Nvidia's Nemotron 3 family, built on a hybrid Mamba-Transformer mixture-of-experts (MoE) architecture. Context limits vary by provider; the longest current route supports up to 1M tokens of context. Thinking is enabled.

Added Jun 4, 2026

Model weights

Context Window: 1.0M tokens

Max Output: 65.5K tokens

Avg output tokens (7d): 2.0K tokens (75%)

Input Price (Auto): $0.53 per 1M tokens

Output Price (Auto): $2.63 per 1M tokens

Cache Read (Auto): $0.26 per 1M tokens
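
The listed prices can be turned into a rough per-request cost estimate. The sketch below is illustrative only: the function name estimate_cost and the billing rule (cached input tokens charged at the cache-read rate instead of the input rate) are assumptions, not something the listing states, and actual billing may differ by provider.

```python
# Prices in USD per 1M tokens, copied from the listing's "Auto" route.
INPUT_PRICE_PER_M = 0.53
OUTPUT_PRICE_PER_M = 2.63
CACHE_READ_PRICE_PER_M = 0.26


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumption: cached_tokens is the portion of input_tokens served from
    cache, billed at the cache-read rate instead of the input rate.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")

    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * INPUT_PRICE_PER_M
        + cached_tokens * CACHE_READ_PRICE_PER_M
        + output_tokens * OUTPUT_PRICE_PER_M
    )
    return total / 1_000_000


# Example: a 10,000-token prompt with a 2,000-token reply
# (2,000 matches the listed 7-day average output length).
cost = estimate_cost(10_000, 2_000)
print(f"Estimated cost: ${cost:.4f}")
```

Under these assumptions, the output side accounts for about half the cost of this example request even though it has one fifth as many tokens, since output tokens are priced roughly five times higher than input tokens.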

Capabilities

Benchmarks

Performance metrics and benchmarks

Sourced from Artificial Analysis.

Intelligence Index: 37.8

Providers

Choose explicit providers for this model. Auto routing remains available as the default option.

Reasoning

Coding