Nvidia Nemotron Super 49B

nvidia/Llama-3.3-Nemotron-Super-49B-v1

Llama-3.3-Nemotron-Super-49B-v1 offers a strong tradeoff between model accuracy and efficiency. Efficiency (throughput) translates directly into cost savings. Using a novel Neural Architecture Search (NAS) approach, we greatly reduce the model's memory footprint, which enables larger workloads and lets the model fit on a single GPU (H200) even at high workloads. The NAS approach also makes it possible to select a desired point on the accuracy-efficiency tradeoff. For more information on the NAS approach, please refer to this paper.

Added Aug 8, 2025

Context Window: 128.0K tokens

Max Output: 16.4K tokens

Input Price (Auto): $0.15/1M tokens

Output Price (Auto): $0.15/1M tokens
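
As a rough illustration of how the listed prices turn into a per-request cost, the sketch below multiplies token counts by the per-million rates. The function name `estimate_cost` is hypothetical, the limits are the rounded figures from this listing (the exact token counts behind "128.0K" and "16.4K" are not stated), and the check that input plus output must fit in the context window is an assumption about how the limit is applied, not something the listing confirms.

```python
# Prices from the listing, in US dollars per one million tokens.
INPUT_PRICE_PER_M = 0.15
OUTPUT_PRICE_PER_M = 0.15

# Rounded limits from the listing; exact values are an assumption.
CONTEXT_WINDOW = 128_000
MAX_OUTPUT = 16_400


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if output_tokens > MAX_OUTPUT:
        raise ValueError("output exceeds the listed max output")
    # Assumption: the context window covers input and output together.
    if input_tokens + output_tokens > CONTEXT_WINDOW:
        raise ValueError("request exceeds the listed context window")
    input_cost = input_tokens * INPUT_PRICE_PER_M / 1_000_000
    output_cost = output_tokens * OUTPUT_PRICE_PER_M / 1_000_000
    return input_cost + output_cost


if __name__ == "__main__":
    # 100,000 input tokens and 4,000 output tokens at $0.15 per 1M each.
    print(f"${estimate_cost(100_000, 4_000):.4f}")
```

Because input and output are priced identically here, the cost depends only on the total token count: 104,000 tokens at $0.15 per million comes to roughly $0.0156.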

Benchmarks

Performance metrics and benchmarks

Sourced from Artificial Analysis.

Intelligence Index: 14.3

Providers

Auto routing is available for this model. Explicit provider selection is not available.
