NVIDIA Nemotron 3 Ultra 550B A55B BF16

Name: NVIDIA Nemotron 3 Ultra 550B A55B BF16
Brand: Open Source
Author: Open Source

nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16

Open Source · chat · open-weights

GA

Context: 1M tokens
Max output: not listed
Input $/1M: $0.50
Output $/1M: $2.20
Modalities: text
Released: 03 Jun 2026

License: other · nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16
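
As a rough illustration of the listed prices, the sketch below estimates the cost of a single request from its token counts. The two rates come from the listing above; the function name and the example token counts are hypothetical, and the estimate ignores any provider-specific discounts, caching, or minimum charges.

```python
# Prices taken from the listing above (USD per 1M tokens).
INPUT_USD_PER_M = 0.50
OUTPUT_USD_PER_M = 2.20


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request (hypothetical helper)."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Hypothetical agent step: 200k tokens of context in, 4k tokens out.
cost = estimate_cost_usd(200_000, 4_000)
print(f"${cost:.4f}")
```

Because output tokens cost more than four times as much as input tokens here, long-context requests with short answers are dominated by the input side, while generation-heavy workloads are dominated by the output side.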

AI summary

● machine-written

NVIDIA Nemotron 3 Ultra 550B: Open-weights MoE reasoning model

NVIDIA Nemotron 3 Ultra is an open-weights frontier reasoning model with 550B total parameters and 55B active parameters, built on a hybrid Transformer-Mamba mixture-of-experts architecture. It supports a 1M-token context window and is designed for long-running agentic workflows, including agent orchestration, coding agents, deep research, and complex enterprise tasks. It is particularly strong at multi-step reasoning and planning, and offers high-throughput inference for agent pipelines.

What's new

550B total parameters with 55B active (MoE architecture)
1M token context window
Hybrid Transformer-Mamba architecture
Suited for long-running agentic workflows and multi-step reasoning
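
The parameter counts above imply some simple back-of-the-envelope figures, sketched below. This assumes the BF16 in the model name means weights are stored at 2 bytes per parameter. It counts weights only and ignores KV cache, Mamba state, activations, and runtime overhead. The 80 GB accelerator size is a hypothetical value chosen for illustration, not a stated hardware requirement.

```python
import math

TOTAL_PARAMS = 550e9       # total parameters, from the listing
ACTIVE_PARAMS = 55e9       # active parameters, from the listing
BYTES_PER_PARAM_BF16 = 2   # bfloat16 uses 16 bits per value

# Weight storage in decimal gigabytes (weights only, no runtime overhead).
weights_gb = TOTAL_PARAMS * BYTES_PER_PARAM_BF16 / 1e9

# Fraction of parameters that are active, per the listed counts.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

# Assumption: a hypothetical accelerator with 80 GB of memory.
HYPOTHETICAL_GPU_GB = 80
min_gpus_for_weights = math.ceil(weights_gb / HYPOTHETICAL_GPU_GB)

print(f"weights: {weights_gb:.0f} GB")
print(f"active fraction: {active_fraction:.0%}")
print(f"minimum devices for weights alone: {min_gpus_for_weights}")
```

The gap between the two figures is the point of the MoE design: all 550B parameters must be held in memory, but only about a tenth of them are active at a time.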

Best for

Agent orchestration and agentic workflows
Coding agents and deep research
Multi-step reasoning and planning
Complex enterprise tasks

Sources

Source: https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16