Qwen3.6 35B A3B NVFP4

Name: Qwen3.6 35B A3B NVFP4
Brand: Open Source
Author: Open Source

nvidia/Qwen3.6-35B-A3B-NVFP4

Open Source · chat · open-weights

GA

Context

262.1K

Max output

—

Input $/1M

$0.14

Output $/1M

$1.00

Modalities

text

Released

27 May 2026

License: apache-2.0 · nvidia/Qwen3.6-35B-A3B-NVFP4
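
As a rough illustration of how the listed prices translate into a per-request cost, the sketch below applies the $0.14 input and $1.00 output rates per 1M tokens. The function name and the example token counts are hypothetical, and actual billing (rounding, caching, minimum charges) may differ.

```python
# Listed prices from this page, in USD per 1M tokens.
INPUT_USD_PER_M = 0.14
OUTPUT_USD_PER_M = 1.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate one request's cost in USD at the listed per-token rates."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Hypothetical request: 200,000 prompt tokens and 4,000 generated tokens.
cost = estimate_cost_usd(200_000, 4_000)
print(f"${cost:.4f}")
```

Output tokens cost roughly seven times as much as input tokens here, so long-prompt, short-answer workloads are the cheaper shape.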

AI summary

● machine-written

Alibaba releases Qwen3.6 35B A3B open-weight model with 262K context

Qwen3.6 35B A3B is an open-weight multimodal model from Alibaba Cloud with 35 billion total parameters, of which 3 billion are active per token. It uses a hybrid sparse mixture-of-experts architecture and supports a native context window of 262K tokens (extensible to 1M via YaRN), accepting text, image, and video inputs. The model includes an integrated thinking mode, function calling, and structured output, and is released under the Apache 2.0 license.
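
The parameter counts above allow a back-of-the-envelope estimate of the sparsity and weight footprint. This is a minimal sketch that assumes every weight is stored as a 4-bit value, as the NVFP4 name suggests; it ignores quantization scale factors, any layers kept at higher precision, and runtime memory such as the KV cache, so the real footprint will be larger.

```python
TOTAL_PARAMS = 35e9    # total parameters, from the summary above
ACTIVE_PARAMS = 3e9    # parameters active per token, from the summary above
BITS_PER_WEIGHT = 4.0  # assumption: 4-bit payload only, no scale factors

# Fraction of the model that participates in each token's forward pass.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

# Weight storage in decimal gigabytes (1 GB = 1e9 bytes).
weight_gb = TOTAL_PARAMS * BITS_PER_WEIGHT / 8 / 1e9

print(f"active fraction: {active_fraction:.1%}")
print(f"4-bit weight payload: {weight_gb:.1f} GB")
```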

What's new

Hybrid sparse mixture-of-experts architecture combining Gated DeltaNet linear attention with standard gated attention
262K native context window, extensible to 1M tokens via YaRN
Accepts text, image, and video inputs
Integrated thinking mode with reasoning traces preserved across multi-turn conversations
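
The context-extension item above can be made concrete with a little arithmetic. The sketch below assumes the "262.1K" figure on this page means 262,144 (2**18) tokens and computes the YaRN scaling factor needed to reach about 1M tokens. The `rope_scaling` key names follow the convention used in Hugging Face configs for earlier Qwen releases; whether this checkpoint expects exactly these keys is an assumption, so check the model card before using them.

```python
import math

NATIVE_CONTEXT = 262_144    # assumption: "262.1K" means 2**18 tokens
TARGET_CONTEXT = 1_000_000  # "1M" taken as a round target


def yarn_factor(target: int, native: int = NATIVE_CONTEXT) -> float:
    """Ratio by which the native window must be stretched to reach target."""
    return max(1.0, target / native)


factor = yarn_factor(TARGET_CONTEXT)
rounded = math.ceil(factor)  # round up to a whole factor

# Hypothetical config fragment; key names are an assumption, see lead-in.
rope_scaling = {
    "rope_type": "yarn",
    "factor": float(rounded),
    "original_max_position_embeddings": NATIVE_CONTEXT,
}

print(f"raw factor: {factor:.2f}, rounded: {rounded}")
print(f"extended window: {rounded * NATIVE_CONTEXT:,} tokens")
```

Under these assumptions a factor of 4 gives a window of 1,048,576 tokens, which is where the "1M" figure would come from.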

Best for

Multimodal tasks requiring image and video understanding
Long-context document processing and analysis
Function calling and structured output generation
Efficient inference requiring lower compute resources

Sources

Source: https://huggingface.co/nvidia/Qwen3.6-35B-A3B-NVFP4