Qwen3.6 35B A3B NVFP4

nvidia/Qwen3.6-35B-A3B-NVFP4
Open Source · chat · open-weights
GA
Context: 262.1K
Max output: (not listed)
Input $/1M: $0.14
Output $/1M: $1.00
Modalities: text
Released: 27 May 2026
License: apache-2.0 · nvidia/Qwen3.6-35B-A3B-NVFP4
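
As a rough illustration of what the listed per-token prices mean for a single request, the sketch below multiplies token counts by the input and output rates shown above. The helper name `estimate_cost_usd` is hypothetical, and the calculation assumes flat per-token billing with no caching discounts, tiers, or minimum charges, which a given provider may or may not apply.

```python
# Prices copied from the listing above, in USD per 1M tokens.
INPUT_USD_PER_M = 0.14
OUTPUT_USD_PER_M = 1.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request under flat per-token pricing.

    Hypothetical helper for illustration; real billing may differ.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # A long-context request: 200,000 prompt tokens and 2,000 generated tokens.
    print(f"${estimate_cost_usd(200_000, 2_000):.4f}")
```

Under these assumptions, a 200,000-token prompt with a 2,000-token reply comes to roughly $0.03, with the prompt accounting for most of the cost even though the input rate is about seven times lower than the output rate.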
AI summary
● machine-written

Alibaba releases Qwen3.6 35B A3B open-weight model with 262K context

Qwen3.6 35B A3B is an open-weight multimodal model from Alibaba Cloud with 35 billion total parameters, of which 3 billion are active per token. It uses a hybrid sparse mixture-of-experts architecture and supports a native context window of 262K tokens (extensible to 1M via YaRN), accepting text, image, and video inputs. The model includes an integrated thinking mode, function calling, and structured output, and is released under the Apache 2.0 license.

What's new
  • Hybrid sparse mixture-of-experts architecture combining Gated DeltaNet linear attention with standard gated attention
  • 262K native context window, extensible to 1M tokens via YaRN
  • Accepts text, image, and video inputs
  • Integrated thinking mode with reasoning traces preserved across multi-turn conversations
Best for
  • Multimodal tasks requiring image and video understanding
  • Long-context document processing and analysis
  • Function calling and structured output generation
  • Efficient inference requiring lower compute resources
Sources

Source: https://huggingface.co/nvidia/Qwen3.6-35B-A3B-NVFP4