Qwen3.6-27B

Alibaba · Qwen3

Alibaba's open-weight 27B Qwen 3.6 model for agentic coding, vision-language work, and self-hosted deployment.

Part of Qwen3 family · Other versions: Qwen 3.6 Max Preview , Qwen3-Max
Type
multimodal
Context
262K tokens
Max Output
262K tokens
Status
current
API Access
Yes
License
Apache 2.0
chinese multilingual coding agentic vision open-weights long-context
Released April 2026 · Updated May 16, 2026

Overview

Freshness note: Model capabilities, limits, and pricing can change quickly. This profile is a point-in-time snapshot last verified on May 16, 2026.

Qwen3.6-27B is Alibaba’s dense open-weight entry in the Qwen 3.6 generation. It sits below the hosted Qwen 3.6 Max Preview but fills an important gap: a self-hostable, Apache 2.0, vision-language-capable model with strong agentic coding positioning and a size that is more practical than the largest hosted or MoE tiers.

The right way to read this page is “deployment-flexible Qwen 3.6.” If you want the highest hosted Qwen capability, use Max Preview. If you want open weights that can run through Transformers, vLLM, SGLang, Docker Model Runner, llama.cpp-derived tooling, or local app stacks, this 27B release is the more useful reference.

Capabilities

Alibaba’s model card emphasizes agentic coding, repository-level reasoning, and frontend workflow improvements. The card also lists Qwen3.6-27B as an image-text-to-text model, so it is not just a text-only coding checkpoint.

The model keeps Qwen’s hybrid reasoning direction while improving real-world developer behavior: code edits, tool use, long-context iteration, and visual understanding. It is especially relevant for Chinese-English teams that want a modern open model instead of routing every hard request to a closed western API.

Technical Details

Public anchors at this snapshot:

  • 27B language-model parameters.
  • Causal language model with a vision encoder.
  • Native 262,144-token context length, with documentation describing extension up to 1,010,000 tokens.
  • Apache 2.0 license on Hugging Face.
  • Compatible with Transformers, vLLM, SGLang, KTransformers, Docker Model Runner, and local deployment paths.

The repo records maxOutput as the native context ceiling because the public card does not expose a separate provider-published generation cap in the same way hosted APIs do. Production serving should still set explicit max-new-token, memory, and latency budgets.

Pricing & Access

There is no single canonical hosted API price for this open-weight release. Cost depends on where you run it: self-hosted GPU infrastructure, Hugging Face endpoints, third-party inference providers, or Alibaba-compatible hosted routes where available.

Access options:

  • Hugging Face under the official Qwen organization.
  • Self-hosted inference through Transformers, vLLM, SGLang, KTransformers, and Docker Model Runner.
  • Local model tooling that supports Qwen3.6-compatible checkpoints and quantizations.

Best Use Cases

Use Qwen3.6-27B for self-hosted coding assistants, bilingual Chinese-English developer workflows, local or private vision-language analysis, and teams that want a modern open Qwen model without paying hosted flagship rates.

It is less natural for teams that need managed enterprise controls, strict hosted SLAs, or the absolute highest Qwen capability. In those cases, Qwen 3.6 Max Preview or another managed frontier model is easier to operate.

Comparisons

  • Qwen 3.6 Max Preview (Alibaba): Higher hosted frontier tier; Qwen3.6-27B trades peak capability for open weights and deployment control.
  • Kimi K2.6 (Moonshot AI): Larger open-weight Chinese coding model with explicit agent-swarm framing; Qwen3.6-27B is denser and smaller for local deployment.
  • Mistral Medium 3.5 (Mistral AI): Western open-weight alternative with stronger European positioning; Qwen3.6-27B is more attractive for Chinese-language and Qwen ecosystem workflows.