GLM-5.1

Z.ai · GLM

Z.ai's GLM-5.1, a 744B-parameter MoE open-weight model with strong autonomous coding and tool-use behavior.

Part of GLM family · Other versions: GLM-5
Type
language
Context
203K tokens
Max Output
66K tokens
Status
current
Input
$1.4/1M tok
Output
$4.4/1M tok
API Access
Yes
License
MIT
chinese reasoning coding agentic open-weights long-context tool-use
Released April 2026 · Updated May 1, 2026

Overview

Freshness note: Model capabilities, limits, and pricing can change quickly. This profile is a point-in-time snapshot last verified on May 1, 2026.

GLM-5.1 is Z.ai’s post-training upgrade to GLM-5, the frontier model the company shipped on February 11, 2026. The 5.1 variant became available through Z.ai’s API on March 27, 2026, and was released as open-source weights on April 7, 2026. It keeps the GLM-5 base architecture (a 744B-parameter Mixture-of-Experts with 44B active per token) while substantially improving coding, tool use, and autonomous execution behavior. Z.ai claims the model can sustain coherent autonomous work for up to 8 hours on a single task, covering planning, execution, iteration, and delivery.

This is the entry that practitioners are most likely to reach for when picking a current GLM model. The earlier GLM-5 base release is referenced in prose rather than getting its own entry.

Capabilities

Z.ai’s release materials highlight a specific capability profile:

  • Strong autonomous coding behavior, including reported leadership on SWE-Bench Pro among open-source models at launch.
  • Long-horizon task execution with claimed sustained autonomy of up to 8 hours per task on coding workflows.
  • Improved tool calling and agent loop reliability over GLM-5.
  • Bilingual Chinese-English reasoning that remains a distinguishing strength of the GLM line.
  • Permissive MIT licensing on the open-weight release, enabling commercial use, fine-tuning, and redistribution without royalties.

Technical Details

Public anchors at this snapshot:

  • 744B total parameters, 44B active per token, Mixture-of-Experts with 256 experts (inherited from GLM-5).
  • 203K-token context window with up to 66K max output tokens.
  • Hosted API on Z.ai’s platform plus open weights on Hugging Face.
  • MIT license on the open-weight release.

GLM-5.1 is a post-training upgrade rather than a new base, so it shares GLM-5’s underlying architecture and most static specs.

Pricing & Access

Listed Z.ai API pricing (per 1M tokens):

  • Input: $1.40
  • Output: $4.40
  • Cache hit: $0.26 input

Z.ai raised its hosted pricing roughly 8% to 17% over the prior GLM-5 Turbo tier when 5.1 launched, so production cost models that referenced earlier GLM rates should be re-baselined.

Access options:

  • Z.ai (Zhipu AI) API
  • Open-weight downloads on Hugging Face under MIT
  • Third-party inference and gateway providers

Best Use Cases

Choose GLM-5.1 for:

  • Autonomous coding agents that need to run long, multi-step workflows without human-in-the-loop checkpoints.
  • Open-weight production deployments where MIT licensing simplifies legal review.
  • Bilingual Chinese-English assistant products where regional quality matters.
  • Tool-using agents and copilots at materially lower cost than US frontier alternatives.

For teams that need US-based enterprise contracts, established cloud distribution, or governance defaults closer to Anthropic and OpenAI, GLM remains an additional option rather than a baseline replacement.

Comparisons

  • GLM-5 (Z.ai): Same architecture; GLM-5.1 is the post-training upgrade with stronger coding and autonomous behavior, and is the more relevant entry for new work.
  • DeepSeek V4 (DeepSeek): Closest open-weight peer for long-context production routes; GLM-5.1 leans more on autonomous coding behavior.
  • Kimi K2.6 (Moonshot AI): Comparable Chinese open-weight flagship with explicit agent swarm features; GLM-5.1 leans on sustained single-task autonomy.