GLM-5.1
Z.ai · GLM
Z.ai's GLM-5.1, a 744B-parameter MoE open-weight model with strong autonomous coding and tool-use behavior.
Overview
Freshness note: Model capabilities, limits, and pricing can change quickly. This profile is a point-in-time snapshot last verified on May 1, 2026.
GLM-5.1 is Z.ai’s post-training upgrade to GLM-5, the frontier model the company shipped on February 11, 2026. The 5.1 variant became available through Z.ai’s API on March 27, 2026, and was released as open-source weights on April 7, 2026. It keeps the GLM-5 base architecture (a 744B-parameter Mixture-of-Experts with 44B active per token) while substantially improving coding, tool use, and autonomous execution behavior. Z.ai claims the model can sustain coherent autonomous work for up to 8 hours on a single task, covering planning, execution, iteration, and delivery.
This is the entry that practitioners are most likely to reach for when picking a current GLM model. The earlier GLM-5 base release is referenced in prose rather than getting its own entry.
Capabilities
Z.ai’s release materials highlight a specific capability profile:
- Strong autonomous coding behavior, including reported leadership on SWE-Bench Pro among open-source models at launch.
- Long-horizon task execution with claimed sustained autonomy of up to 8 hours per task on coding workflows.
- Improved tool calling and agent loop reliability over GLM-5.
- Bilingual Chinese-English reasoning that remains a distinguishing strength of the GLM line.
- Permissive MIT licensing on the open-weight release, enabling commercial use, fine-tuning, and redistribution without royalties.
Technical Details
Public anchors at this snapshot:
- 744B total parameters, 44B active per token, Mixture-of-Experts with 256 experts (inherited from GLM-5).
- 203K-token context window with up to 66K max output tokens.
- Hosted API on Z.ai’s platform plus open weights on Hugging Face.
- MIT license on the open-weight release.
GLM-5.1 is a post-training upgrade rather than a new base, so it shares GLM-5’s underlying architecture and most static specs.
Pricing & Access
Listed Z.ai API pricing (per 1M tokens):
- Input: $1.40
- Output: $4.40
- Cache hit: $0.26 input
Z.ai raised its hosted pricing roughly 8% to 17% over the prior GLM-5 Turbo tier when 5.1 launched, so production cost models that referenced earlier GLM rates should be re-baselined.
Access options:
- Z.ai (Zhipu AI) API
- Open-weight downloads on Hugging Face under MIT
- Third-party inference and gateway providers
Best Use Cases
Choose GLM-5.1 for:
- Autonomous coding agents that need to run long, multi-step workflows without human-in-the-loop checkpoints.
- Open-weight production deployments where MIT licensing simplifies legal review.
- Bilingual Chinese-English assistant products where regional quality matters.
- Tool-using agents and copilots at materially lower cost than US frontier alternatives.
For teams that need US-based enterprise contracts, established cloud distribution, or governance defaults closer to Anthropic and OpenAI, GLM remains an additional option rather than a baseline replacement.
Comparisons
- GLM-5 (Z.ai): Same architecture; GLM-5.1 is the post-training upgrade with stronger coding and autonomous behavior, and is the more relevant entry for new work.
- DeepSeek V4 (DeepSeek): Closest open-weight peer for long-context production routes; GLM-5.1 leans more on autonomous coding behavior.
- Kimi K2.6 (Moonshot AI): Comparable Chinese open-weight flagship with explicit agent swarm features; GLM-5.1 leans on sustained single-task autonomy.