Kimi K2.5

Moonshot AI · Kimi

Moonshot's Kimi K2.5 is an open-weight long-context model focused on agentic reasoning and tool use.

Part of Kimi family
Type
language
Context
256K tokens
Max Output
96K tokens
Status
current
API Access
Yes
License
Modified MIT
chinese reasoning agentic tool-use open-weights long-context
Released February 2026 · Updated March 1, 2026

Overview

Freshness note: Model capabilities, limits, and pricing can change quickly. This profile is a point-in-time snapshot last verified on March 1, 2026.

Kimi K2.5 is Moonshot AI’s latest open-weight Kimi release and an important step beyond the earlier K2 line for agent-style tasks. The model is aimed at long-context reasoning, coding, and tool-driven orchestration where reliability across multi-step tasks matters more than single-turn fluency.

Moonshot positions K2.5 as a production-capable open alternative for teams that want to tune inference and deployment choices directly instead of being locked into a single hosted endpoint.

Capabilities

Kimi K2.5 is strongest in long-context reasoning, coding support, and function-calling workflows. Official benchmark disclosures also highlight gains on software engineering and math-heavy evaluations compared with prior Kimi models.
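As a rough sketch of what a function-calling request to a K2.5 endpoint could look like, assuming an OpenAI-compatible chat-completions payload: the model identifier `kimi-k2.5` and the `search_docs` tool below are illustrative placeholders, not confirmed Moonshot names.

```python
import json

# Illustrative only: build an OpenAI-style chat-completions payload with one
# tool attached. The model name and tool schema are assumptions -- check the
# Moonshot platform docs for the real identifiers.
def build_tool_call_request(user_message: str) -> dict:
    return {
        "model": "kimi-k2.5",  # assumed identifier
        "messages": [{"role": "user", "content": user_message}],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "search_docs",  # hypothetical tool
                    "description": "Search internal documentation.",
                    "parameters": {
                        "type": "object",
                        "properties": {"query": {"type": "string"}},
                        "required": ["query"],
                    },
                },
            }
        ],
        "tool_choice": "auto",
    }

payload = build_tool_call_request("Find the deployment runbook.")
print(json.dumps(payload, indent=2))
```

In an agent loop, the response's `tool_calls` would be executed locally and the results appended as `tool` messages before the next turn.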

Its multilingual behavior is tuned for Chinese and English-heavy workflows, which makes it a practical option for teams serving both domestic Chinese and international users in one stack.

Technical Details

Moonshot’s official model card states a 256K context window. For long-output evaluation settings, Moonshot uses a 96K completion budget in its benchmark methodology, which is the basis for the Max Output value shown in this snapshot.
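One practical implication of the 256K context / 96K completion split: a very long prompt can leave less room than the full completion budget. A minimal sketch of that bookkeeping, assuming token counts come from your tokenizer of choice:

```python
CONTEXT_WINDOW = 256 * 1024   # 256K-token context window, per the model card
MAX_COMPLETION = 96 * 1024    # 96K-token completion budget from benchmarks

def completion_budget(prompt_tokens: int) -> int:
    """Largest completion the window can still hold, capped at 96K tokens."""
    remaining = CONTEXT_WINDOW - prompt_tokens
    return max(0, min(remaining, MAX_COMPLETION))

# A 200K-token prompt leaves only 56K tokens of completion headroom.
print(completion_budget(200 * 1024))  # -> 57344
```

Clamping to zero covers the degenerate case where the prompt alone fills the window.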

The model is released as open weights under a Modified MIT license, and Moonshot provides integration guidance for common serving frameworks and acceleration paths.
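For self-hosting, a launch command along these lines is typical for open-weight checkpoints served with vLLM; the Hugging Face repo id and flag values below are assumptions, so verify them against Moonshot's integration guidance.

```shell
# Assumed repo id and settings -- confirm against Moonshot's release notes.
vllm serve moonshotai/Kimi-K2.5 \
  --tensor-parallel-size 8 \
  --max-model-len 262144
```

This exposes an OpenAI-compatible endpoint locally, so client code written against a hosted API can usually be pointed at the self-hosted server with only a base-URL change.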

Pricing & Access

Kimi K2.5 is available through official Moonshot model release channels and Kimi/Moonshot platform integrations. Moonshot has published product-level pricing updates, but token-level K2.5 pricing can vary by endpoint and region, so confirm current rates directly in Moonshot platform docs before committing budgets.

Best Use Cases

Use Kimi K2.5 when you need an open model for long-context agent pipelines, code-heavy assistant workflows, or retrieval-augmented systems that require large evidence windows.

It is also a good candidate when you want strong Chinese-English quality without giving up open deployment control.

Comparisons

  • GLM-5 (Zhipu AI): GLM-5 has a higher published output ceiling in hosted form; Kimi K2.5 is attractive when open-weight control and Moonshot’s agentic tuning are priorities.
  • Qwen3.5 (Alibaba): Qwen3.5 offers a broad open+hosted ecosystem; Kimi K2.5 currently emphasizes long-context agent behavior and coding-heavy benchmarks.
  • DeepSeek-R1 (DeepSeek): DeepSeek is often favored for cost-efficient hosted reasoning; Kimi K2.5 is better suited when self-hosted open-weight deployment is central.