Mistral Medium 3.5

Mistral AI · Mistral Medium

Mistral's dense 128B Medium 3.5, a frontier-class multimodal model unifying chat, reasoning, and coding behavior.

Type
multimodal
Context
256K tokens
Max Output
33K tokens
Status
current
Input
$1.5/1M tok
Output
$7.5/1M tok
API Access
Yes
License
Modified MIT
reasoning coding agentic vision tool-use long-context open-weights
Released April 2026 · Updated May 1, 2026

Overview

Freshness note: Model capabilities, limits, and pricing can change quickly. This profile is a point-in-time snapshot last verified on May 1, 2026.

Mistral Medium 3.5 is Mistral AI’s April 28, 2026 frontier-class multimodal model. It continues the Mistral Small 4 design philosophy of consolidating chat, reasoning, vision, and coding into a single dense model rather than shipping separate Magistral, Pixtral, and Devstral variants. The model exposes a reasoning_effort parameter so callers can dial behavior between fast instruct responses and deep step-by-step reasoning per request, and Mistral ships open weights on Hugging Face under a Modified MIT license alongside hosted API access.

This entry sits above Mistral Small 4 in capability and below Mistral Large 3 in scale and price. For practitioners, the practical question is usually whether Medium 3.5 can replace a more expensive frontier model on agentic and coding workloads, and Mistral’s launch materials lean hard into that framing.

Capabilities

Mistral’s release materials and benchmark publications highlight a specific capability profile:

  • Strong agentic coding behavior, with reported 77.6% on SWE-Bench Verified at launch.
  • High accuracy on practical agent benchmarks such as 91.4% on τ³-Telecom.
  • Configurable reasoning effort per request through the reasoning_effort parameter.
  • Multimodal input including images, inheriting Pixtral-line vision capabilities into the unified model.
  • Self-hostable on roughly 4 GPUs at the published 128B dense size, keeping private deployments practical.

The unified design means prompt and routing logic that previously juggled separate Magistral, Pixtral, and Devstral endpoints can collapse onto one model surface.

Technical Details

Public anchors at this snapshot:

  • Dense 128B parameter model (not MoE) at the open-weight release size.
  • 256K-token context window.
  • Multimodal input with text and image; text output.
  • Open weights on Hugging Face under a Modified MIT license, with mistralai/Mistral-Medium-3.5-128B as the headline release.
  • Hosted API access through Mistral’s platform, Le Chat, and the Mistral Vibe agentic coding product.

Pricing & Access

Listed Mistral API pricing (per 1M tokens):

  • Input: $1.5
  • Output: $7.5

Access options:

  • Mistral platform API
  • Le Chat (Pro, Team, Enterprise plans, with Work mode powered by Medium 3.5)
  • Mistral Vibe remote-agent product
  • Open-weight self-hosting on Hugging Face under Modified MIT

Best Use Cases

Choose Mistral Medium 3.5 for:

  • Agentic coding workflows that benefit from configurable reasoning effort and a 256K context window.
  • Multimodal assistants that need vision plus reasoning in a single model surface.
  • European and EU-data-residency deployments where Mistral’s posture is a fit.
  • Self-hosted production at frontier-tier capability without paying US closed-API rates.
  • Consolidation work that replaces Magistral, Pixtral, and Devstral pipelines with one endpoint.

For maximum-frontier reasoning peaks or the broadest enterprise platform availability, Anthropic and OpenAI premium tiers remain easier defaults. For the cheapest Mistral route, Mistral Small 4 is still the better entry point.

Comparisons

  • Mistral Small 4 (Mistral AI): Cheaper unified Mistral route at smaller scale; Medium 3.5 trades cost for stronger frontier behavior.
  • Claude Sonnet 4.6 (Anthropic): Comparable mid-frontier alternative; Medium 3.5 leads on European positioning and open-weight self-hosting.
  • GPT-5.4 (OpenAI): Premium frontier closed-source option with stronger product surfaces; Medium 3.5 differentiates through openness and configurable reasoning effort.