DeepSeek V4
FamilyDeepSeek · DeepSeek V4
DeepSeek's V4 family spans Pro and Flash routes for million-token reasoning, coding, and low-cost agents.
Overview
Freshness note: Model capabilities, limits, and pricing can change quickly. This profile is a point-in-time snapshot last verified on May 16, 2026.
DeepSeek V4 is the V-series successor that DeepSeek began rolling out on April 24, 2026. It is the line that replaces older deepseek-chat and deepseek-reasoner aliases, both of which DeepSeek has scheduled for discontinuation on July 24, 2026. V4 is sold as DeepSeek’s main long-context production family rather than a research preview, and the API exposes two distinct variants under it: DeepSeek V4 Pro for harder reasoning workloads and DeepSeek V4 Flash for cheaper, throughput-oriented routes.
This entry covers the V4 generation as a whole. Reach for it when the product question is “should we adopt the new DeepSeek V4 line?” rather than detailed differences between Pro and Flash.
Capabilities
DeepSeek’s public materials emphasize a few characteristics that distinguish V4 from the earlier R1 line:
- Native 1M-token context window across both Pro and Flash variants, designed for long-context retrieval and analysis without separate “long-context” SKUs.
- Thinking mode is on by default, with a non-thinking mode available for latency-sensitive routes.
- Strong reasoning, math, and coding behavior, with V4-Pro positioned as competitive against current proprietary frontier reference points on standard benchmarks.
- Continued open-weight availability, retaining DeepSeek’s pattern of pairing a hosted API with releases on Hugging Face for self-hosted deployment.
Operationally, V4 sits alongside the DeepSeek-R1 family rather than on top of it. Teams already running R1 for reasoning-specific workloads can keep that route while migrating chat and assistant traffic to V4.
Technical Details
Official anchors at this snapshot:
- 1M token context window on both
deepseek-v4-flashanddeepseek-v4-pro. - 384K max output tokens.
- Two API model IDs:
deepseek-v4-flashanddeepseek-v4-pro. - Thinking mode by default, with a non-thinking mode toggle.
- OpenAI Chat Completions and Anthropic-compatible API formats.
Open-weight releases of the V4 family are published on DeepSeek’s Hugging Face organization under DeepSeek’s standard license, which keeps self-hosted, private, and air-gapped deployments viable for teams that need them.
Pricing & Access
Listed API pricing (per 1M tokens):
deepseek-v4-flashinput cache miss: $0.14deepseek-v4-flashinput cache hit: $0.0028deepseek-v4-flashoutput: $0.28deepseek-v4-proinput cache miss: $1.74deepseek-v4-proinput cache hit: $0.0145deepseek-v4-prooutput: $3.48
DeepSeek currently runs a 75% launch discount on deepseek-v4-pro through May 31, 2026 at 15:59 UTC. During the promo window, effective cache-miss input drops to roughly $0.435 per million tokens, and promo output drops to roughly $0.87 per million tokens. Production planning should still use the regular rates above for cost models that extend past the discount window.
Access options:
- DeepSeek API (
deepseek-v4-flash,deepseek-v4-pro) - Open-weight downloads on Hugging Face
- Third-party inference hosts and gateways that support DeepSeek models
Best Use Cases
Choose DeepSeek V4 for:
- Long-context retrieval, summarization, and analysis where a 1M-token window simplifies the retrieval architecture.
- Cost-sensitive production assistants that still need solid reasoning behavior.
- Open-weight deployments where regulatory, privacy, or sovereignty requirements rule out US-hosted closed APIs.
- Teams already using earlier DeepSeek lines that need a planned migration target before
deepseek-chatanddeepseek-reasonerare deprecated in July 2026.
V4 is less of a fit when frontier-only intelligence is the primary requirement, when official enterprise support contracts and data residency commitments are mandatory, or when the workflow is built around tool-use ecosystems specific to OpenAI, Anthropic, or Google.
Comparisons
- DeepSeek-R1 (DeepSeek): Reasoning-specialized line that remains available alongside V4 for analytical workloads.
- Claude Opus 4.7 (Anthropic): Premium frontier alternative with stronger enterprise distribution and governance, at materially higher cost.
- GPT-5.5 (OpenAI): Closed-source generalist flagship with broader product surfaces, while V4 leads on price and self-hostable open weights.
Model Versions
DeepSeek V4 Flash
currentDeepSeek's low-cost V4 API route for 1M-context production assistants, agents, and compatibility migrations.
DeepSeek V4 Pro
currentDeepSeek's stronger V4 route for million-token reasoning, agentic coding, and higher-end open-weight workloads.