Claude Opus 4.8
Anthropic · Claude 4
Anthropic's May 2026 premium Claude model for long-horizon agentic coding, complex reasoning, and high-autonomy work.
Overview
Freshness note: Model capabilities, limits, and pricing can change quickly. This profile is a point-in-time snapshot last verified on May 29, 2026.
Claude Opus 4.8 is Anthropic’s current premium generally available Claude model, announced on May 28, 2026 as a direct upgrade over Opus 4.7. It keeps the same regular API price and the same broad Opus role: use it when the task is hard enough that quality, persistence, and review discipline matter more than default cost.
The practical upgrade is less about a new product category and more about agent behavior. Anthropic positions Opus 4.8 as stronger for long-horizon coding, tool use, computer/browser-style work, professional knowledge tasks, and workflows where the model must catch its own mistakes before handing work back. For prompts and use cases, prefer the stable claude-opus family overview unless the exact 4.8 API surface matters.
Capabilities
Anthropic’s launch materials and API docs frame Opus 4.8 around harder autonomous work: complex repository changes, long-running agent loops, multi-tool workflows, dense document analysis, and professional tasks where unsupported confidence is expensive. Compared with Opus 4.7, the public docs emphasize better tool triggering, better compaction recovery, more reliable behavior at each effort level, and fewer wasted thinking tokens when adaptive thinking is enabled.
The release also lands alongside new Claude product controls. Users can choose effort in claude.ai and Cowork, while Claude Code gains dynamic workflows for much larger engineering tasks. Those product features are not the same thing as the base model, but they are part of why Opus 4.8 matters in practice: the model is being routed into longer, more supervised agentic work rather than only single-turn chat.
Technical Details
Current official API details:
- API model ID:
claude-opus-4-8 - Context window: 1,000,000 tokens by default on the Claude API, Amazon Bedrock, and Vertex AI
- Microsoft Foundry context: 200,000 tokens
- Max output: 128,000 tokens
- Thinking: adaptive thinking, with
highas the default effort level - Prompt caching: 1,024-token minimum cacheable prompt length
- Mid-conversation system messages: supported in the Messages API
- Fast mode: research preview on the Claude API with up to 2.5x higher output-token speed at premium pricing
Like Opus 4.7, this model does not support non-default temperature, top_p, or top_k settings in the Messages API, and adaptive thinking is the supported thinking-on mode rather than explicit extended-thinking budgets.
Pricing & Access
Base API pricing (per 1M tokens):
- Input: $5
- Output: $25
- 5-minute cache writes: $6.25
- 1-hour cache writes: $10
- Cache hits and refreshes: $0.50
- Batch API input: $2.50
- Batch API output: $12.50
Fast mode for Opus 4.8 is priced at 50 per 1M output tokens. US-only inference through Anthropic’s documented data-residency option applies a 1.1x multiplier across token categories.
Access options include the Anthropic Claude API as claude-opus-4-8, Claude for Pro, Max, Team, and Enterprise users, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry. Exact quotas and product-surface availability can still vary by plan, admin settings, and cloud provider.
Best Use Cases
Choose Opus 4.8 for high-value work where an error is costly and the model has enough context to make a better decision: multi-file refactors, migration planning, complex bug hunts, legal or financial document synthesis under human review, long-running research workflows, and agent systems that need to update instructions or budgets mid-task without destroying prompt-cache economics.
It is not the default for cheap high-volume extraction, routine drafting, or simple classification. For those, Claude Sonnet or Haiku routes are usually more economical. Opus 4.8 earns its keep when the work requires sustained judgment, tool discipline, and independent self-checking.
Comparisons
- Claude Opus 4.7 (Anthropic): Direct predecessor at the same regular API price. Opus 4.8 is the newer default for fresh premium Claude deployments.
- Claude Sonnet 4.6 (Anthropic): Better default for many production workloads where cost and latency matter more than maximum reasoning depth.
- GPT-5.5 (OpenAI): Strong OpenAI-side frontier alternative, especially for teams centered on ChatGPT, Codex, and OpenAI’s Responses API.
- Gemini 3.5 Flash (Google): Faster, cheaper Google-side route for multimodal and agentic workflows where Flash-level quality is enough.