Claude Opus 4.8 — Signal Lens

Overview

Freshness note: Model capabilities, limits, and pricing can change quickly. This profile is a point-in-time snapshot last fully verified on June 8, 2026; comparison and routing references were refreshed on July 6, 2026.

Claude Opus 4.8 is Anthropic’s current premium generally available Claude model, announced on May 28, 2026 as a direct upgrade over Opus 4.7. It keeps the same regular API price and the same broad Opus role: use it when the task is hard enough that quality, persistence, and review discipline matter more than default cost.

The practical upgrade is less about a new product category and more about agent behavior. Anthropic positions Opus 4.8 as stronger for long-horizon coding, tool use, computer/browser-style work, professional knowledge tasks, and workflows where the model must catch its own mistakes before handing work back. For prompts and use cases, prefer the stable claude-opus family overview unless the exact 4.8 API surface matters.

Capabilities

Anthropic’s launch materials and API docs frame Opus 4.8 around harder autonomous work: complex repository changes, long-running agent loops, multi-tool workflows, dense document analysis, and professional tasks where unsupported confidence is expensive. Compared with Opus 4.7, the public docs emphasize better tool triggering, better compaction recovery, more reliable behavior at each effort level, and fewer wasted thinking tokens when adaptive thinking is enabled.

The release also lands alongside new Claude product controls. Users can choose effort in claude.ai and Cowork, while Claude Code gains dynamic workflows for much larger engineering tasks. Those product features are not the same thing as the base model, but they are part of why Opus 4.8 matters in practice: the model is being routed into longer, more supervised agentic work rather than only single-turn chat.

Technical Details

Current official API details:

API model ID: claude-opus-4-8
Context window: 1,000,000 tokens by default on the Claude API, Amazon Bedrock, and Vertex AI
Microsoft Foundry context: 200,000 tokens
Max output: 128,000 tokens in the synchronous Messages API; Message Batches can reach 300K output tokens with Anthropic’s documented output-300k-2026-03-24 beta header
Thinking: adaptive thinking, with high as the default effort level
Prompt caching: 1,024-token minimum cacheable prompt length
Mid-conversation system messages: supported in the Messages API
Fast mode: research preview on the Claude API with up to 2.5x higher output-token speed at premium pricing

Like Opus 4.7, this model does not support non-default temperature, top_p, or top_k settings in the Messages API, and adaptive thinking is the supported thinking-on mode rather than explicit extended-thinking budgets.

Anthropic’s model lifecycle page lists claude-opus-4-8 as active with no tentative retirement sooner than May 28, 2027.

Pricing & Access

Base API pricing (per 1M tokens):

Input: $5
Output: $25
5-minute cache writes: $6.25
1-hour cache writes: $10
Cache hits and refreshes: $0.50
Batch API input: $2.50
Batch API output: $12.50

Fast mode for Opus 4.8 is priced at $10 per 1M input tokens and$ 50 per 1M output tokens. US-only inference through Anthropic’s documented data-residency option applies a 1.1x multiplier across token categories.

Access options include the Anthropic Claude API as claude-opus-4-8, Claude for Pro, Max, Team, and Enterprise users, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry. Exact quotas and product-surface availability can still vary by plan, admin settings, and cloud provider.

Best Use Cases

Choose Opus 4.8 for high-value work where an error is costly and the model has enough context to make a better decision: multi-file refactors, migration planning, complex bug hunts, legal or financial document synthesis under human review, long-running research workflows, and agent systems that need to update instructions or budgets mid-task without destroying prompt-cache economics.

It is not the default for cheap high-volume extraction, routine drafting, or simple classification. For those, Claude Sonnet or Haiku routes are usually more economical. Opus 4.8 earns its keep when the work requires sustained judgment, tool discipline, and independent self-checking.

Comparisons

Claude Opus 4.7 (Anthropic): Direct predecessor at the same regular API price. Opus 4.8 is the newer default for fresh premium Claude deployments.
Claude Sonnet 5 (Anthropic): Better default for many production workloads where cost and latency matter more than maximum reasoning depth.
GPT-5.5 (OpenAI): Strong OpenAI-side frontier alternative, especially for teams centered on ChatGPT, Codex, and OpenAI’s Responses API.
Gemini 3.5 Flash (Google): Faster, cheaper Google-side route for multimodal and agentic workflows where Flash-level quality is enough.