Models
Technical model profiles and strategy explainers: capabilities, deployment tradeoffs, and practical fit guidance.
AI model pages are point-in-time snapshots based on each page's last verified date. Current and preview entries are refreshed on the active maintenance cadence, while legacy and deprecated entries remain browseable as historical context.
Model Strategy Explainers
Constraint-led guidance for open-weight and proprietary choices across local, private, and managed deployment paths.
Open-Weight vs Proprietary Models
open vs proprietary
How to choose between open-weight and proprietary models in 2026 using workload tiers, routing policy, and operating-capacity reality instead of ideology.
local device · private data center · rented data center
Running Open-Weight Models on Personal Devices
local device
A 2026 decision framework for when laptop or workstation inference is genuinely useful, when it is not, and how to pair local models with cloud escalation.
local device · offline · hybrid routing
Managed Open-Weight Models vs Self-Hosting
managed open hosting
A practical framework for deciding when open-weight models should be consumed through managed hosting and when full self-hosting is worth the operational burden.
managed open hosting · rented data center · private data center
Hosting Open-Weight Models: Private vs Rented Data Centers
data center hosting
How to decide between owned infrastructure, rented GPU capacity, or a hybrid model when open-weight workloads move past the workstation phase.
private data center · rented data center · hybrid routing
Using Proprietary Models in EU and Nordics
proprietary usage
How to use proprietary models in EU and Nordic environments without pretending that access, residency, and governance are the same thing.
managed api · eu region hosting · hybrid routing
Hybrid Model Routing Across Local, Private, and Managed
hybrid routing
How to design a policy-driven multi-model system that routes between local, private, and managed models without turning routing into hidden chaos.
local device · eu region hosting · private data center
Image and Video Model Selection: API Productization vs Creator Tools
media model selection
A 2026 framework for deciding when media generation belongs in an API product lane, a creator-tool lane, or a deliberate two-lane hybrid.
managed api · creator tooling · hybrid routing
Model Families
Stable overviews of major model product lines. Use these as durable reference points.
Claude Haiku
Anthropic
Anthropic's fastest Claude line for latency-sensitive, high-volume, and cost-constrained workloads.
Claude Opus
Anthropic
Anthropic's premium Claude line for frontier coding, complex reasoning, and long-horizon agent workflows.
Claude Sonnet
Anthropic
Anthropic's balanced Claude line for most production coding, reasoning, and assistant workloads.
DeepSeek V4
DeepSeek
DeepSeek's V4 family spans Pro and Flash routes for million-token reasoning, coding, and low-cost agents.
DeepSeek-R1
DeepSeek
DeepSeek's open-weight reasoning family remains relevant, while API planning has shifted toward DeepSeek V4.
Gemini Flash
Google
Google's fast Gemini line, now led by Gemini 3.5 Flash for stable high-speed multimodal and agentic workloads.
Gemini Pro
Google
Google's high-capability Gemini line for long-context multimodal reasoning, coding, and advanced enterprise workflows.
Gemma 4
Google
Google's Apache 2.0 open model family for local agentic, multimodal, coding, and edge workflows.
GLM
Zhipu AI
Z.ai's GLM family spans open and hosted models for Chinese-first reasoning, coding, and autonomous agent workflows.
GPT
OpenAI
OpenAI's GPT family, spanning frontier GPT-5.5, current API flagships, Codex-specialized routes, and newer domain-specific research lanes.
GPT Image
OpenAI
OpenAI's image-generation family for ChatGPT Images 2.0, API-first visual creation, and iterative editing workflows.
Grok
xAI
xAI's Grok line now centers new API work on Grok 4.3, with older Grok 4, 4.1 Fast, and Code Fast slugs redirected after retirement.
Grok Imagine
xAI
xAI's Grok Imagine family for standard image, quality image, and video generation in Grok-centric production workflows.
Imagen
Google
Google's Imagen family for production-oriented image generation in Gemini API and creative workflows.
Kimi
Moonshot AI
Moonshot's Kimi family focuses on long-context reasoning and agentic behavior with open-weight options.
Llama
Meta
Meta's open-weight Llama family for self-hosting, fine-tuning, and privacy-conscious multimodal AI deployments.
MiniMax M
MiniMax
MiniMax's M family is a proprietary API line for coding, agentic workflows, and high-value productivity tasks.
Mistral Large
Mistral AI
Mistral's flagship open-weight family for long-context multimodal assistants and tool-driven enterprise workloads.
Qwen3
Alibaba
Alibaba's Qwen3 family combines hybrid-thinking open-weight models with high-end hosted tiers, now extended into the Qwen 3.6 generation.
Sora
OpenAI
OpenAI's Sora family for high-fidelity video generation across creative tooling and API model surfaces.
Veo
Google
Google's Veo video generation family for cinematic text/image-to-video workflows and API-backed production pipelines.
Language Models
DeepSeek V4 Flash
DeepSeek · DeepSeek V4
DeepSeek's low-cost V4 API route for 1M-context production assistants, agents, and compatibility migrations.
DeepSeek V4 Pro
DeepSeek · DeepSeek V4
DeepSeek's stronger V4 route for million-token reasoning, agentic coding, and higher-end open-weight workloads.
Devstral 2
Mistral AI · Devstral
Mistral's open-weight coding model built for agentic software work and terminal-first execution.
GLM-5
Zhipu AI · GLM
Zhipu's GLM long-context model with strong coding ability and open-weight plus API access.
GLM-5.1
Z.ai · GLM
Z.ai's GLM-5.1, a 744B-parameter MoE open-weight model with strong autonomous coding and tool-use behavior.
GPT-5
OpenAI · GPT-5
Original GPT-5 release entry, now superseded by newer GPT-5.3 and GPT-5.4 generation variants.
GPT-5 mini
OpenAI · GPT-5
Cost-efficient GPT-5 variant for high-volume production workflows needing strong reasoning at lower cost.
GPT-5 nano
OpenAI · GPT-5
Ultra-low-cost GPT-5 tier for high-throughput automation and lightweight reasoning tasks.
GPT-5-Codex
OpenAI · GPT-5
Earlier GPT-5 Codex release entry kept as a historical baseline in OpenAI's Codex model lineage.
GPT-5.2
OpenAI · GPT-5
Current GPT-5 family flagship in OpenAI's API guide for coding, agentic, and general professional work.
GPT-5.2-Codex
OpenAI · GPT-5
Current GPT-5.2 coding model for long-horizon software engineering and agentic repository work.
GPT-5.2-Pro
OpenAI · GPT-5
Current premium GPT-5.2 tier for higher-precision API work when standard GPT-5.2 is not enough.
GPT-5.3
OpenAI · GPT-5
Current GPT-5.3 Instant / Chat route for everyday ChatGPT work and API chat-style testing.
GPT-5.3-Codex
OpenAI · GPT-5
Specialized GPT-5.3 Codex model for long-horizon agentic software engineering.
GPT-5.4
OpenAI · GPT-5
API-ready GPT-5 premium model for difficult professional work, tool use, and computer-assisted tasks.
GPT-5.4 mini
OpenAI · GPT-5
OpenAI's strongest mini model for coding, computer use, and fast high-volume agent workloads.
GPT-5.4 nano
OpenAI · GPT-5
OpenAI's cheapest GPT-5.4 route for fast classification, extraction, and lightweight coding subagents.
GPT-5.4-Pro
OpenAI · GPT-5
API-ready premium GPT-5 escalation tier for decision-ready analysis and demanding professional workflows.
GPT-5.5
OpenAI · GPT-5
OpenAI's newest GPT-5.5 route for agentic coding, professional work, research, and computer-use tasks.
GPT-5.5 Pro
OpenAI · GPT-5
Premium GPT-5.5 route for the hardest ChatGPT reasoning, research, and professional workflows.
GPT-Rosalind
OpenAI · GPT
OpenAI's life-sciences research preview model for biology, drug discovery, and tool-heavy scientific workflows.
Grok 4.20
xAI · Grok
xAI's 2M-context preview lane for enterprise research and multi-agent experiments, while Grok 4.3 remains the default API caller.
Kimi K2.5
Moonshot AI · Kimi
Moonshot's Kimi K2.5 is an open-weight long-context model focused on agentic reasoning and tool use.
MiniMax M2.5
MiniMax · MiniMax M
MiniMax's M2.5, a fast and inexpensive proprietary model for agentic coding, tool use, and high-volume production.
MiniMax M2.7
MiniMax · MiniMax M
MiniMax's M2.7 agentic productivity model for coding, office workflows, tool use, and low-cost long-context execution.
MiniMax-M2.5
MiniMax · MiniMax M
MiniMax's still-active M-series model for coding, tool use, and office-style agent workflows.
Mistral Small 3.2
Mistral AI · Mistral Small
Mistral's open 24B model balancing strong instruction quality with low API cost for production assistants.
Mistral Small 4
Mistral AI · Mistral Small
Mistral's new open-weight small model for efficient long-context assistants and coding support.
o3
OpenAI · o-series
Still-available OpenAI reasoning model for difficult analysis, retained as a legacy reference after GPT-5 became the default recommendation.
o3-deep-research
OpenAI · o-series
OpenAI's highest-capability deep research model for long, source-heavy investigations over web and private data.
o4-mini
OpenAI · o-series
Cost-efficient OpenAI reasoning model retained as a legacy API reference after GPT-5 mini became the newer default direction.
o4-mini-deep-research
OpenAI · o-series
Lower-cost OpenAI deep research model for source-heavy investigations when throughput and budget matter.
OpenAI Privacy Filter
OpenAI · OpenAI Privacy Filter
OpenAI's Apache 2.0 open-weight model for local PII detection and redaction workflows.
Qwen 3.6 Max Preview
Alibaba · Qwen3
Alibaba's flagship Qwen 3.6 Max Preview, a sparse MoE model for agentic coding, tool use, and long-context reasoning.
Qwen3-Max
Alibaba · Qwen3
Alibaba's top Qwen API model for high-end multilingual reasoning, coding, and enterprise assistant workloads.
Qwen3.5
Alibaba · Qwen
Alibaba's Qwen3.5 generation extends the Qwen line with stronger open-weight reasoning and coding performance.
Multimodal Models
Claude Haiku 4.5
Anthropic · Claude 4
Fast and efficient Claude tier for latency-sensitive assistant and automation workloads.
Claude Mythos Preview
Anthropic · Claude Mythos
Anthropic's gated research-preview frontier model for defensive cybersecurity, autonomous coding, and long-running agents.
Claude Opus 4.6
Anthropic · Claude 4
Superseded Claude 4.6 snapshot for high-difficulty reasoning, coding, and long-running agent workflows.
Claude Opus 4.7
Anthropic · Claude 4
Anthropic's April 2026 premium Claude model, now superseded by Opus 4.8 but still useful for pinned deployments.
Claude Opus 4.8
Anthropic · Claude 4
Anthropic's May 2026 premium Claude model for long-horizon agentic coding, complex reasoning, and high-autonomy work.
Claude Sonnet 4.5
Anthropic · Claude 4
Balanced Claude tier for production reasoning, coding, and long-context assistant workflows.
Claude Sonnet 4.6
Anthropic · Claude 4
Anthropic's balanced Claude model, with strong reasoning and coding at moderate pricing; the default recommendation for most tasks.
Computer Use Preview
Google · Gemini Computer Use
Google's preview computer-use model surface for browser and interface control workflows.
Gemini 2.5 Flash
Google · Gemini 2.5
Stable Gemini 2.5 Flash route balancing multimodal capability, latency, and production cost.
Gemini 2.5 Flash Live Preview
Google · Gemini 2.5
Google's stable 2.5-era native-audio Live API model for realtime multimodal voice agents.
Gemini 2.5 Flash-Lite
Google · Gemini 2.5
Stable budget Gemini 2.5 tier for large-scale assistant and automation workloads.
Gemini 2.5 Pro
Google · Gemini 2.5
Stable high-capability Gemini 2.5 tier for long-context multimodal reasoning and enterprise workflows.
Gemini 3 Flash
Google · Gemini 3
Google's older Gemini 3 preview Flash route, now superseded by stable Gemini 3.5 Flash for most new fast-model work.
Gemini 3.1 Flash Live Preview
Google · Gemini 3.1
Google's low-latency Gemini 3.1 live model for realtime audio-to-audio and multimodal dialogue.
Gemini 3.1 Flash-Lite
Google · Gemini 3.1
Google's newer low-cost Gemini preview tier for high-throughput multimodal assistant and automation workloads.
Gemini 3.1 Pro Preview
Google · Gemini 3.1
Google's current premium Gemini 3.1 preview model for multimodal reasoning, coding, long-context analysis, and agentic workflows.
Gemini 3.5 Flash
Google · Gemini 3.5
Google's stable Gemini 3.5 Flash model for fast frontier multimodal, coding, and long-horizon agent workflows.
Gemini Robotics-ER 1.6
Google · Gemini Robotics
Google DeepMind's robotics-tuned Gemini for embodied reasoning, spatial planning, and physical agent tasks.
GPT-4.1
OpenAI · GPT-4.1
Long-context multimodal model retained as a legacy reference after retirement from ChatGPT defaults.
GPT-4o
OpenAI · GPT-4o
Widely deployed multimodal model kept as a legacy reference after retirement from ChatGPT defaults.
GPT-4o mini
OpenAI · GPT-4o
Lower-cost GPT-4o API tier for high-volume text-plus-image assistant and automation workloads.
Grok 4.3
xAI · Grok
xAI's recommended primary Grok caller and post-retirement redirect target for reasoning, non-reasoning, coding, and long-context agent work.
Kimi K2.6
Moonshot AI · Kimi
Moonshot AI's Kimi K2.6, a 1T-parameter MoE open-weight model for long-horizon coding and agentic workflows.
Llama 4 Maverick
Meta · Llama 4
Meta's larger open-weight Llama 4 MoE model for multimodal assistants and controlled deployments.
Llama 4 Scout
Meta · Llama 4
Meta's efficiency-focused Llama 4 MoE model with a headline 10M-token context window.
Mistral Large 3
Mistral AI · Mistral Large
Mistral's flagship open-weight European multimodal model with long context and competitive enterprise API economics.
Mistral Medium 3.5
Mistral AI · Mistral Medium
Mistral's dense 128B Medium 3.5, a frontier-class multimodal model unifying chat, reasoning, and coding behavior.
Muse Spark
Meta · Muse
Meta's first Muse-family model for Meta AI, combining multimodal reasoning, tool use, and parallel-agent test-time thinking.
Qwen3.6-27B
Alibaba · Qwen3
Alibaba's open-weight 27B Qwen 3.6 model for agentic coding, vision-language work, and self-hosted deployment.
Image Models
GPT Image 1.5
OpenAI · GPT Image
OpenAI's previous premium GPT Image tier for higher-fidelity generation and iterative editing workflows.
GPT Image 2
OpenAI · GPT Image
OpenAI's current state-of-the-art GPT Image model behind ChatGPT Images 2.0 and API image generation.
GPT-Image-1
OpenAI · GPT Image
OpenAI image generation model for prompt-driven creation and iterative editing workflows.
gpt-image-1-mini
OpenAI · GPT Image
Lower-cost GPT Image tier for product teams that need image generation at higher volume.
grok-imagine-image
xAI · Grok Imagine
xAI's standard Grok Imagine image model for API image generation and editing workflows.
grok-imagine-image-1212
xAI · Grok Imagine
Older Grok Imagine image-generation model ID retained as a legacy reference for pre-refresh integrations.
grok-imagine-image-quality
xAI · Grok Imagine
xAI's Grok Imagine Quality Mode image model for higher realism, stronger text rendering, and brand-controlled visuals.
Imagen 4
Google · Imagen
Google's current Imagen 4 image generation tier for API-backed visual creation and design workflows.
Imagen 4 Fast
Google · Imagen
Google's lower-latency Imagen 4 tier for faster image generation in Gemini API workflows.
Nano Banana 2
Google · Nano Banana
Google's latest fast image-generation release combining higher fidelity, stronger reasoning, and Flash-speed iteration.
Nano Banana Pro
Google · Nano Banana
Google's Gemini 3 Pro Image preview model for professional visual assets, grounded image generation, and high-fidelity text rendering.
Video Models
Gemini Omni Flash
Google · Gemini Omni
Google's first Gemini Omni model for multimodal video creation and conversational video editing.
grok-imagine-video
xAI · Grok Imagine
xAI's Grok Imagine video model for text-to-video, image-to-video, video editing, and short creative clips.
grok-imagine-video-1212
xAI · Grok Imagine
Older Grok Imagine video-generation model ID retained as a legacy reference for pre-refresh integrations.
Sora 2
OpenAI · Sora
OpenAI's current Sora generation for cinematic text/image-to-video creation in product and API workflows.
Sora 2 Pro
OpenAI · Sora
OpenAI's premium Sora tier for higher-fidelity synced-audio video generation and tougher creative shots.
Veo 3
Google · Veo
Google's current Veo video generation model used in Flow and related creative AI workflows.
Veo 3.1
Google · Veo
Google's latest Veo preview tier for higher-end video generation with native audio and stronger reference control.
Audio Models
Eleven v3
ElevenLabs · Eleven
ElevenLabs' generally available expressive text-to-speech model for premium voice and dialogue output.
Gemini 2.5 Pro TTS Preview
Google · Gemini 2.5
Google's 2.5 Pro TTS preview tier for natural, steerable one-way speech generation.
Gemini 3.1 Flash TTS Preview
Google · Gemini 3.1
Google's expressive Gemini 3.1 Flash TTS preview, an API-first text-to-speech model spanning 70+ languages.
GPT-4o mini Transcribe
OpenAI · GPT-4o Audio
Lower-cost OpenAI speech-to-text tier for high-volume transcription pipelines.
GPT-4o mini TTS
OpenAI · GPT-4o Audio
OpenAI text-to-speech model for responsive, API-first voice output workflows.
GPT-4o Transcribe
OpenAI · GPT-4o Audio
OpenAI speech-to-text model tier for production transcription and voice pipeline workflows.
GPT-realtime-1.5
OpenAI · GPT Realtime
OpenAI's earlier realtime voice model for audio-in, audio-out agents, now superseded by GPT-Realtime-2 for new flagship voice work.
GPT-Realtime-2
OpenAI · GPT Realtime
OpenAI's GPT-5-class realtime voice model for reasoning, tool-using speech agents, and live support workflows.
GPT-Realtime-Translate
OpenAI · GPT Realtime
OpenAI's realtime speech-to-speech translation model for live multilingual audio experiences.
GPT-Realtime-Whisper
OpenAI · GPT Realtime
OpenAI's streaming speech-to-text model for low-latency realtime transcription.
Lyria 2
Google · Lyria
Earlier Google music generation route retained as reference while newer Lyria 3 surfaces take the spotlight.