Gemini 2.5 Flash Live Preview

Overview

Freshness note: Model capabilities, limits, and pricing can change quickly. This profile is a point-in-time snapshot last verified on May 16, 2026.

Gemini 2.5 Flash Live Preview is Google’s native-audio Live API route for realtime voice and multimodal agent experiences. It remains listed in the current docs alongside the newer Gemini 3.1 Flash Live Preview, so teams should evaluate it as the established 2.5-generation live route (still a preview, not a general-availability release) rather than as the newest Google voice model.

Capabilities

This model is built for conversational interfaces where low-latency turn-taking, natural voice behavior, and multimodal context all matter together. Google’s pricing docs emphasize improved pacing, voice naturalness, verbosity control, and mood, which makes it a better fit for voice agents and guided realtime experiences than a text-first Flash route.

It also supports function calling and search grounding, which means the Live API can act more like an operational voice agent and less like a read-only speech demo.
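As a rough illustration of what "operational" means here, the sketch below shows the application-side half of function calling: mapping a tool name and arguments requested by the model onto local code and returning a structured result. It deliberately does not show the Live API's message format or any SDK call. The tool name lookup_order_status, its return shape, and the handle_tool_call helper are all hypothetical.

```python
from typing import Any, Callable, Dict


def lookup_order_status(order_id: str) -> Dict[str, str]:
    """Hypothetical business function; a real one would query a backend."""
    return {"order_id": order_id, "status": "shipped"}


# Registry of tools the voice agent is allowed to invoke (names are illustrative).
TOOLS: Dict[str, Callable[..., Any]] = {
    "lookup_order_status": lookup_order_status,
}


def handle_tool_call(name: str, args: Dict[str, Any]) -> Dict[str, Any]:
    """Run a requested tool and wrap the outcome in a result or error payload."""
    handler = TOOLS.get(name)
    if handler is None:
        return {"error": f"unknown tool: {name}"}
    try:
        return {"result": handler(**args)}
    except TypeError as exc:
        # Raised when the supplied arguments do not match the tool's signature.
        return {"error": f"bad arguments for {name}: {exc}"}


if __name__ == "__main__":
    print(handle_tool_call("lookup_order_status", {"order_id": "A1"}))
    print(handle_tool_call("cancel_order", {"order_id": "A1"}))
```

Returning errors as data rather than raising keeps the voice session alive, so the model can apologize or ask a follow-up question instead of the call dropping.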

Technical Details

Google’s current model docs still list Gemini 2.5 Flash Live with:

Model code: gemini-2.5-flash-native-audio-preview-12-2025
Input token limit: 131,072
Output token limit: 8,192
Inputs: audio, video, and text
Outputs: audio and text
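For teams that track these limits in code, a minimal sketch of the profile above as a typed record might look like the following. The LiveModelProfile class and its helper methods are illustrative names, not part of any Google SDK; the values are copied from the list above and should be re-verified against the current docs.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LiveModelProfile:
    """Point-in-time snapshot of a live model's documented limits."""
    model_code: str
    input_token_limit: int
    output_token_limit: int
    inputs: Tuple[str, ...]
    outputs: Tuple[str, ...]

    def fits_input_window(self, context_tokens: int) -> bool:
        """True if the given context size is within the input token limit."""
        return context_tokens <= self.input_token_limit

    def supports(self, input_modality: str, output_modality: str) -> bool:
        """True if both the input and output modality are listed for the model."""
        return input_modality in self.inputs and output_modality in self.outputs


GEMINI_25_FLASH_LIVE = LiveModelProfile(
    model_code="gemini-2.5-flash-native-audio-preview-12-2025",
    input_token_limit=131_072,
    output_token_limit=8_192,
    inputs=("audio", "video", "text"),
    outputs=("audio", "text"),
)

if __name__ == "__main__":
    print(GEMINI_25_FLASH_LIVE.fits_input_window(100_000))
    print(GEMINI_25_FLASH_LIVE.supports("video", "audio"))
    print(GEMINI_25_FLASH_LIVE.supports("text", "video"))
```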

The same docs also show the current preview replacing earlier live model IDs, which is a useful signal that Google is consolidating around this native-audio route rather than older live variants.

Pricing & Access

Google’s current pricing docs list paid-tier pricing at:

Text input: $0.50 per 1M tokens
Audio or video input: $3.00 per 1M tokens
Text output: $2.00 per 1M tokens
Audio output: $12.00 per 1M tokens

Signal Lens stores the text input and text output prices in frontmatter for baseline comparability, but real deployment cost will depend heavily on audio traffic: at the listed rates, audio tokens cost six times as much as text tokens on both the input and the output side. Availability is through the Gemini Live API preview surface.
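To make the effect of audio traffic concrete, here is a small cost sketch using the paid-tier rates listed above. The token counts in the example session are invented for illustration and are not measurements; how many tokens a second of audio consumes is not covered in this profile and should be taken from Google’s docs.

```python
from typing import Dict, Tuple

# USD per 1M tokens, keyed by (direction, modality), from the pricing list above.
PRICE_PER_MILLION_USD: Dict[Tuple[str, str], float] = {
    ("input", "text"): 0.50,
    ("input", "audio"): 3.00,
    ("input", "video"): 3.00,
    ("output", "text"): 2.00,
    ("output", "audio"): 12.00,
}


def estimate_cost_usd(usage: Dict[Tuple[str, str], int]) -> float:
    """Sum the token cost across every (direction, modality) bucket in usage."""
    return sum(
        tokens / 1_000_000 * PRICE_PER_MILLION_USD[bucket]
        for bucket, tokens in usage.items()
    )


# Hypothetical session: a short text prompt plus spoken input and spoken output.
example_usage = {
    ("input", "text"): 2_000,
    ("input", "audio"): 50_000,
    ("output", "audio"): 30_000,
}

if __name__ == "__main__":
    total = estimate_cost_usd(example_usage)
    print(f"Estimated session cost: ${total:.3f}")  # prints $0.511
```

In this made-up session, the text prompt contributes $0.001 of the $0.511 total, which is why comparing models on text prices alone understates what a voice deployment will cost.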

Best Use Cases

Use Gemini 2.5 Flash Live Preview for realtime tutoring, guided product walkthroughs, conversational support flows, or voice-first assistants that need to react to speech and visual context together. Test Gemini 3.1 Flash Live Preview when access to the newest preview model matters more than compatibility with an existing 2.5-based integration.

Comparisons

Gemini 2.5 Pro TTS Preview (Google): Better for one-way high-quality speech generation, while Flash Live is built for two-way realtime interaction.
GPT Realtime-style flows: Similar broad category, with platform choice usually driven by stack alignment and tooling preferences.
ElevenLabs conversational agents: Stronger productized voice platform, while Gemini Flash Live is the model-layer route inside Google’s ecosystem.