Gemini Omni Flash
Google · Gemini Omni
Google's first Gemini Omni model for multimodal video creation and conversational video editing.
Overview
Freshness note: Video-model capabilities, rollout timing, and pricing can change quickly. This profile is a point-in-time snapshot last verified on May 24, 2026.
Gemini Omni Flash is the first model in Google’s Gemini Omni family. Google introduced it at I/O 2026 as a model that can create video from mixed inputs: text, images, video, and audio. The main product idea is to combine Gemini reasoning with creative video generation and editing, so users can start from existing references and refine the output through conversation.
At launch, Gemini Omni Flash is rolling out through the Gemini app, Google Flow, YouTube Shorts, and the YouTube Create app. Google says developer and enterprise API access will follow in the coming weeks, so this Signal Lens entry sets apiAccess: false until public API details are available.
Capabilities
Gemini Omni Flash is designed for video creation and video editing rather than text chat. Google highlights several workflow patterns:
- conversational video editing where each instruction builds on previous edits
- reference-driven generation from images, text, video, and audio
- physics-aware and knowledge-grounded scene generation
- short explainers and visual transformations from compact prompts
- support for user-owned avatar-style video generation in controlled product surfaces
Google says the Omni family starts with video and will add other output modalities such as image and audio over time.
Technical Details
This is a video-native model, so the token-based context and output fields do not apply. Signal Lens stores them as 0; treat those values as N/A in model comparisons, not as a zero-token limit.
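A minimal sketch of how a comparison script might apply that convention. The field names (`modality`, `context_window`, `max_output_tokens`) are assumptions made for illustration and are not taken from the actual Signal Lens schema.

```python
from typing import Optional

# Hypothetical field names; the real Signal Lens schema may differ.
TOKEN_FIELDS = ("context_window", "max_output_tokens")


def normalize_token_fields(entry: dict) -> dict:
    """Return a copy where 0-valued token fields on video models become None (N/A)."""
    normalized = dict(entry)
    if entry.get("modality") == "video":
        for field in TOKEN_FIELDS:
            if normalized.get(field) == 0:
                normalized[field] = None
    return normalized


def format_tokens(value: Optional[int]) -> str:
    """Render a token count for a comparison table, using N/A for missing values."""
    return "N/A" if value is None else f"{value:,}"


omni = normalize_token_fields(
    {
        "name": "Gemini Omni Flash",
        "modality": "video",
        "context_window": 0,
        "max_output_tokens": 0,
    }
)
print(format_tokens(omni["context_window"]))
```

Converting the stored 0 to None before sorting or ranking keeps a video model from appearing to have the smallest context window in a table.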
Current public anchors:
- First model in the Gemini Omni family
- Current model name: Gemini Omni Flash
- Initial output mode: video
- Input references: text, images, video, and audio
- Product rollout: Gemini app, Google Flow, YouTube Shorts, YouTube Create
- API rollout: announced for the coming weeks, not yet public
- SynthID watermarking applies to generated videos
Pricing & Access
Google has not published Gemini Omni Flash API pricing in the public sources checked for this profile. Current access is through supported Gemini and Google creative-product surfaces:
- Gemini app for supported Google AI Plus, Pro, and Ultra subscribers
- Google Flow
- YouTube Shorts
- YouTube Create app
Because API access and pricing are not yet public, production developers should not design around Gemini Omni Flash until Google publishes model IDs, limits, pricing, and availability terms.
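One way to make that gate explicit in a planning script is a small checklist with one flag per item Google still has to publish. The class and field names below are hypothetical and exist only for this sketch.

```python
from dataclasses import dataclass


@dataclass
class ApiReadiness:
    """Hypothetical checklist: one flag per item the vendor must publish."""

    model_ids_published: bool = False
    limits_published: bool = False
    pricing_published: bool = False
    availability_terms_published: bool = False

    def missing(self) -> list[str]:
        """Names of the checklist items that are still unpublished."""
        return [name for name, done in vars(self).items() if not done]

    def ready(self) -> bool:
        """True only when every checklist item has been published."""
        return not self.missing()


# As of this snapshot, none of the four items are public for Omni Flash.
omni_flash = ApiReadiness()
if not omni_flash.ready():
    print("Hold production work; still missing:", ", ".join(omni_flash.missing()))
```

Flipping each flag as the corresponding detail is published gives a simple, auditable signal for when production design work can start.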
Best Use Cases
Use Gemini Omni Flash for early creative exploration, video transformations, social clip ideation, reference-driven visual drafts, and explainers where conversational editing is more useful than one-shot video generation.
For developer API work today, compare against Veo 3.1, Sora 2, or grok-imagine-video, all of which have clearer current API paths in this repo.
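As a rough sketch, a shortlist for API work can be produced by filtering on the apiAccess flag this entry uses. The records below are illustrative only: their values mirror how this profile describes each model and are not read from the real catalog.

```python
# Illustrative records only; apiAccess values reflect this profile's wording,
# not data loaded from the actual repo.
CATALOG = [
    {"name": "Gemini Omni Flash", "apiAccess": False},
    {"name": "Veo 3.1", "apiAccess": True},
    {"name": "Sora 2", "apiAccess": True},
    {"name": "grok-imagine-video", "apiAccess": True},
]


def api_ready(models):
    """Keep only models whose entry is flagged as having public API access."""
    return [m["name"] for m in models if m.get("apiAccess") is True]


candidates = api_ready(CATALOG)
print(candidates)
```

Entries with a missing flag are excluded as well, which errs on the side of not shortlisting a model whose API status is unknown.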
Comparisons
- Veo 3.1 (Google): Current Google API video route with published preview pricing; Omni Flash is the newer conversational multimodal creation lane.
- Nano Banana Pro (Google): Gemini-native image generation/editing; Omni Flash focuses first on video.
- Sora 2 (OpenAI): OpenAI video generation route with a different product ecosystem and API posture.
- Grok Imagine Video (xAI): xAI API video route with published per-second pricing.