Gemini 3.1 Flash-Lite
Google · Gemini 3.1
Google's newer low-cost Gemini preview tier for high-throughput multimodal assistant and automation workloads.
Overview
Freshness note: Model capabilities, limits, and pricing can change quickly. This profile is a point-in-time snapshot last verified on April 4, 2026.
Gemini 3.1 Flash-Lite is a low-cost Flash preview model in Google’s Gemini API catalog. It carries the Flash-Lite tier into the Gemini 3.x generation, for teams that want Google’s newest low-cost option while keeping a stable 2.5 route available as a production fallback.
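The preview-plus-stable-fallback arrangement can be sketched as a small routing helper. This is a minimal illustration, not Google's API: `call_model` is a hypothetical caller-supplied function, and both model identifiers are placeholders, not real model IDs.

```python
from typing import Callable

# Placeholder identifiers (assumptions); substitute the IDs your model catalog lists.
PREVIEW_MODEL = "flash-lite-preview-placeholder"
STABLE_MODEL = "flash-lite-stable-placeholder"


def generate_with_fallback(
    call_model: Callable[[str, str], str],
    prompt: str,
    primary: str = PREVIEW_MODEL,
    fallback: str = STABLE_MODEL,
) -> tuple[str, str]:
    """Try the preview model first; on any error, retry once on the stable model.

    call_model is a hypothetical function (model_id, prompt) -> text.
    Returns (model_used, text). An error from the fallback call propagates.
    """
    try:
        return primary, call_model(primary, prompt)
    except Exception:
        # The preview route failed for this request; use the stable route instead.
        return fallback, call_model(fallback, prompt)
```

Returning the model that actually served the request makes it easy to log how often the fallback is used. A production version would likely catch narrower exception types than `Exception`.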
Capabilities
This model is aimed at classification, extraction, concise summarization, translation, and other high-throughput assistant or automation tasks where budget and response speed matter more than frontier-level reasoning. It suits systems that must process large volumes of requests reliably, not ones where every answer needs the highest available quality.
Technical Details
Google’s current Gemini API model catalog lists Gemini 3.1 Flash-Lite with:
- 1,048,576 token context window
- 65,536 max output tokens
- multimodal input support across text, images, audio, video, and files
Like the rest of the current Flash line, it is positioned for agentic and multimodal workloads, not as a text-only budget SKU.
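The two token limits listed above can be turned into a pre-flight check. The sketch below assumes the input is checked against the context window and the requested output against the output cap independently; this profile does not say whether output tokens also count toward the context window. Token counts are assumed to come from the provider's own token counter, which is not shown here.

```python
CONTEXT_WINDOW_TOKENS = 1_048_576  # context window from the list above
MAX_OUTPUT_TOKENS = 65_536         # max output tokens from the list above


def check_request(input_tokens: int, requested_output_tokens: int) -> list[str]:
    """Return a list of limit violations; an empty list means the request fits."""
    problems = []
    if input_tokens > CONTEXT_WINDOW_TOKENS:
        over = input_tokens - CONTEXT_WINDOW_TOKENS
        problems.append(f"input exceeds context window by {over} tokens")
    if requested_output_tokens > MAX_OUTPUT_TOKENS:
        over = requested_output_tokens - MAX_OUTPUT_TOKENS
        problems.append(f"requested output exceeds cap by {over} tokens")
    return problems
```

Running a check like this before dispatch lets a batch job split or truncate oversized items instead of failing mid-run.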
Pricing & Access
Google’s current Gemini API pricing lists Gemini 3.1 Flash-Lite at:
- Input: $0.10 per 1M text, image, or video tokens
- Output: $0.40 per 1M tokens
Audio input is priced separately at a higher rate. Access is through the Gemini API and related Google developer surfaces where preview models are enabled.
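For budgeting, the listed rates reduce to simple arithmetic. The sketch below uses only the two prices above. Because this profile gives no audio rate, the hypothetical `audio_usd_per_million` parameter has no default and must be supplied by the caller.

```python
from typing import Optional

INPUT_USD_PER_MILLION = 0.10   # text, image, or video input tokens
OUTPUT_USD_PER_MILLION = 0.40  # output tokens


def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    audio_tokens: int = 0,
    audio_usd_per_million: Optional[float] = None,
) -> float:
    """Estimate request cost in USD from token counts at the listed rates."""
    if audio_tokens and audio_usd_per_million is None:
        # The audio rate is not stated in this profile, so it is never guessed.
        raise ValueError("pass audio_usd_per_million explicitly for audio input")
    cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    cost += output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    if audio_tokens:
        cost += audio_tokens / 1_000_000 * audio_usd_per_million
    return cost
```

As a hypothetical workload, 10,000 tickets averaging 800 input and 50 output tokens each come to 8M input tokens ($0.80) plus 0.5M output tokens ($0.20), or about $1.00 in total.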
Best Use Cases
Use Gemini 3.1 Flash-Lite for ticket triage, structured extraction, batch classification, internal tooling, and cost-sensitive assistants where multimodal support still matters. It is a practical option when Gemini 3 Flash offers more capability than the task requires.
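In triage and extraction pipelines, a low-cost model's reply is usually validated before anything downstream trusts it. The sketch below assumes the model was asked to answer with a JSON object holding `label` and `summary` fields. The label set and field names are hypothetical, and no model call is shown.

```python
import json

# Hypothetical label set for a ticket-triage task.
ALLOWED_LABELS = {"billing", "bug", "feature_request", "other"}


def parse_triage_reply(reply_text: str) -> dict:
    """Validate a reply expected to look like {"label": "...", "summary": "..."}.

    Raises ValueError if the reply is not valid JSON or fails a check.
    """
    try:
        data = json.loads(reply_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    label = data.get("label")
    if not isinstance(label, str) or label not in ALLOWED_LABELS:
        raise ValueError(f"unexpected label: {label!r}")
    summary = data.get("summary")
    if not isinstance(summary, str) or not summary.strip():
        raise ValueError("summary must be a non-empty string")
    return {"label": label, "summary": summary.strip()}
```

Rejected replies can be retried or routed to a stronger model, which keeps the cheap tier on the common path without letting malformed output through.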
Comparisons
- Gemini 3 Flash (Google): Higher quality ceiling for harder tasks; Flash-Lite is the cheaper preview route.
- Gemini 2.5 Flash-Lite (Google): Stable production alternative with a similar cost profile.
- GPT-5 nano (OpenAI): Another high-volume automation model, with the choice often driven by ecosystem and modality requirements.