Gemini 3.5 Flash
Google · Gemini 3.5
Google's stable Gemini 3.5 Flash model for fast frontier multimodal, coding, and long-horizon agent workflows.
Overview
Freshness note: Model capabilities, limits, and pricing can change quickly. This profile is a point-in-time snapshot last verified on May 24, 2026.
Gemini 3.5 Flash is Google’s stable fast frontier model for the agentic Gemini 3.5 generation. Google launched the model at I/O 2026 on May 19, positioning it as the first model in the 3.5 family and the current high-speed route for complex coding, multimodal reasoning, and long-horizon agent workflows.
The practical change is that Flash is no longer just the cheaper Gemini lane. Gemini 3.5 Flash is now Google’s main stable fast model for sustained work: planning, coding, tool use, multimodal understanding, and sub-agent style execution through Google Antigravity and the Gemini API.
Capabilities
Gemini 3.5 Flash supports text, image, video, audio, and PDF inputs with text output. Google’s model page lists support for function calling, structured outputs, search grounding, Google Maps grounding, URL context, file search, code execution, batch, caching, flex inference, and priority inference.
Google’s launch framing emphasizes agentic coding and long-horizon tasks. The model is designed for rapid agent loops where the system needs to inspect context, reason, use tools, update work, and keep moving without escalating immediately to a slower Pro-tier route. It is also the model behind new Gemini app and Search agentic experiences such as Gemini Spark.
Technical Details
Current public API limits:
- Model ID: gemini-3.5-flash
- Input token limit: 1,048,576
- Output token limit: 65,536
- Input types: text, image, video, audio, and PDF
- Output type: text
- Stable version: gemini-3.5-flash
- Latest update: May 2026
- Published knowledge cutoff: January 2025
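As a rough illustration of how the token limits listed above might be enforced before a request is sent, the following minimal Python sketch checks a planned request against them. The constants come from the list above; the function name `check_budget`, the `BudgetCheck` type, and the idea of a client-side pre-check are hypothetical and not part of any Google SDK. Token counts are assumed to come from whatever tokenizer or counting endpoint the caller already uses.

```python
from dataclasses import dataclass

# Limits copied from the list above (point-in-time values, subject to change).
INPUT_TOKEN_LIMIT = 1_048_576
OUTPUT_TOKEN_LIMIT = 65_536


@dataclass(frozen=True)
class BudgetCheck:
    """Result of a hypothetical client-side limit check."""
    fits: bool
    input_headroom: int   # tokens left under the input limit (negative if over)
    output_headroom: int  # tokens left under the output limit (negative if over)


def check_budget(input_tokens: int, max_output_tokens: int) -> BudgetCheck:
    """Compare a planned request against the published token limits."""
    if input_tokens < 0 or max_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_headroom = INPUT_TOKEN_LIMIT - input_tokens
    output_headroom = OUTPUT_TOKEN_LIMIT - max_output_tokens
    fits = input_headroom >= 0 and output_headroom >= 0
    return BudgetCheck(fits, input_headroom, output_headroom)


if __name__ == "__main__":
    # Example: a 1,000,000-token prompt with an 8,192-token output cap.
    print(check_budget(1_000_000, 8_192))
```

A check like this only guards against the published ceilings; it says nothing about latency or cost at very long context lengths.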
Computer use and Live API are not listed as supported on this model page. For browser-control work, use Google’s Computer Use or Antigravity Agent surfaces where appropriate.
Pricing & Access
Google’s current standard paid Gemini API pricing for Gemini 3.5 Flash is:
- Input: $1.50 per 1M tokens
- Output, including thinking tokens: $9.00 per 1M tokens
Context caching, batch, flex, and priority pricing are also available with separate rates. Access is through the Gemini API, Google AI Studio, Android Studio, Google Antigravity, Gemini Enterprise Agent Platform, Gemini Enterprise, the Gemini app, and AI Mode in Search where available.
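To make the standard rates above concrete, here is a small Python sketch that estimates the cost of a single request. It uses only the two standard paid rates quoted above and ignores caching, batch, flex, and priority pricing. The function name `estimate_cost_usd` and the separate `thinking_tokens` parameter are illustrative assumptions; if your usage report already folds thinking tokens into the output count, leave that parameter at zero.

```python
# Standard paid rates quoted above, in USD per 1M tokens (point-in-time values).
INPUT_USD_PER_MILLION = 1.50
OUTPUT_USD_PER_MILLION = 9.00


def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    thinking_tokens: int = 0,
) -> float:
    """Estimate the standard-rate cost of one request in USD.

    Thinking tokens are billed at the output rate, per the pricing above.
    """
    if min(input_tokens, output_tokens, thinking_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    billed_output = output_tokens + thinking_tokens
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + billed_output * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


if __name__ == "__main__":
    # 200,000 input tokens, 4,000 visible output tokens, 6,000 thinking tokens:
    # 200,000 * 1.50 / 1M = 0.30 and 10,000 * 9.00 / 1M = 0.09.
    print(f"${estimate_cost_usd(200_000, 4_000, 6_000):.2f}")  # prints "$0.39"
```

Because output is priced at six times the input rate, long thinking traces can dominate the bill even when the prompt is much larger than the visible response.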
Best Use Cases
Choose Gemini 3.5 Flash for agentic coding, document-heavy app workflows, multimodal assistants, extraction pipelines, UI generation loops, and production systems that need strong reasoning without Pro-tier latency or cost.
It is less appropriate for specialized browser control, realtime voice agents, image generation, or video generation. Google has separate Gemini Live, Computer Use, Nano Banana, Veo, and Gemini Omni routes for those workloads.
Comparisons
- Gemini 3 Flash (Google): Older preview fast route; Gemini 3.5 Flash is the newer stable model.
- Gemini 3.1 Pro Preview (Google): Higher-cost preview Pro lane for harder reasoning and multimodal tasks.
- Gemini 2.5 Flash (Google): Stable older Flash route that may remain useful for compatibility and cost-sensitive deployments.
- GPT-5.5 (OpenAI): Strong OpenAI frontier alternative, especially for Codex and ChatGPT-native agentic workflows.