In the Gemini API and Vertex AI, Gemini 3 Flash is priced at $0.50/1M input tokens and $3/1M output tokens (audio input remains at $1/1M input tokens). It comes standard with context caching, allowing for 90% cost reductions in cases with repeated token use over certain thresholds. Similarly, 3 Flash is also available today with the Batch API, allowing for 50% cost savings and much higher rate limits for asynchronous processing. For synchronous and near real-time use cases, Paid API customers also have access to production-ready rate limits.
Gemini 3 Flash in action
Gemini 3 Flash is now integrated into many of our products and early customers are enthusiastic about it. With each new Flash model, it’s exciting to see what new use cases are created.
For coding
Gemini 3 Flash has even better coding and agent capabilities over previous versions, enabling rapid, iterative development, outperforming 3 Pro’s agentic coding skill (78% on SWE-bench Verified) but operating faster for quick iterations. Today, 3 Flash is rolling out to users in Google Antigravity, our new agentic development platform, to provide intelligent coding assistance that keeps pace with your train of thought.