At this time, we’re introducing Gemini 3.1 Flash-Lite, our quickest and most cost-efficient Gemini 3 collection mannequin. Constructed for high-volume developer workloads at scale, 3.1 Flash-Lite delivers prime quality for its worth and mannequin tier.
Beginning at this time, 3.1 Flash-Lite is rolling out in preview to builders through the Gemini API in Google AI Studio and for enterprises through Vertex AI.
Value-efficiency with out compromise
Priced at simply $0.25/1M enter tokens and $1.50/1M output tokens, 3.1 Flash-Lite delivers enhanced efficiency at a fraction of the price of bigger fashions. It outperforms 2.5 Flash with a 2.5X quicker Time to First Reply Token and 45% enhance in output pace, in accordance with the Synthetic Evaluation benchmark whereas sustaining comparable or higher high quality. This low latency is required for high-frequency workflows, making it an excellent mannequin for builders to construct responsive, real-time experiences.









