Gemini 2.5 Flash-Lite

gemini-2.5-flash-lite
Google · chat · api · seen 10m ago
GA
Context
1M
Max output
65.5K
Input $/1M
$0.10
Output $/1M
$0.40
Modalities
text
Released
22 Jul 2025
AI summary
● machine-written

Google releases Gemini 2.5 Flash-Lite, a low-cost model with text output and a 1M-token context window

Gemini 2.5 Flash-Lite is Google's most cost-effective model in the Gemini 2.5 family, designed for high-frequency, simple tasks. It accepts multimodal input and returns text. It targets basic classification, data extraction, and ultra-low-latency applications where budget and speed are the primary constraints. The model supports a 1M-token context window and generates up to 65.5K tokens per response.

What's new
  • Released July 22, 2025
  • Pricing: $0.10 per million input tokens, $0.40 per million output tokens
  • Supports text, image, video, audio, and PDF inputs
  • 1M token context window with 65.5K token output limit
  • Caching, code execution, and function calling enabled
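As a rough illustration of what the listed prices and limits mean in practice, the sketch below estimates the cost of a workload and checks a request against the listed limits. The function names are hypothetical, the constants are copied from this listing (the rounded "1M" and "65.5K" figures are used as-is, so the exact limits may differ), and it assumes flat per-token billing with no caching discounts or other pricing tiers.

```python
# Figures copied from the listing above; treat them as assumptions, not a billing spec.
INPUT_USD_PER_MILLION = 0.10
OUTPUT_USD_PER_MILLION = 0.40
CONTEXT_WINDOW_TOKENS = 1_000_000  # listed as "1M" (rounded)
MAX_OUTPUT_TOKENS = 65_500         # listed as "65.5K" (rounded)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate cost in USD, assuming flat per-token pricing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    ) / 1_000_000


def fits_listed_limits(input_tokens: int, output_tokens: int) -> bool:
    """Check a single request against the listed limits.

    Assumes input is bounded by the context window and output by the
    max-output figure; how the provider actually counts tokens may differ.
    """
    return (
        0 <= input_tokens <= CONTEXT_WINDOW_TOKENS
        and 0 <= output_tokens <= MAX_OUTPUT_TOKENS
    )


if __name__ == "__main__":
    # Hypothetical workload: 10,000 classification calls,
    # each with 2,000 input tokens and 100 output tokens.
    requests = 10_000
    total = estimate_cost_usd(requests * 2_000, requests * 100)
    print(f"Estimated cost: ${total:.2f}")
    print("Single request fits limits:", fits_listed_limits(2_000, 100))
```

For that hypothetical workload, 20M input tokens cost about $2.00 and 1M output tokens about $0.40, or roughly $2.40 in total.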
Best for
  • High-volume classification tasks
  • Simple data extraction
  • Ultra-low-latency applications with tight budgets
  • High-frequency lightweight inference
Sources

Source: https://ai.google.dev/gemini-api/docs/models/gemini-2.5-flash-lite