Optimized for rapid, high-volume multimodal tasks with a 1M-token context window, delivering strong reasoning and cost efficiency for enterprise workflows.
Use this file to discover all available pages before exploring further.
Try Gemini 2.5 Flash Lite Preview in the Workbench
Run this model interactively, tune parameters, and compare outputs.
Model ID:gemini-2-5-flash-lite-preview-09-2025Gemini 2.5 Flash Lite Preview is a multimodal LLM designed for cost-efficient, high-volume, and latency-sensitive tasks. It excels in rapid processing of large contexts—such as entire codebases or extensive document collections—at a significantly reduced cost compared to other models, while maintaining strong performance in coding, math, science, logic, and high-throughput enterprise operations.Some other noteworthy features of Gemini 2.5 Flash Lite Preview include native support for text, code, image, audio, and video inputs, and a 1 million-token context window, making it suitable for tasks like translation, classification, intelligent routing, and real-time multimodal analysis.
Metric
Value
Parameter Count
Unknown
Mixture of Experts
Unknown
Context Length
1,000,000 tokens
Multilingual
Yes
Quantized*
Yes
Precision*
FP8
*Quantization is specific to the inference provider and the model may be offered with different quantization levels by other providers.