Try Gemini 2.5 Flash Lite Preview in the Workbench
Run this model interactively, tune parameters, and compare outputs.
gemini-2-5-flash-lite-preview-09-2025
Gemini 2.5 Flash Lite Preview is a multimodal LLM designed for cost-efficient, high-volume, and latency-sensitive tasks. It excels in rapid processing of large contexts—such as entire codebases or extensive document collections—at a significantly reduced cost compared to other models, while maintaining strong performance in coding, math, science, logic, and high-throughput enterprise operations.
Some other noteworthy features of Gemini 2.5 Flash Lite Preview include native support for text, code, image, audio, and video inputs, and a 1 million-token context window, making it suitable for tasks like translation, classification, intelligent routing, and real-time multimodal analysis.
| Metric | Value |
|---|---|
| Parameter Count | Unknown |
| Mixture of Experts | Unknown |
| Context Length | 1,000,000 tokens |
| Multilingual | Yes |
| Quantized* | Yes |
| Precision* | FP8 |
Example request
- Minimal
- Basic parameters
- All parameters
Fetch model details
The models endpoint returns the full model object, including itsjson_request_schema.