# Gemini 2.5 Flash Lite Preview

> Optimized for rapid, high-volume multimodal tasks with a 1M-token context window, delivering strong reasoning and cost efficiency for enterprise workflows.

<CardGroup cols={1}>
  <Card title="Try Gemini 2.5 Flash Lite Preview in the Workbench" icon="flask" href="https://www.oxen.ai/ai/workbench?model=gemini-2-5-flash-lite-preview-09-2025">
    Run this model interactively, tune parameters, and compare outputs.
  </Card>
</CardGroup>

**Model ID:** `gemini-2-5-flash-lite-preview-09-2025`

Gemini 2.5 Flash Lite Preview is a multimodal LLM designed for cost-efficient, high-volume, latency-sensitive tasks. It processes large contexts, such as entire codebases or extensive document collections, quickly and at a significantly lower cost than other models, while maintaining strong performance on coding, math, science, and logic tasks and in high-throughput enterprise operations.

The model natively accepts text, code, image, audio, and video inputs and offers a 1 million-token context window, which makes it suitable for tasks such as translation, classification, intelligent routing, and real-time multimodal analysis.

| Metric             | Value            |
| ------------------ | ---------------- |
| Parameter Count    | Unknown          |
| Mixture of Experts | Unknown          |
| Context Length     | 1,000,000 tokens |
| Multilingual       | Yes              |
| Quantized\*        | Yes              |
| Precision\*        | FP8              |

\**Quantization is specific to the inference provider and the model may be offered with different quantization levels by other providers.*

## Example request

<Tip>
  Use the [Workbench](https://www.oxen.ai/ai/workbench?model=gemini-2-5-flash-lite-preview-09-2025) as a request builder: configure parameters for this model in the UI, then open the **API** tab to copy the exact cURL or Python call.
</Tip>

<Tabs>
  <Tab title="Minimal">
    <CodeGroup>
      ```bash cURL theme={null}
      curl -X POST https://hub.oxen.ai/api/ai/chat/completions \
        -H "Content-Type: application/json" \
        -H "Authorization: Bearer $OXEN_API_KEY" \
        -d '{
        "model": "gemini-2-5-flash-lite-preview-09-2025",
        "messages": [
          {
            "role": "user",
            "content": "Hello, what can you do?"
          }
        ]
      }'
      ```

      ```python Python theme={null}
      import os
      import requests

      response = requests.post(
          "https://hub.oxen.ai/api/ai/chat/completions",
          headers={
              "Content-Type": "application/json",
              "Authorization": f"Bearer {os.environ['OXEN_API_KEY']}",
          },
          json={
              "model": "gemini-2-5-flash-lite-preview-09-2025",
              "messages": [
                  {
                      "role": "user",
                      "content": "Hello, what can you do?"
                  }
              ]
          },
      )
      response.raise_for_status()
      print(response.json())
      ```
    </CodeGroup>
  </Tab>

  <Tab title="Basic parameters">
    <CodeGroup>
      ```bash cURL theme={null}
      curl -X POST https://hub.oxen.ai/api/ai/chat/completions \
        -H "Content-Type: application/json" \
        -H "Authorization: Bearer $OXEN_API_KEY" \
        -d '{
        "model": "gemini-2-5-flash-lite-preview-09-2025",
        "messages": [
          {
            "role": "user",
            "content": "Hello, what can you do?"
          }
        ],
        "temperature": 0.7,
        "max_tokens": 1024,
        "stream": false
      }'
      ```

      ```python Python theme={null}
      import os
      import requests

      response = requests.post(
          "https://hub.oxen.ai/api/ai/chat/completions",
          headers={
              "Content-Type": "application/json",
              "Authorization": f"Bearer {os.environ['OXEN_API_KEY']}",
          },
          json={
              "model": "gemini-2-5-flash-lite-preview-09-2025",
              "messages": [
                  {
                      "role": "user",
                      "content": "Hello, what can you do?"
                  }
              ],
              "temperature": 0.7,
              "max_tokens": 1024,
              "stream": False
          },
      )
      response.raise_for_status()
      print(response.json())
      ```
    </CodeGroup>
  </Tab>

  <Tab title="All parameters">
    <CodeGroup>
      ```bash cURL theme={null}
      curl -X POST https://hub.oxen.ai/api/ai/chat/completions \
        -H "Content-Type: application/json" \
        -H "Authorization: Bearer $OXEN_API_KEY" \
        -d '{
        "model": "gemini-2-5-flash-lite-preview-09-2025",
        "messages": [
          {
            "role": "user",
            "content": "Hello, what can you do?"
          }
        ],
        "temperature": 0.7,
        "max_tokens": 1024,
        "stream": false,
        "top_p": 1.0
      }'
      ```

      ```python Python theme={null}
      import os
      import requests

      response = requests.post(
          "https://hub.oxen.ai/api/ai/chat/completions",
          headers={
              "Content-Type": "application/json",
              "Authorization": f"Bearer {os.environ['OXEN_API_KEY']}",
          },
          json={
              "model": "gemini-2-5-flash-lite-preview-09-2025",
              "messages": [
                  {
                      "role": "user",
                      "content": "Hello, what can you do?"
                  }
              ],
              "temperature": 0.7,
              "max_tokens": 1024,
              "stream": False,
              "top_p": 1.0
          },
      )
      response.raise_for_status()
      print(response.json())
      ```
    </CodeGroup>
  </Tab>
</Tabs>
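
The Python examples above print the whole JSON body. The sketch below shows one way to pull out just the assistant text, assuming the response follows the OpenAI chat completions shape (`choices[0].message.content`). That shape is an assumption here, since this page only states that the request body follows the OpenAI format. `extract_reply` is a hypothetical helper, and `sample` is a hand-written stand-in rather than real model output.

```python
def extract_reply(body: dict) -> str:
    """Return the assistant text from an OpenAI-style chat completion body."""
    # Assumption: the reply lives at choices[0].message.content.
    choices = body.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    message = choices[0].get("message") or {}
    content = message.get("content")
    if content is None:
        raise ValueError("first choice has no message content")
    return content


# Hand-written stand-in for response.json(); not real output from the API.
sample = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello!"},
            "finish_reason": "stop",
        }
    ]
}

print(extract_reply(sample))
```

With a live call, `extract_reply(response.json())` would replace `print(response.json())`. If the helper raises, inspect the raw body to see which shape the endpoint actually returned.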

## Fetch model details

The [models endpoint](/inference-api/reference/models/overview) returns the full model object, including its `json_request_schema`.

```bash theme={null}
curl -H "Authorization: Bearer $OXEN_API_KEY" https://hub.oxen.ai/api/ai/models/gemini-2-5-flash-lite-preview-09-2025
```
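
The same lookup can be sketched in Python. This version uses only the standard library so it runs without extra dependencies; `requests` would work equally well. The helper names are hypothetical, and the exact location of `json_request_schema` in the response is an assumption: the code checks the top level first, then one level of nesting.

```python
import json
import urllib.request

BASE_URL = "https://hub.oxen.ai/api/ai/models"
MODEL_ID = "gemini-2-5-flash-lite-preview-09-2025"


def model_url(model_id: str) -> str:
    """Build the model details URL used in the cURL call above."""
    return f"{BASE_URL}/{model_id}"


def fetch_model(model_id: str, api_key: str) -> dict:
    """GET the model object and return the parsed JSON body."""
    request = urllib.request.Request(
        model_url(model_id),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def find_request_schema(payload: dict):
    """Return json_request_schema if present, else None."""
    # Assumption: the key sits at the top level or inside one wrapper object.
    if "json_request_schema" in payload:
        return payload["json_request_schema"]
    for value in payload.values():
        if isinstance(value, dict) and "json_request_schema" in value:
            return value["json_request_schema"]
    return None
```

A typical call would be `find_request_schema(fetch_model(MODEL_ID, os.environ["OXEN_API_KEY"]))` after `import os`. The returned schema lists the parameters the model accepts.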

## Request parameters

This model follows the standard OpenAI chat completions request body. See the [chat completions reference](../inference-api.mdx) for the full parameter list.
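
As a rough illustration of that request body, the sketch below assembles one in Python from the parameters used in the examples on this page (`temperature`, `max_tokens`, `top_p`, `stream`). `build_chat_request` is a hypothetical helper, not part of any Oxen SDK. It omits parameters left unset, on the assumption that the server then applies its own defaults.

```python
import json

MODEL_ID = "gemini-2-5-flash-lite-preview-09-2025"


def build_chat_request(prompt, *, temperature=None, max_tokens=None,
                       top_p=None, stream=None):
    """Build a chat completions body, omitting parameters left as None."""
    body = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    optional = {
        "temperature": temperature,
        "max_tokens": max_tokens,
        "top_p": top_p,
        "stream": stream,
    }
    # Only include what the caller set explicitly.
    body.update({name: value for name, value in optional.items()
                 if value is not None})
    return body


body = build_chat_request(
    "Hello, what can you do?",
    temperature=0.7,
    max_tokens=1024,
    stream=False,
)
# json.dumps turns the Python boolean False into the JSON literal false.
print(json.dumps(body, indent=2))
```

The resulting dictionary can be passed as the `json=` argument of `requests.post`, which performs the same serialization. Note that Python code must spell booleans `True` and `False`; the lowercase `false` is valid only in the JSON sent over the wire.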
