# Gemini 3.1 Flash-Lite

> Fast, low-cost Gemini 3.1 model for high-throughput multimodal workloads, with configurable reasoning and a 1M-token context window.

<CardGroup cols={1}>
  <Card title="Try Gemini 3.1 Flash-Lite in the Workbench" icon="flask" href="https://www.oxen.ai/ai/workbench?model=gemini-3-1-flash-lite-preview">
    Run this model interactively, tune parameters, and compare outputs.
  </Card>
</CardGroup>

**Model ID:** `gemini-3-1-flash-lite-preview`

Gemini 3.1 Flash-Lite Preview is Google's fastest and most cost-efficient Gemini 3.1 model for high-volume workloads. It is optimized for low-latency, large-scale tasks where responsiveness and cost control are critical.

It also offers configurable thinking levels, strong instruction following for production pipelines, and multimodal support suited to translation, moderation, UI generation, dashboards, and simulation-style workflows.

| Metric             | Value            |
| ------------------ | ---------------- |
| Parameter Count    | Unknown          |
| Mixture of Experts | Unknown          |
| Context Length     | 1,048,576 tokens |
| Multilingual       | Yes              |
| Quantized\*        | Unknown          |

\**Quantization is specific to the inference provider and the model may be offered with different quantization levels by other providers.*

## Example request

<Tip>
  Use the [Workbench](https://www.oxen.ai/ai/workbench?model=gemini-3-1-flash-lite-preview) as a request builder: configure parameters for this model in the UI, then open the **API** tab to copy the exact cURL or Python call.
</Tip>

<Tabs>
  <Tab title="Minimal">
    <CodeGroup>
      ```bash cURL theme={null}
      curl -X POST https://hub.oxen.ai/api/ai/chat/completions \
        -H "Content-Type: application/json" \
        -H "Authorization: Bearer $OXEN_API_KEY" \
        -d '{
        "model": "gemini-3-1-flash-lite-preview",
        "messages": [
          {
            "role": "user",
            "content": "Hello, what can you do?"
          }
        ]
      }'
      ```

      ```python Python theme={null}
      import os
      import requests

      response = requests.post(
          "https://hub.oxen.ai/api/ai/chat/completions",
          headers={
              "Content-Type": "application/json",
              "Authorization": f"Bearer {os.environ['OXEN_API_KEY']}",
          },
          json={
              "model": "gemini-3-1-flash-lite-preview",
              "messages": [
                  {
                      "role": "user",
                      "content": "Hello, what can you do?"
                  }
              ]
          },
      )
      response.raise_for_status()
      print(response.json())
      ```
    </CodeGroup>
  </Tab>

  <Tab title="Basic parameters">
    <CodeGroup>
      ```bash cURL theme={null}
      curl -X POST https://hub.oxen.ai/api/ai/chat/completions \
        -H "Content-Type: application/json" \
        -H "Authorization: Bearer $OXEN_API_KEY" \
        -d '{
        "model": "gemini-3-1-flash-lite-preview",
        "messages": [
          {
            "role": "user",
            "content": "Hello, what can you do?"
          }
        ],
        "temperature": 0.7,
        "max_tokens": 1024,
        "stream": false
      }'
      ```

      ```python Python theme={null}
      import os
      import requests

      response = requests.post(
          "https://hub.oxen.ai/api/ai/chat/completions",
          headers={
              "Content-Type": "application/json",
              "Authorization": f"Bearer {os.environ['OXEN_API_KEY']}",
          },
          json={
              "model": "gemini-3-1-flash-lite-preview",
              "messages": [
                  {
                      "role": "user",
                      "content": "Hello, what can you do?"
                  }
              ],
              "temperature": 0.7,
              "max_tokens": 1024,
              "stream": False
          },
      )
      response.raise_for_status()
      print(response.json())
      ```
    </CodeGroup>
  </Tab>

  <Tab title="All parameters">
    <CodeGroup>
      ```bash cURL theme={null}
      curl -X POST https://hub.oxen.ai/api/ai/chat/completions \
        -H "Content-Type: application/json" \
        -H "Authorization: Bearer $OXEN_API_KEY" \
        -d '{
        "model": "gemini-3-1-flash-lite-preview",
        "messages": [
          {
            "role": "user",
            "content": "Hello, what can you do?"
          }
        ],
        "temperature": 0.7,
        "max_tokens": 1024,
        "stream": false,
        "top_p": 1.0
      }'
      ```

      ```python Python theme={null}
      import os
      import requests

      response = requests.post(
          "https://hub.oxen.ai/api/ai/chat/completions",
          headers={
              "Content-Type": "application/json",
              "Authorization": f"Bearer {os.environ['OXEN_API_KEY']}",
          },
          json={
              "model": "gemini-3-1-flash-lite-preview",
              "messages": [
                  {
                      "role": "user",
                      "content": "Hello, what can you do?"
                  }
              ],
              "temperature": 0.7,
              "max_tokens": 1024,
              "stream": False,
              "top_p": 1.0
          },
      )
      response.raise_for_status()
      print(response.json())
      ```
    </CodeGroup>
  </Tab>
</Tabs>

## Fetch model details

The [models endpoint](/inference-api/reference/models/overview) returns the full model object, including its `json_request_schema`.

```bash theme={null}
curl -H "Authorization: Bearer $OXEN_API_KEY" https://hub.oxen.ai/api/ai/models/gemini-3-1-flash-lite-preview
```
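The same lookup can be sketched in Python. This is a minimal example using only the standard library, not an official client. The helper names (`build_model_request`, `find_key`, `fetch_request_schema`) are hypothetical, and because the exact nesting of the response is not documented here, `find_key` simply searches the returned JSON for `json_request_schema` wherever it appears.

```python
import json
import os
import urllib.request

MODEL_ID = "gemini-3-1-flash-lite-preview"
MODELS_URL = "https://hub.oxen.ai/api/ai/models/"


def build_model_request(model_id, api_key):
    """Build (but do not send) the GET request for one model's details."""
    return urllib.request.Request(
        MODELS_URL + model_id,
        headers={"Authorization": f"Bearer {api_key}"},
    )


def find_key(obj, key):
    """Return the first value stored under `key` anywhere in nested JSON, else None."""
    if isinstance(obj, dict):
        if key in obj:
            return obj[key]
        children = obj.values()
    elif isinstance(obj, list):
        children = obj
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def fetch_request_schema(model_id=MODEL_ID):
    """Fetch the model object and return its json_request_schema (or None).

    Assumption: the response body is JSON; its nesting is not assumed.
    """
    request = build_model_request(model_id, os.environ["OXEN_API_KEY"])
    with urllib.request.urlopen(request) as response:
        model = json.load(response)
    return find_key(model, "json_request_schema")
```

Calling `print(json.dumps(fetch_request_schema(), indent=2))` with `OXEN_API_KEY` set should print the schema, or `null` if the field is absent from the response.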

## Request parameters

This model follows the standard OpenAI chat completions request body. See the [chat completions reference](../inference-api.mdx) for the full parameter list.
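As a rough sketch of working with that request body in Python, the helpers below assemble the JSON payload and pull the reply text out of a response. Both function names are hypothetical. `extract_reply` assumes the response also follows the OpenAI shape (`choices[0].message.content`), which this page does not confirm, so check a real response before relying on it.

```python
import json


def build_chat_body(prompt, model="gemini-3-1-flash-lite-preview", **params):
    """Assemble a chat completions body; parameters set to None are left out."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    body.update({name: value for name, value in params.items() if value is not None})
    return body


def extract_reply(payload):
    """Return the first choice's text.

    Assumption: the response uses the OpenAI layout
    {"choices": [{"message": {"content": ...}}]}.
    """
    choices = payload.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


body = build_chat_body(
    "Hello, what can you do?", temperature=0.7, max_tokens=1024, stream=False
)
# Python's False is serialized as JSON false when the body is encoded.
encoded = json.dumps(body)
```

Passing `body` as the `json=` argument of `requests.post` sends the same payload as the cURL examples, and `extract_reply(response.json())` then returns just the generated text under the stated assumption.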
