Gemini 3.1 Flash-Lite

Try Gemini 3.1 Flash-Lite in the Workbench

Run this model interactively, tune parameters, and compare outputs.

Model ID: gemini-3-1-flash-lite-preview

Gemini 3.1 Flash-Lite Preview is Google’s fastest and most cost-efficient Gemini 3.1 model for high-volume workloads. It is optimized for low-latency, large-scale tasks where responsiveness and cost control are critical. It also offers configurable thinking levels, strong instruction-following for production pipelines, and multimodal support suited to translation, moderation, UI generation, dashboards, and simulation-style workflows.

Metric	Value
Parameter Count	Unknown
Mixture of Experts	Unknown
Context Length	1,048,576 tokens
Multilingual	Yes
Quantized*	Unknown

*Quantization depends on the inference provider; other providers may offer this model at different quantization levels.

Example request

Use the Workbench as a request builder: configure parameters for this model in the UI, then open the API tab to copy the exact cURL or Python call.

The three requests below correspond, in order, to the Minimal, Basic parameters, and All parameters presets.

curl -X POST https://hub.oxen.ai/api/ai/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OXEN_API_KEY" \
  -d '{
  "model": "gemini-3-1-flash-lite-preview",
  "messages": [
    {
      "role": "user",
      "content": "Hello, what can you do?"
    }
  ]
}'

curl -X POST https://hub.oxen.ai/api/ai/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OXEN_API_KEY" \
  -d '{
  "model": "gemini-3-1-flash-lite-preview",
  "messages": [
    {
      "role": "user",
      "content": "Hello, what can you do?"
    }
  ],
  "temperature": 0.7,
  "max_tokens": 1024,
  "stream": false
}'

curl -X POST https://hub.oxen.ai/api/ai/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OXEN_API_KEY" \
  -d '{
  "model": "gemini-3-1-flash-lite-preview",
  "messages": [
    {
      "role": "user",
      "content": "Hello, what can you do?"
    }
  ],
  "temperature": 0.7,
  "max_tokens": 1024,
  "stream": false,
  "top_p": 1.0
}'
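For readers who prefer Python, the same request can be sketched with only the standard library. This is a minimal sketch, not the exact snippet the Workbench API tab produces. The endpoint, headers, model ID, and body fields are taken from the cURL calls above. The helper names build_chat_request and send are hypothetical and exist only for this illustration.

```python
import json
import os
import urllib.request

# Endpoint and model ID copied from the cURL examples above.
API_URL = "https://hub.oxen.ai/api/ai/chat/completions"
MODEL_ID = "gemini-3-1-flash-lite-preview"


def build_chat_request(prompt, api_key, **params):
    """Build (but do not send) a POST request mirroring the cURL examples.

    Hypothetical helper: extra keyword arguments such as temperature,
    max_tokens, stream, or top_p are copied into the JSON body as-is.
    """
    body = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    body.update(params)
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage (needs network access and OXEN_API_KEY set in the environment):
#   req = build_chat_request(
#       "Hello, what can you do?",
#       os.environ["OXEN_API_KEY"],
#       temperature=0.7, max_tokens=1024, stream=False,
#   )
#   print(send(req))
```

Keeping request construction separate from sending makes it possible to inspect the exact body before any tokens are spent.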

Fetch model details

The models endpoint returns the full model object, including its json_request_schema.

curl -H "Authorization: Bearer $OXEN_API_KEY" https://hub.oxen.ai/api/ai/models/gemini-3-1-flash-lite-preview
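The same lookup can be sketched in Python. The source only states that the response includes json_request_schema, not where it sits in the payload, so this sketch assumes nothing about the envelope and searches the decoded JSON for that key. The helper names build_model_request and find_key are hypothetical.

```python
import json
import urllib.request

# Base URL taken from the cURL call above.
MODELS_URL = "https://hub.oxen.ai/api/ai/models"


def build_model_request(model_id, api_key):
    """Build a GET request for one model's details (hypothetical helper)."""
    return urllib.request.Request(
        f"{MODELS_URL}/{model_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def find_key(obj, key):
    """Depth-first search for `key` in nested dicts and lists.

    Returns the first value found, or None if the key is absent.
    Used because the exact response layout is an unknown here.
    """
    if isinstance(obj, dict):
        if key in obj:
            return obj[key]
        children = list(obj.values())
    elif isinstance(obj, list):
        children = obj
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


# Usage (needs network access and a valid API key):
#   req = build_model_request("gemini-3-1-flash-lite-preview", api_key)
#   with urllib.request.urlopen(req, timeout=60) as response:
#       model = json.load(response)
#   print(json.dumps(find_key(model, "json_request_schema"), indent=2))
```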

Request parameters

This model follows the standard OpenAI chat completions request body. See the chat completions reference for the full parameter list.
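Because the body follows the OpenAI chat completions shape, a payload can be sanity-checked locally before it is sent. The sketch below covers only the fields used in the examples on this page. The numeric ranges (temperature 0 to 2, top_p 0 to 1) are the ones documented for OpenAI's API and are assumed, not confirmed, to apply to this model. The authoritative source is the model's json_request_schema. The function name check_payload is hypothetical.

```python
def check_payload(payload):
    """Return a list of problems found in a chat completions body.

    An empty list means the payload passed these basic checks.
    Ranges follow OpenAI's documented limits (an assumption here).
    """
    problems = []

    model = payload.get("model")
    if not isinstance(model, str) or not model:
        problems.append("model must be a non-empty string")

    messages = payload.get("messages")
    if not isinstance(messages, list) or not messages:
        problems.append("messages must be a non-empty list")
    else:
        for i, message in enumerate(messages):
            if not isinstance(message, dict) or "role" not in message:
                problems.append(f"messages[{i}] must be an object with a role")

    # Optional sampling parameters: checked only when present.
    for name, low, high in (("temperature", 0.0, 2.0), ("top_p", 0.0, 1.0)):
        value = payload.get(name)
        if value is None:
            continue
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or not (low <= value <= high):
            problems.append(f"{name} must be a number between {low} and {high}")

    max_tokens = payload.get("max_tokens")
    if max_tokens is not None and (type(max_tokens) is not int or max_tokens < 1):
        problems.append("max_tokens must be a positive integer")

    if "stream" in payload and not isinstance(payload["stream"], bool):
        problems.append("stream must be a boolean")

    return problems
```

Returning a list instead of raising on the first error lets a caller report every problem in one pass.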
