# DeepSeek V4 Flash

> Efficient MoE model, 1M context

<CardGroup cols={1}>
  <Card title="Try DeepSeek V4 Flash in the Workbench" icon="flask" href="https://www.oxen.ai/ai/workbench?model=deepseek-v4-flash">
    Run this model interactively, tune parameters, and compare outputs.
  </Card>
</CardGroup>

**Model ID:** `deepseek-v4-flash`

DeepSeek V4 Flash is the efficiency variant of the V4 family. With 284B total parameters and 13B active per token, it targets fast, low-cost inference while keeping the family's strong reasoning and tool-use behaviour. It pairs naturally with V4 Pro: route hard prompts to Pro and everyday traffic to Flash.
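The Pro/Flash split described above could be sketched as a small routing helper. This is only an illustration: the `deepseek-v4-pro` model ID, the keyword list, and the length threshold are all assumptions that do not come from this page, so confirm the Pro ID against the models endpoint and tune the heuristic to your own traffic.

```python
# Hypothetical routing sketch. Only FLASH_MODEL comes from this page;
# PRO_MODEL, HARD_HINTS, and the length threshold are illustrative assumptions.
FLASH_MODEL = "deepseek-v4-flash"
PRO_MODEL = "deepseek-v4-pro"  # assumed ID, verify before use

# Crude signals that a prompt may need the larger model.
HARD_HINTS = ("prove", "refactor", "step by step", "debug")


def pick_model(prompt: str, max_flash_chars: int = 4000) -> str:
    """Return the Pro model for long or 'hard-looking' prompts, else Flash."""
    text = prompt.lower()
    if len(prompt) > max_flash_chars or any(hint in text for hint in HARD_HINTS):
        return PRO_MODEL
    return FLASH_MODEL


def build_request(prompt: str) -> dict:
    """Build a chat completions body targeting the routed model."""
    return {
        "model": pick_model(prompt),
        "messages": [{"role": "user", "content": prompt}],
    }
```

A keyword heuristic is the simplest possible router. In practice a classifier, or a retry on Pro when Flash's answer fails validation, may work better.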

| Metric                 | Value                           |
| ---------------------- | ------------------------------- |
| Parameter Count        | 284 billion total               |
| Mixture of Experts     | Yes                             |
| Active Parameter Count | 13 billion                      |
| Context Length         | 1,048,576 tokens                |
| Multilingual           | Yes                             |
| Tool Use               | Yes                             |
| Structured Outputs     | Yes                             |

## Example request

<Tip>
  Use the [Workbench](https://www.oxen.ai/ai/workbench?model=deepseek-v4-flash) as a request builder: configure parameters for this model in the UI, then open the **API** tab to copy the exact cURL or Python call.
</Tip>

<Tabs>
  <Tab title="Minimal">
    <CodeGroup>
      ```bash cURL theme={null}
      curl -X POST https://hub.oxen.ai/api/ai/chat/completions \
        -H "Content-Type: application/json" \
        -H "Authorization: Bearer $OXEN_API_KEY" \
        -d '{
        "model": "deepseek-v4-flash",
        "messages": [
          {
            "role": "user",
            "content": "Hello, what can you do?"
          }
        ]
      }'
      ```

      ```python Python theme={null}
      import os
      import requests

      response = requests.post(
          "https://hub.oxen.ai/api/ai/chat/completions",
          headers={
              "Content-Type": "application/json",
              "Authorization": f"Bearer {os.environ['OXEN_API_KEY']}",
          },
          json={
              "model": "deepseek-v4-flash",
              "messages": [
                  {
                      "role": "user",
                      "content": "Hello, what can you do?"
                  }
              ]
          },
      )
      response.raise_for_status()
      print(response.json())
      ```
    </CodeGroup>
  </Tab>

  <Tab title="Basic parameters">
    <CodeGroup>
      ```bash cURL theme={null}
      curl -X POST https://hub.oxen.ai/api/ai/chat/completions \
        -H "Content-Type: application/json" \
        -H "Authorization: Bearer $OXEN_API_KEY" \
        -d '{
        "model": "deepseek-v4-flash",
        "messages": [
          {
            "role": "user",
            "content": "Hello, what can you do?"
          }
        ],
        "temperature": 0.7,
        "max_tokens": 1024,
        "stream": false
      }'
      ```

      ```python Python theme={null}
      import os
      import requests

      response = requests.post(
          "https://hub.oxen.ai/api/ai/chat/completions",
          headers={
              "Content-Type": "application/json",
              "Authorization": f"Bearer {os.environ['OXEN_API_KEY']}",
          },
          json={
              "model": "deepseek-v4-flash",
              "messages": [
                  {
                      "role": "user",
                      "content": "Hello, what can you do?"
                  }
              ],
              "temperature": 0.7,
              "max_tokens": 1024,
              "stream": False
          },
      )
      response.raise_for_status()
      print(response.json())
      ```
    </CodeGroup>
  </Tab>

  <Tab title="All parameters">
    <CodeGroup>
      ```bash cURL theme={null}
      curl -X POST https://hub.oxen.ai/api/ai/chat/completions \
        -H "Content-Type: application/json" \
        -H "Authorization: Bearer $OXEN_API_KEY" \
        -d '{
        "model": "deepseek-v4-flash",
        "messages": [
          {
            "role": "user",
            "content": "Hello, what can you do?"
          }
        ],
        "temperature": 0.7,
        "max_tokens": 1024,
        "stream": false,
        "top_p": 1.0
      }'
      ```

      ```python Python theme={null}
      import os
      import requests

      response = requests.post(
          "https://hub.oxen.ai/api/ai/chat/completions",
          headers={
              "Content-Type": "application/json",
              "Authorization": f"Bearer {os.environ['OXEN_API_KEY']}",
          },
          json={
              "model": "deepseek-v4-flash",
              "messages": [
                  {
                      "role": "user",
                      "content": "Hello, what can you do?"
                  }
              ],
              "temperature": 0.7,
              "max_tokens": 1024,
              "stream": False,
              "top_p": 1.0
          },
      )
      response.raise_for_status()
      print(response.json())
      ```
    </CodeGroup>
  </Tab>
</Tabs>

## Fetch model details

The [models endpoint](/inference-api/reference/models/overview) returns the full model object, including its `json_request_schema`.

```bash theme={null}
curl -H "Authorization: Bearer $OXEN_API_KEY" https://hub.oxen.ai/api/ai/models/deepseek-v4-flash
```
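The same lookup can be sketched in Python. The exact shape of the response is not documented on this page, so the helper below searches the returned JSON for `json_request_schema` wherever it is nested rather than assuming a fixed path. The names `fetch_model` and `find_key` are illustrative, and the sketch uses only the standard library to avoid an extra dependency.

```python
import json
import os
import urllib.request

BASE_URL = "https://hub.oxen.ai/api/ai/models"


def find_key(obj, key):
    """Depth-first search for `key` in nested dicts/lists; None if absent."""
    if isinstance(obj, dict):
        if key in obj:
            return obj[key]
        children = list(obj.values())
    elif isinstance(obj, list):
        children = obj
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def fetch_model(model_id: str) -> dict:
    """GET the model object, authenticating with OXEN_API_KEY."""
    req = urllib.request.Request(
        f"{BASE_URL}/{model_id}",
        headers={"Authorization": f"Bearer {os.environ['OXEN_API_KEY']}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


# Usage (makes a network call, so it is left commented out):
# model = fetch_model("deepseek-v4-flash")
# print(json.dumps(find_key(model, "json_request_schema"), indent=2))
```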

## Request parameters

This model follows the standard OpenAI chat completions request body. See the [chat completions reference](../inference-api.mdx) for the full parameter list.
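As a rough sketch, the request body used in the examples above can be assembled with a small helper. The allow-list here covers only the four parameters shown on this page (`temperature`, `max_tokens`, `stream`, `top_p`). That restriction is a conservative assumption of the sketch and not a limit of the API, and `build_chat_body` is a hypothetical name.

```python
import json

# Parameters demonstrated on this page; the full list lives in the reference.
ALLOWED_PARAMS = {"temperature", "max_tokens", "stream", "top_p"}


def build_chat_body(prompt: str, model: str = "deepseek-v4-flash", **params) -> dict:
    """Build a chat completions body, dropping parameters left as None."""
    unknown = set(params) - ALLOWED_PARAMS
    if unknown:
        raise ValueError(f"unsupported parameters in this sketch: {sorted(unknown)}")
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    body.update({k: v for k, v in params.items() if v is not None})
    return body


body = build_chat_body("Hello, what can you do?", temperature=0.7, stream=False)
# Python's False serializes to JSON false, matching the cURL examples.
payload = json.dumps(body)
```

Pass `body` as the `json=` argument to `requests.post`, as in the examples above, or send `payload` as the raw request data.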
