> ## Documentation Index
> Fetch the complete documentation index at: https://docs.oxen.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Inference API Overview

> Run inference on hundreds of AI models through a unified API for text, image, and video generation

## What is the Inference API?

The Inference API gives you access to hundreds of AI models through a single, consistent interface. Generate text, images, and videos without managing infrastructure or juggling multiple provider SDKs.

**Capabilities:**

* **Text Generation**: Chat completions, tool calling, vision, structured output
* **Image Generation**: Text-to-image, image-to-image editing
* **Video Generation**: Text-to-video, image-to-video, reference-to-video, video-to-video editing

## Where to find things

| Looking for...                          | Go to                                                             |
| --------------------------------------- | ----------------------------------------------------------------- |
| General API info                        | [Keep reading below](#authentication)                             |
| Getting started fast                    | [Quick Starts](#quick-starts)                                     |
| Endpoint specs and parameters           | [API Reference](#api-reference)                                   |
| Running inference with a specific model | [Model API References](/inference-api/reference/model-references) |
| Model discovery                         | [Models page](https://www.oxen.ai/ai/models)                      |

## Quick Starts

<CardGroup cols={4}>
  <Card title="Chat" icon="message" href="/inference-api/quickstart/chat">
    Text generation in minutes
  </Card>

  <Card title="Images" icon="image" href="/inference-api/quickstart/image-generation">
    Text-to-image in minutes
  </Card>

  <Card title="Videos" icon="video" href="/inference-api/quickstart/video-generation">
    Text-to-video in minutes
  </Card>

  <Card title="Async Queue" icon="layer-group" href="/inference-api/quickstart/async-queue">
    Generate in background
  </Card>
</CardGroup>

## API Reference

<CardGroup cols={3}>
  <Card title="Chat Completions" icon="message" href="/inference-api/reference/chat_completions">
    Text generation, vision, tool calling
  </Card>

  <Card title="Image Generation" icon="image" href="/inference-api/reference/image_generation">
    Text-to-image generation
  </Card>

  <Card title="Image Editing" icon="wand-magic-sparkles" href="/inference-api/reference/image_editing">
    Edit images with text prompts
  </Card>

  <Card title="Video Generation" icon="video" href="/inference-api/reference/video_generation">
    Text-to-video, image-to-video, multi-shot
  </Card>

  <Card title="Async Queue" icon="layer-group" href="/inference-api/reference/async_queue">
    Background image/video generation
  </Card>

  <Card title="Models" icon="cube" href="/inference-api/reference/models/overview">
    List, search, and manage models
  </Card>
</CardGroup>

## Individual Model API References

<CardGroup cols={1}>
  <Card title="Browse per-model API references" icon="book" href="/inference-api/reference/model-references">
    Sample requests, parameter tables, and workbench links for every model.
  </Card>
</CardGroup>

## Individual Model Walkthroughs

<CardGroup cols={4}>
  <Card title="Kling O3 Pro: Reference to Video" icon="film" href="/inference-api/reference/models/walkthroughs/kling_o3_pro_reference_to_video">
    Multi-shot with references
  </Card>

  <Card title="Kling O3 Pro: Video to Video Edit" icon="pen-to-square" href="/inference-api/reference/models/walkthroughs/kling_o3_pro_video_to_video_edit">
    Text-guided video edits
  </Card>

  <Card title="Seedance 2.0: Reference to Video" icon="seedling" href="/inference-api/reference/models/walkthroughs/seedance_2_reference_to_video">
    Mixed-reference video
  </Card>

  <Card title="Topaz Starlight Precise 2.5" icon="arrow-up-right-dots" href="/inference-api/reference/models/walkthroughs/topaz_starlight_precise_2_5">
    Upscale and restore to 4K
  </Card>
</CardGroup>

***

## Authentication

All requests require a bearer token:

```bash theme={null}
curl -H "Authorization: Bearer YOUR_API_KEY" \
  https://hub.oxen.ai/api/ai/...
```

Get your API key from your [account settings](https://oxen.ai/settings/profile).

## Base URL

All inference endpoints live under:

```
https://hub.oxen.ai/api/ai
```

If you're using the OpenAI SDK, set the base URL to `https://hub.oxen.ai/api/ai`. The SDK appends `/chat/completions` automatically.

## Endpoints

| Endpoint                    | Method | Description                                                |
| --------------------------- | ------ | ---------------------------------------------------------- |
| `/ai/chat/completions`      | POST   | Text generation (chat, vision, tool use)                   |
| `/ai/images/generate`       | POST   | Image generation                                           |
| `/ai/images/edit`           | POST   | Image editing                                              |
| `/ai/videos/generate`       | POST   | Video generation                                           |
| `/ai/queue`                 | POST   | Async image/video generation                               |
| `/ai/queue`                 | GET    | List generations (active by default, filterable by status) |
| `/ai/queue/:generation_id`  | GET    | Get generation status and result                           |
| `/ai/queue/:generation_id`  | DELETE | Cancel a queued generation                                 |
| `/ai/models`                | GET    | List available models                                      |
| `/ai/models/:id`            | GET    | Get model details and parameter schema                     |
| `/ai/models/search`         | GET    | Search models by name                                      |
| `/ai/models/:id/activate`   | POST   | Activate a custom model deployment                         |
| `/ai/models/:id/deactivate` | POST   | Deactivate a custom model deployment                       |

## Common Parameters

These parameters are accepted across multiple endpoints:

| Parameter          | Type   | Description                                                                                                                       |
| ------------------ | ------ | --------------------------------------------------------------------------------------------------------------------------------- |
| `model`            | string | Required. The model to use (e.g. `claude-sonnet-4-6`, `flux-2-dev`, `kling-video-o3-pro-reference-to-video`).                     |
| `response_format`  | string | `"url"` (default) returns a hosted URL. `"b64_json"` returns base64-encoded bytes inline. Supported on image and video endpoints. |
| `target_namespace` | string | Namespace to save results and bill to. Defaults to your user. Can be an organization name.                                        |

## Discovering Models

List all models, optionally filtered by developer:

```bash theme={null}
# All models
curl -H "Authorization: Bearer $OXEN_API_KEY" \
  "https://hub.oxen.ai/api/ai/models"

# Search by name
curl -H "Authorization: Bearer $OXEN_API_KEY" \
  "https://hub.oxen.ai/api/ai/models/search?search=kling"
```

Get full details for a specific model (including its parameter schema):

```bash theme={null}
curl -H "Authorization: Bearer $OXEN_API_KEY" \
  "https://hub.oxen.ai/api/ai/models/kling-video-o3-pro-reference-to-video"
```

The response includes a `request_schema` field with the complete parameter definitions, types, defaults, and constraints for that model.

## Pricing

Pricing varies by model:

| Method                    | How it works                    | Examples                   |
| ------------------------- | ------------------------------- | -------------------------- |
| `token`                   | Per input/output token          | GPT, Claude, Gemini        |
| `time`                    | Per second of compute time      | Custom models, Llama, Qwen |
| `per_image`               | Fixed cost per image            | FLUX, DALL-E               |
| `per_video_output_second` | Cost per second of output video | Kling, Sora                |

Check the [model detail endpoint](/inference-api/reference/models/overview#retrieve-model) for exact pricing. Relevant fields: `input_cost_per_token`, `output_cost_per_token`, `cost_per_image`, `cost_per_second`, `cost_per_second_with_audio`, `cost_per_second_high_res`.

## Error Format

Errors use one of two formats:

```json theme={null}
{
  "error": {
    "type": "invalid_params",
    "title": "Invalid parameters supplied, please check your request and try again.",
    "detail": "Specific error details"
  },
  "status": "error",
  "status_message": "invalid_params"
}
```

```json theme={null}
{
  "error": {
    "message": "Model not found: bad-model-name"
  }
}
```

Common error types: `unauthenticated`, `invalid_params`, `resource_not_found`, `unknown_error`.

Need help? Join our [Discord community](https://discord.com/invite/s3tBEn7Ptg).
