Documentation Index
Fetch the complete documentation index at: https://docs.oxen.ai/llms.txt
Use this file to discover all available pages before exploring further.
What is the Inference API?
The Inference API gives you access to hundreds of AI models through a single, consistent interface. Generate text, images, and videos without managing infrastructure or juggling multiple provider SDKs.

Capabilities:
- Text Generation: Chat completions, tool calling, vision, structured output
- Image Generation: Text-to-image, image-to-image editing
- Video Generation: Text-to-video, image-to-video, reference-to-video, video-to-video editing
Where to find things
| Looking for… | Go to |
|---|---|
| General API info | Keep reading below |
| Getting started fast | Quick Starts |
| Endpoint specs and parameters | API Reference |
| Running inference with a specific model | Model API References |
| Model discovery | Models page |
Quick Starts
Chat
Text generation in minutes
Images
Text-to-image in minutes
Videos
Text-to-video in minutes
Async Queue
Generate in background
API Reference
Chat Completions
Text generation, vision, tool calling
Image Generation
Text-to-image generation
Image Editing
Edit images with text prompts
Video Generation
Text-to-video, image-to-video, multi-shot
Async Queue
Background image/video generation
Models
List, search, and manage models
Individual Model API References
Browse per-model API references
Sample requests, parameter tables, and workbench links for every model.
Individual Model Walkthroughs
Kling O3 Pro: Reference to Video
Multi-shot with references
Kling O3 Pro: Video to Video Edit
Text-guided video edits
Seedance 2.0: Reference to Video
Mixed-reference video
Topaz Starlight Precise 2.5
Upscale and restore to 4K
Authentication
All requests require a bearer token in the Authorization header.
Base URL
All inference endpoints live under https://hub.oxen.ai/api/ai. The SDK appends /chat/completions automatically.
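Putting the token and base URL together, a minimal request sketch (the bearer scheme is standard; the token value and the /models path joined to the base are placeholders for illustration):

```python
import urllib.request

# Base URL from above; all inference endpoints hang off this path.
API_BASE = "https://hub.oxen.ai/api/ai"

def authed_request(path: str, token: str) -> urllib.request.Request:
    """Build a request with the bearer token in the Authorization header."""
    return urllib.request.Request(
        API_BASE + path,
        headers={"Authorization": f"Bearer {token}"},
    )

req = authed_request("/models", "YOUR_API_TOKEN")
print(req.full_url)                      # https://hub.oxen.ai/api/ai/models
print(req.get_header("Authorization"))   # Bearer YOUR_API_TOKEN
```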
Endpoints
| Endpoint | Method | Description |
|---|---|---|
| /ai/chat/completions | POST | Text generation (chat, vision, tool use) |
| /ai/images/generate | POST | Image generation |
| /ai/images/edit | POST | Image editing |
| /ai/videos/generate | POST | Video generation |
| /ai/queue | POST | Async image/video generation |
| /ai/queue | GET | List queued generations |
| /ai/queue/:generation_id | GET | Get generation status |
| /ai/queue/:generation_id | DELETE | Cancel a queued generation |
| /ai/models | GET | List available models |
| /ai/models/:id | GET | Get model details and parameter schema |
| /ai/models/search | GET | Search models by name |
| /ai/models/:id/activate | POST | Activate a custom model deployment |
| /ai/models/:id/deactivate | POST | Deactivate a custom model deployment |
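To make the async queue routes concrete, here is a hedged polling sketch. Only the URL shapes come from the table above; the generation id, the `state` field, and its values are assumptions, and the HTTP call is injected so the sketch stays network-free:

```python
import time

API_BASE = "https://hub.oxen.ai/api/ai"

def poll_generation(get_status, generation_id: str, interval: float = 2.0) -> dict:
    """Poll GET /queue/:generation_id until the job leaves the queue.

    `get_status` is a callable taking the URL and returning the parsed
    response body; in practice it would issue an authenticated GET.
    The terminal state names here are hypothetical.
    """
    url = f"{API_BASE}/queue/{generation_id}"
    while True:
        status = get_status(url)
        if status.get("state") in ("completed", "failed", "cancelled"):
            return status
        time.sleep(interval)

# Usage with a stub that completes immediately:
result = poll_generation(
    lambda url: {"state": "completed", "url": url}, "gen_123", interval=0
)
print(result["state"])  # completed
```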
Common Parameters
These parameters are accepted across multiple endpoints:

| Parameter | Type | Description |
|---|---|---|
| model | string | Required. The model to use (e.g. claude-sonnet-4-6, flux-2-dev, kling-video-o3-pro-reference-to-video). |
| response_format | string | "url" (default) returns a hosted URL. "b64_json" returns base64-encoded bytes inline. Supported on image and video endpoints. |
| target_namespace | string | Namespace to save results and bill to. Defaults to your user. Can be an organization name. |
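For illustration, a sketch of how the common parameters sit alongside endpoint-specific ones in a request body. The `prompt` field is an assumed endpoint-specific parameter, not confirmed here:

```python
import json

payload = {
    # Common parameters (from the table above)
    "model": "flux-2-dev",           # required on every endpoint
    "response_format": "b64_json",   # inline base64 instead of a hosted URL
    "target_namespace": "my-org",    # save results and bill to an org
    # Endpoint-specific parameter (assumed name, for illustration only)
    "prompt": "a lighthouse at dusk",
}

body = json.dumps(payload)
print(body)
```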
Discovering Models
List all models, optionally filtered by developer. Each model's details include a request_schema field with the complete parameter definitions, types, defaults, and constraints for that model.
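A sketch of walking a model's request_schema. Only the request_schema field name comes from the text above; the inner layout of the response is an assumption for illustration:

```python
# Illustrative shape of a /ai/models/:id response body (assumed layout).
model_detail = {
    "id": "flux-2-dev",
    "request_schema": {
        "prompt": {"type": "string", "required": True},
        "width": {"type": "integer", "default": 1024},
    },
}

# Enumerate each parameter with its type and default, if any.
for name, spec in model_detail["request_schema"].items():
    print(f"{name}: type={spec.get('type')} default={spec.get('default')}")
```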
Pricing
Pricing varies by model:

| Method | How it works | Examples |
|---|---|---|
| token | Per input/output token | GPT, Claude, Gemini |
| time | Per second of compute time | Custom models, Llama, Qwen |
| per_image | Fixed cost per image | FLUX, DALL-E |
| per_video_output_second | Cost per second of output video | Kling, Sora |
Per-model rates are exposed in fields such as input_cost_per_token, output_cost_per_token, cost_per_image, cost_per_second, cost_per_second_with_audio, and cost_per_second_high_res.
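A back-of-envelope cost sketch using the pricing methods above. The rates in the usage lines are made up for illustration, not real prices:

```python
def video_cost(output_seconds: float, cost_per_second: float) -> float:
    """per_video_output_second pricing: rate times output length."""
    return output_seconds * cost_per_second

def token_cost(input_tokens: int, output_tokens: int,
               input_cost_per_token: float,
               output_cost_per_token: float) -> float:
    """token pricing: input and output are billed at separate rates."""
    return (input_tokens * input_cost_per_token
            + output_tokens * output_cost_per_token)

# Example rates below are invented for illustration only.
print(video_cost(10, 0.25))              # 2.5
print(token_cost(1000, 500, 3e-06, 1.5e-05))
```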
Error Format
Errors use one of two formats. Error codes include: unauthenticated, invalid_params, resource_not_found, and unknown_error.
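A sketch of dispatching on the error codes above. The envelope shape (`{"error": {"code": ...}}`) is an assumption for illustration, not a documented contract:

```python
def classify(error_body: dict) -> str:
    """Map an error code (from the list above) to a suggested next step."""
    code = error_body.get("error", {}).get("code", "unknown_error")
    if code == "unauthenticated":
        return "check your bearer token"
    if code == "resource_not_found":
        return "verify the model or generation id"
    if code == "invalid_params":
        return "fix the request parameters"
    return "retry or report (unknown_error)"

print(classify({"error": {"code": "invalid_params"}}))  # fix the request parameters
```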
Need help? Join our Discord community.