What is the Inference API?
The Inference API gives you access to hundreds of AI models through a single, consistent interface. Generate text, images, and videos without managing infrastructure or juggling multiple provider SDKs.

Capabilities:

- Text Generation: Chat completions, tool calling, vision, structured output
- Image Generation: Text-to-image, image-to-image editing
- Video Generation: Text-to-video, image-to-video, reference-to-video, video-to-video editing
Quick Starts
Chat
Text generation in minutes
Image Generation
Text-to-image in minutes
Video Generation
Text-to-video in minutes
API Reference
Chat Completions
Text generation, vision, tool calling
Image Generation
Text-to-image generation
Image Editing
Edit images with text prompts
Video Generation
Text-to-video, image-to-video, multi-shot
Async Queue
Background image/video generation
Models
List, search, and manage models
Model References
Kling O3 Pro: Reference to Video
Multi-shot video with reference images, elements, and audio
Kling O3 Pro: Video to Video Edit
Edit existing videos with text instructions and reference images
Seedance 2.0: Reference to Video
Generate video from text, images, video, and audio references
Topaz Starlight Precise 2.5
Video upscaling and restoration up to 4K
Authentication
All requests require a bearer token in the Authorization header.
Base URL
All inference endpoints live under https://hub.oxen.ai/api/ai. The SDK appends /chat/completions automatically.
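For example, a raw chat-completions request can be sketched with Python's standard library as follows. The token value is a placeholder, and the request body follows the usual chat-completions shape (model plus a messages array); adjust to the parameter schema of the model you call.

```python
import json
import urllib.request

BASE_URL = "https://hub.oxen.ai/api/ai"

def build_chat_request(token: str, model: str, messages: list) -> urllib.request.Request:
    """Build an authenticated POST to /ai/chat/completions (not yet sent)."""
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps({"model": model, "messages": messages}).encode(),
        headers={
            "Authorization": f"Bearer {token}",   # bearer token auth
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("YOUR_API_KEY", "claude-sonnet-4-6",
                         [{"role": "user", "content": "Hello!"}])
# Send with urllib.request.urlopen(req) once a real token is set.
```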
Endpoints
| Endpoint | Method | Description |
|---|---|---|
| /ai/chat/completions | POST | Text generation (chat, vision, tool use) |
| /ai/images/generate | POST | Image generation |
| /ai/images/edit | POST | Image editing |
| /ai/videos/generate | POST | Video generation |
| /ai/queue | POST | Async image/video generation |
| /ai/queue | GET | List queued generations |
| /ai/queue/:generation_id | GET | Get generation status |
| /ai/queue/:generation_id | DELETE | Cancel a queued generation |
| /ai/models | GET | List available models |
| /ai/models/:id | GET | Get model details and parameter schema |
| /ai/models/search | GET | Search models by name |
| /ai/models/:id/activate | POST | Activate a custom model deployment |
| /ai/models/:id/deactivate | POST | Deactivate a custom model deployment |
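The queue endpoints combine naturally into a submit-then-poll loop. A minimal sketch of the polling half: fetch_status stands in for a real GET /ai/queue/:generation_id call, and the status strings ("queued", "processing", "completed") are illustrative assumptions, not the API's exact names.

```python
import time

def wait_for_generation(generation_id: str, fetch_status,
                        poll_interval: float = 2.0, timeout: float = 600.0) -> dict:
    """Poll GET /ai/queue/:generation_id until it leaves a pending state.

    fetch_status(generation_id) should return the parsed JSON status dict;
    the status values checked here are assumptions for illustration.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status(generation_id)
        if status.get("status") not in ("queued", "processing"):
            return status  # finished (or failed) -- stop polling
        time.sleep(poll_interval)
    raise TimeoutError(f"generation {generation_id} did not finish in {timeout}s")

# Demo with a stubbed fetcher that completes on the third poll:
responses = iter([{"status": "queued"}, {"status": "processing"},
                  {"status": "completed", "url": "https://example.com/out.mp4"}])
result = wait_for_generation("gen_123", lambda _id: next(responses), poll_interval=0.0)
```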
Common Parameters
These parameters are accepted across multiple endpoints:

| Parameter | Type | Description |
|---|---|---|
| model | string | Required. The model to use (e.g. claude-sonnet-4-6, flux-2-dev, kling-video-o3-pro-reference-to-video). |
| response_format | string | "url" (default) returns a hosted URL. "b64_json" returns base64-encoded bytes inline. Supported on image and video endpoints. |
| target_namespace | string | Namespace to save results and bill to. Defaults to your user. Can be an organization name. |
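As an illustration, a request body combining these common parameters with an image prompt might look like the sketch below. The prompt field name is an assumption for illustration; check the model's request_schema for its actual endpoint-specific fields.

```python
import json

# Common parameters shared across endpoints, plus an endpoint-specific field.
payload = {
    "model": "flux-2-dev",            # required on every endpoint
    "response_format": "b64_json",    # inline base64 instead of a hosted URL
    "target_namespace": "my-org",     # save results to (and bill) an organization
    "prompt": "a lighthouse at dusk", # endpoint-specific field (assumption)
}
body = json.dumps(payload)
```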
Discovering Models
List all models, optionally filtered by developer. Fetching a single model returns a request_schema field with the complete parameter definitions, types, defaults, and constraints for that model.
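A sketch of building those request URLs with urllib. The developer query-parameter name is inferred from the description above and should be treated as an assumption.

```python
import urllib.parse
import urllib.request

BASE_URL = "https://hub.oxen.ai/api/ai"

def list_models_url(developer=None) -> str:
    """URL for GET /ai/models, optionally filtered (param name is an assumption)."""
    url = f"{BASE_URL}/models"
    if developer:
        url += "?" + urllib.parse.urlencode({"developer": developer})
    return url

def model_details_url(model_id: str) -> str:
    """URL for GET /ai/models/:id; the response includes request_schema."""
    return f"{BASE_URL}/models/{urllib.parse.quote(model_id, safe='')}"
```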
Pricing
Pricing varies by model:

| Method | How it works | Examples |
|---|---|---|
| token | Per input/output token | GPT, Claude, Gemini |
| time | Per second of compute time | Custom models, Llama, Qwen |
| per_image | Fixed cost per image | FLUX, DALL-E |
| per_video_output_second | Cost per second of output video | Kling, Sora |

A model's pricing is described by fields such as input_cost_per_token, output_cost_per_token, cost_per_image, cost_per_second, cost_per_second_with_audio, and cost_per_second_high_res.
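For a token-priced model, the estimated cost of a call follows directly from these fields. A worked sketch with hypothetical rates (not actual pricing):

```python
def token_cost(input_tokens: int, output_tokens: int,
               input_cost_per_token: float, output_cost_per_token: float) -> float:
    """Estimated cost for a token-priced model."""
    return input_tokens * input_cost_per_token + output_tokens * output_cost_per_token

# Hypothetical rates: $3 per 1M input tokens, $15 per 1M output tokens.
cost = token_cost(10_000, 2_000, 3e-6, 15e-6)
# 10_000 * 3e-6 + 2_000 * 15e-6 = 0.03 + 0.03 = 0.06
```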
Error Format
Errors use one of two formats. Possible error codes: unauthenticated, invalid_params, resource_not_found, unknown_error.
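A minimal sketch of dispatching on those error codes. The JSON shape assumed here (a code either at the top level or nested under an error object) is an illustration of handling both formats, not the API's documented schema.

```python
def classify_error(body: dict) -> str:
    """Map an error response body to a suggested action.

    Field names ("error", "code") are assumptions covering two common shapes.
    """
    code = body.get("error", {}).get("code") or body.get("code")
    if code == "unauthenticated":
        return "check your API token"
    if code == "invalid_params":
        return "fix the request parameters"
    if code == "resource_not_found":
        return "check the model or generation id"
    return "retry or report (unknown_error)"

action = classify_error({"error": {"code": "invalid_params"}})
```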
Need help? Join our Discord community.