Try Kling O3 4K: Image-to-Video in the Workbench

Run this model interactively, tune parameters, and compare outputs.
Model ID: kling-video-o3-4k-image-to-video

Kling O3 4K Image-to-Video animates a static image into native 4K (3840x2160) motion in a single pass, with no upscaling step. The output’s aspect ratio is inherited from the input image, and an optional end frame can anchor the final shot for a specific transition. The model supports clips from 3 to 15 seconds and optional native synchronized audio, and it preserves the source image’s stylistic look, color, and lighting throughout the clip.

Example request

Use the Workbench as a request builder: configure parameters for this model in the UI, then open the API tab to copy the exact cURL or Python call.
This request blocks until the video is ready (typically 5-15 minutes). Prefer Async or Async with SSE for anything beyond quick experimentation. See the video generation reference for more details.
curl -X POST https://hub.oxen.ai/api/ai/videos/generate \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OXEN_API_KEY" \
  -d '{
  "model": "kling-video-o3-4k-image-to-video",
  "prompt": "<prompt>",
  "input_image": "https://hub.oxen.ai/api/repos/elau/assets/file/main/bloxy/bloxy_cropped_512x512.png"
}'
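
For readers who prefer Python, the same request can be sketched with the standard library alone. This is a minimal sketch, not the snippet the Workbench API tab produces: the helper names `build_generate_request` and `generate_video` are hypothetical, and because the response body is not described on this page, it is returned as raw text.

```python
import json
import urllib.request

API_URL = "https://hub.oxen.ai/api/ai/videos/generate"
MODEL_ID = "kling-video-o3-4k-image-to-video"


def build_generate_request(prompt, input_image, api_key):
    """Build (but do not send) the POST request from the cURL example."""
    payload = {"model": MODEL_ID, "prompt": prompt, "input_image": input_image}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def generate_video(prompt, input_image, api_key, timeout_seconds=1800):
    """Send the request and return the raw response text.

    The call blocks until the video is ready, so the timeout is generous.
    The 1800-second default is an assumption, not a documented limit.
    """
    request = build_generate_request(prompt, input_image, api_key)
    with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
        return response.read().decode("utf-8")
```

Calling `generate_video("<prompt>", "<image URL>", os.environ["OXEN_API_KEY"])` would issue the same blocking request as the cURL command above. Keeping request construction separate from sending makes the payload easy to inspect before spending generation time.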

Fetch model details

The models endpoint returns the full model object, including its json_request_schema.
curl -H "Authorization: Bearer $OXEN_API_KEY" https://hub.oxen.ai/api/ai/models/kling-video-o3-4k-image-to-video
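
A Python equivalent might look like the sketch below. It assumes that `json_request_schema` sits at the top level of the returned JSON object, which this page implies but does not state outright; if the API wraps the model object in an envelope, `find_request_schema` would need adjusting. All function names are hypothetical.

```python
import json
import urllib.request

MODELS_URL = "https://hub.oxen.ai/api/ai/models/"


def build_model_request(model_id, api_key):
    """Build the GET request for a single model's details."""
    return urllib.request.Request(
        MODELS_URL + model_id,
        headers={"Authorization": f"Bearer {api_key}"},
    )


def find_request_schema(model_object):
    """Return the request schema, or None if the key is absent.

    Assumption: the schema is a top-level key of the model object.
    """
    return model_object.get("json_request_schema")


def fetch_request_schema(model_id, api_key):
    """Fetch the model object and extract its request schema."""
    request = build_model_request(model_id, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return find_request_schema(json.load(response))
```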

Request parameters

Required parameters

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| prompt | string | | Text prompt describing the motion or action for the video. |
| input_image | string | | Start frame to generate the video from. Format: uri. |

Optional parameters

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| tail_image_url | string | | Frame to end the generation on. Format: uri. |
| duration | integer | 5 | Length of the generated video in seconds. Range: 3 to 15. |
| generate_audio | boolean | false | Generate synchronized native audio (English or Chinese) with the video. |
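
The parameter constraints listed above can be checked on the client before a long-running request is sent. The following is a minimal sketch: `build_payload` is a hypothetical helper, and the checks mirror only what this page states (two required strings and a duration between 3 and 15 seconds). The server's own validation may be stricter.

```python
MODEL_ID = "kling-video-o3-4k-image-to-video"
MIN_DURATION = 3   # seconds, from the documented range
MAX_DURATION = 15  # seconds, from the documented range


def build_payload(prompt, input_image, tail_image_url=None,
                  duration=None, generate_audio=None):
    """Assemble a request body, omitting optional fields left unset.

    Omitted fields fall back to the server-side defaults
    (duration 5, generate_audio false).
    """
    if not prompt:
        raise ValueError("prompt is required")
    if not input_image:
        raise ValueError("input_image is required")

    payload = {"model": MODEL_ID, "prompt": prompt, "input_image": input_image}

    if tail_image_url is not None:
        payload["tail_image_url"] = tail_image_url

    if duration is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(duration, bool) or not isinstance(duration, int):
            raise TypeError("duration must be an integer")
        if not MIN_DURATION <= duration <= MAX_DURATION:
            raise ValueError(
                f"duration must be between {MIN_DURATION} and {MAX_DURATION} seconds"
            )
        payload["duration"] = duration

    if generate_audio is not None:
        payload["generate_audio"] = bool(generate_audio)

    return payload
```

For instance, `build_payload("<prompt>", "<image URL>", duration=10, generate_audio=True)` yields a body with both optional fields set, while `duration=20` raises a `ValueError` locally instead of failing after the request is submitted.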