Try Kling O3 Pro - Reference to Video in the Workbench
Run this model interactively, tune parameters, and compare outputs.
kling-video-o3-pro-reference-to-video
Part of the Kling 3.0/o3 family exclusively available on fal.ai, this reference-to-video model transforms static reference images into dynamic video sequences. It excels at preserving image details like identity, layout, and text while adding realistic motion, camera movements, and scene progression based on cinematic prompts. Supports multi-shot generation, explicit motion instructions, flexible durations up to 15 seconds, and native audio when specified. Optimized for advertising, branded content, and realistic scene extensions with smooth transitions and narrative continuity.
Example request
- Sync
- Async
- Async with SSE
A Sync request blocks until the video is ready (typically 5-15 minutes). Prefer Async or Async with SSE for anything beyond quick experimentation. See the video generation reference for more details.
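Because generation typically takes several minutes, an asynchronous client usually submits the job and then polls for its status. The sketch below shows only that polling pattern. The `get_status` callable, the `status` field, and the `"completed"` / `"failed"` values are hypothetical placeholders, not this API's documented names; replace them with whatever the video generation reference specifies.

```python
import time
from typing import Any, Callable, Dict


def wait_for_video(
    get_status: Callable[[], Dict[str, Any]],
    poll_interval_s: float = 10.0,
    timeout_s: float = 20 * 60,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Poll a job until it finishes, fails, or the timeout elapses.

    `get_status` is a hypothetical callable that performs the actual
    HTTP status request and returns the decoded JSON body as a dict.
    """
    deadline = clock() + timeout_s
    while True:
        job = get_status()
        status = job.get("status")  # assumed field name
        if status == "completed":   # assumed terminal value
            return job
        if status == "failed":      # assumed terminal value
            raise RuntimeError(f"video generation failed: {job}")
        if clock() >= deadline:
            raise TimeoutError("video generation did not finish in time")
        sleep(poll_interval_s)
```

Injecting `sleep` and `clock` keeps the loop easy to test without real waiting; in normal use the defaults are enough.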
Fetch model details
The models endpoint returns the full model object, including its json_request_schema.
Request parameters
Optional parameters
| Field | Type | Default | Description |
|---|---|---|---|
| start_image_url | string | — | The first frame of the video. The model will try to extend the contents of this frame. Format: uri. |
| tail_image_url | string | — | The last frame of the video. Requires start frame to be configured. The model will try to fill in between the frames. Format: uri. |
| input_image | array<string> | — | Reference images for style/appearance. Use @Image1, @Image2, etc. in the prompt to refer to them. Maximum 4 total (elements + reference images) when using video. |
| elements | array<object> | — | Optional element references. Use @Element1, @Element2, etc. in the prompt to refer to them. |
| negative_prompt | string | — | Text describing what to avoid in the generated video. |
| aspect_ratio | string | "16:9" | Video aspect ratio. One of: 9:16, 1:1, 16:9. |
| generate_audio | boolean | false | Whether to generate native audio for the video. Supports Chinese and English voice output. |
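
The constraints in the table can be checked on the client before a request is sent. The following is a minimal sketch of a request-body builder, not an official client. It assumes the text prompt is sent in a field named `prompt` (the table refers to "the prompt" but does not list the field), and `build_request` is a hypothetical helper name. It enforces only the two rules the table states unambiguously: the allowed `aspect_ratio` values and the requirement that `tail_image_url` be accompanied by `start_image_url`.

```python
from typing import Any, Dict, List, Optional

# Allowed values as listed in the parameter table.
ALLOWED_ASPECT_RATIOS = {"9:16", "1:1", "16:9"}


def build_request(
    prompt: str,
    *,
    start_image_url: Optional[str] = None,
    tail_image_url: Optional[str] = None,
    input_image: Optional[List[str]] = None,
    elements: Optional[List[Dict[str, Any]]] = None,
    negative_prompt: Optional[str] = None,
    aspect_ratio: str = "16:9",
    generate_audio: bool = False,
) -> Dict[str, Any]:
    """Build a JSON-serializable request body from the documented fields."""
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ALLOWED_ASPECT_RATIOS)}")
    if tail_image_url is not None and start_image_url is None:
        raise ValueError("tail_image_url requires start_image_url")

    # "prompt" is an assumed field name; the other keys come from the table.
    payload: Dict[str, Any] = {
        "prompt": prompt,
        "aspect_ratio": aspect_ratio,
        "generate_audio": generate_audio,
    }
    optional = {
        "start_image_url": start_image_url,
        "tail_image_url": tail_image_url,
        "input_image": input_image,
        "elements": elements,
        "negative_prompt": negative_prompt,
    }
    # Omit unset optional fields so the server-side defaults apply.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

For example, `build_request("@Image1 walks through a neon market", input_image=["https://example.com/ref.png"], aspect_ratio="9:16")` produces a body containing only the prompt, the reference image list, the aspect ratio, and `generate_audio` set to `false`. The URL here is a placeholder.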