Try WAN 2.7 - Reference to Video in the Workbench

Run this model interactively, tune parameters, and compare outputs.
Model ID: wan-v2-7-reference-to-video

WAN 2.7 reference-to-video generates video from one or more reference images and reference videos, with optional first-frame joint control. It supports single-character performances, multi-character interactions, and multi-shot narration. A request accepts up to 5 reference images and reference videos combined. Reference identifiers in the prompt (“Image 1”, “Image 2”, “Video 1”, …) follow the order of assets within each type. Output is up to 1080P and 2-15 s long (2-10 s when a reference video is included).
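The identifier rule above can be checked before a request is sent. The following is a minimal sketch, not part of the Oxen API: the helper name `check_reference_identifiers` is hypothetical, and it assumes identifiers are written exactly as "Image N" or "Video N" in the prompt.

```python
import re

MAX_REFERENCE_ASSETS = 5  # images + videos combined, per the model description


def check_reference_identifiers(prompt, n_images, n_videos):
    """Return a list of problems; an empty list means the prompt is consistent."""
    problems = []
    total = n_images + n_videos
    if total > MAX_REFERENCE_ASSETS:
        problems.append(
            f"{total} reference assets supplied; the limit is {MAX_REFERENCE_ASSETS}"
        )
    # Identifiers are 1-based and numbered separately for images and videos.
    for kind, count in (("Image", n_images), ("Video", n_videos)):
        for match in re.finditer(rf"\b{kind} (\d+)\b", prompt):
            index = int(match.group(1))
            if not 1 <= index <= count:
                problems.append(
                    f"'{kind} {index}' has no matching asset ({count} supplied)"
                )
    return problems
```

For example, a prompt that mentions "Video 2" while only one reference video is supplied yields one problem, whereas "Image 1" and "Video 1" can both appear in the same prompt because the two counters are independent.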

Example request

Use the Workbench as a request builder: configure parameters for this model in the UI, then open the API tab to copy the exact cURL or Python call.
This request blocks until the video is ready (typically 5-15 minutes). Prefer Async or Async with SSE for anything beyond quick experimentation. See the video generation reference for more details.
curl -X POST https://hub.oxen.ai/api/ai/videos/generate \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OXEN_API_KEY" \
  -d '{
  "model": "wan-v2-7-reference-to-video",
  "prompt": "Image 1 walks through a beautiful garden in the style of Image 2, cinematic lighting."
}'
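The same request can be sketched in Python with only the standard library. This is an illustrative equivalent of the cURL call, not the snippet the Workbench API tab produces. The helper names `build_generate_request` and `send_request` are hypothetical, and the response is returned as raw text because its structure is not described in this section.

```python
import json
import urllib.request

GENERATE_URL = "https://hub.oxen.ai/api/ai/videos/generate"
MODEL_ID = "wan-v2-7-reference-to-video"


def build_generate_request(prompt, api_key, **optional):
    """Build (but do not send) the POST request shown in the cURL example."""
    payload = {"model": MODEL_ID, "prompt": prompt, **optional}
    return urllib.request.Request(
        GENERATE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send_request(request):
    """Send the request and return the raw response body as text.

    This call blocks until the server responds, which for the synchronous
    endpoint means until the video is ready.
    """
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling `send_request(build_generate_request(prompt, os.environ["OXEN_API_KEY"]))` would issue the request. Optional parameters such as `duration=5` can be passed as keyword arguments and are merged into the JSON body.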

Fetch model details

The models endpoint returns the full model object, including its json_request_schema.
curl -H "Authorization: Bearer $OXEN_API_KEY" https://hub.oxen.ai/api/ai/models/wan-v2-7-reference-to-video
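A Python sketch of the same lookup follows. The text above only says that the model object includes `json_request_schema`, so the helper searches the decoded JSON for that key rather than assuming where it sits. The helper names `build_model_request` and `find_key`, and any wrapper object around the model, are assumptions for illustration.

```python
import json
import urllib.request

MODELS_URL = "https://hub.oxen.ai/api/ai/models/"


def build_model_request(model_id, api_key):
    """Build (but do not send) the GET request shown in the cURL example."""
    return urllib.request.Request(
        MODELS_URL + model_id,
        headers={"Authorization": f"Bearer {api_key}"},
    )


def find_key(node, key):
    """Depth-first search for the first value stored under `key`."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def fetch_request_schema(model_id, api_key):
    """Fetch the model object and return its json_request_schema, if present."""
    with urllib.request.urlopen(build_model_request(model_id, api_key)) as response:
        return find_key(json.load(response), "json_request_schema")
```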

Request parameters

Required parameters

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| prompt | string | "Image 1 walks through a beautiful garden in the style of Image 2, cinematic lighting." | Text prompt describing the desired video. Supports Chinese and English. Max 5000 characters. Use ‘Image 1, Image 2, …’ to reference reference_images in order, and ‘Video 1, Video 2, …’ to reference reference_videos in order; identifiers are independent across types. |

Optional parameters

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| reference_images | array<object> | | Array of reference images for character/object/scene appearance. Each item has a URL and an optional reference voice. Order maps to ‘Image 1’, ‘Image 2’, etc. Reference images + reference videos must total ≤ 5. JPEG/JPG/PNG/BMP/WEBP, 240-8000 px per side, aspect ratio 1:8 to 8:1, max 20 MB each. |
| reference_videos | array<object> | | Array of reference videos for character/object appearance, motion, and voice. Each item has a URL and an optional reference voice. Order maps to ‘Video 1’, ‘Video 2’, etc. Reference images + reference videos must total ≤ 5. MP4/MOV, 1-30 s, 240-4096 px per side, aspect ratio 1:8 to 8:1, max 100 MB each. |
| input_image | string | | Optional first-frame image used for joint control; the video is generated starting from this frame. JPEG/JPG/PNG/BMP/WEBP, 240-8000 px per side, max 20 MB. When provided, the output aspect ratio is taken from this image and the aspect_ratio parameter is ignored. Format: URI. |
| aspect_ratio | string | "16:9" | Aspect ratio of the generated video. Ignored when input_image is provided (the model uses that image’s ratio). One of: 16:9, 9:16, 1:1, 4:3, 3:4. |
| resolution | string | "1080P" | Output video resolution tier. One of: 720P, 1080P. |
| duration | integer | 5 | Output video duration in seconds: 2-15 with reference images only, 2-10 when any reference video is included. |
| negative_prompt | string | | Content to avoid in the video. Supports Chinese and English. Max 500 characters. |
| prompt_extend | boolean | true | Whether the model rewrites short prompts to improve quality. Adds processing time. |
| watermark | boolean | false | Adds an ‘AI Generated’ watermark to the bottom-right corner. |
| seed | integer | | Random seed for reproducibility. Range: 0-2147483647. |
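Several of the limits above interact, most notably the duration ceiling, which depends on whether any reference video is present. The sketch below checks a request body against the documented limits on the client side. It is not the server's validation logic: the function name `validate_params` is hypothetical, and the contents of each reference item are deliberately not inspected because their field names are not given in this section.

```python
ASPECT_RATIOS = {"16:9", "9:16", "1:1", "4:3", "3:4"}
RESOLUTIONS = {"720P", "1080P"}
MAX_SEED = 2147483647


def validate_params(params):
    """Return a list of limit violations; an empty list means none were found."""
    problems = []
    prompt = params.get("prompt", "")
    if not prompt:
        problems.append("prompt is required")
    elif len(prompt) > 5000:
        problems.append("prompt exceeds 5000 characters")
    if len(params.get("negative_prompt", "")) > 500:
        problems.append("negative_prompt exceeds 500 characters")

    images = params.get("reference_images", [])
    videos = params.get("reference_videos", [])
    if len(images) + len(videos) > 5:
        problems.append("reference images + reference videos must total <= 5")

    # The ceiling drops from 15 s to 10 s once any reference video is present.
    max_duration = 10 if videos else 15
    if not 2 <= params.get("duration", 5) <= max_duration:
        problems.append(f"duration must be 2-{max_duration} seconds")

    if params.get("aspect_ratio", "16:9") not in ASPECT_RATIOS:
        problems.append("aspect_ratio is not one of the listed values")
    if params.get("resolution", "1080P") not in RESOLUTIONS:
        problems.append("resolution must be 720P or 1080P")
    seed = params.get("seed")
    if seed is not None and not 0 <= seed <= MAX_SEED:
        problems.append("seed is out of range")
    return problems
```

A 12-second request passes with reference images only but is flagged as soon as `reference_videos` is non-empty.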