Try Seedance 2.0 - Text to Video in the Workbench

Run this model interactively, tune parameters, and compare outputs.
Model ID: bytedance-seedance-2-0-text-to-video

ByteDance Seedance 2.0 (Pro) generates video from a text prompt. It supports 480p, 720p, and 1080p output, durations of 4 to 15 seconds, multiple aspect ratios, and optional synchronized audio.

Example request

Use the Workbench as a request builder: configure parameters for this model in the UI, then open the API tab to copy the exact cURL or Python call.
This request blocks until the video is ready (typically 5-15 minutes). Prefer Async or Async with SSE for anything beyond quick experimentation. See the video generation reference for more details.
curl -X POST https://hub.oxen.ai/api/ai/videos/generate \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OXEN_API_KEY" \
  -d '{
  "model": "bytedance-seedance-2-0-text-to-video",
  "prompt": "<prompt>"
}'
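
For readers who prefer Python, the same request can be sketched with the standard library alone. This is a minimal sketch, not the exact call the Workbench API tab produces: the helper names `build_generate_request` and `generate_video` are hypothetical, the 30-minute timeout is an assumption based on the typical 5-15 minute wait, and passing optional parameters as top-level JSON fields alongside `prompt` is assumed rather than confirmed here.

```python
import json
import urllib.request

GENERATE_URL = "https://hub.oxen.ai/api/ai/videos/generate"
MODEL_ID = "bytedance-seedance-2-0-text-to-video"


def build_generate_request(prompt, api_key, **options):
    """Build (but do not send) the POST request from the cURL example.

    Extra keyword arguments are merged into the JSON body as top-level
    fields; that placement is an assumption, not confirmed by this page.
    """
    payload = {"model": MODEL_ID, "prompt": prompt, **options}
    return urllib.request.Request(
        GENERATE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def generate_video(prompt, api_key, timeout_seconds=1800, **options):
    """Send the request and return the decoded JSON response.

    The call blocks until the video is ready, so the timeout is generous.
    The response shape is not documented in this section.
    """
    request = build_generate_request(prompt, api_key, **options)
    with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
        return json.loads(response.read())
```

A typical call would be `generate_video("<prompt>", os.environ["OXEN_API_KEY"])` after `import os`.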

Fetch model details

The models endpoint returns the full model object, including its json_request_schema.
curl -H "Authorization: Bearer $OXEN_API_KEY" https://hub.oxen.ai/api/ai/models/bytedance-seedance-2-0-text-to-video
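
The same lookup can be sketched in Python to read the schema programmatically. The helper names below are hypothetical, and the sketch assumes `json_request_schema` is a top-level key of the returned model object; if the response nests the model under another key, adjust `extract_request_schema` accordingly.

```python
import json
import urllib.request

MODELS_URL = "https://hub.oxen.ai/api/ai/models/"
MODEL_ID = "bytedance-seedance-2-0-text-to-video"


def build_model_request(model_id, api_key):
    """Build the GET request for a model's details (not sent here)."""
    return urllib.request.Request(
        MODELS_URL + model_id,
        headers={"Authorization": f"Bearer {api_key}"},
    )


def extract_request_schema(model_object):
    """Return the model's json_request_schema, or None if it is absent.

    Assumes the schema sits at the top level of the model object.
    """
    return model_object.get("json_request_schema")


def fetch_request_schema(model_id, api_key):
    """Fetch the model object and return its request schema."""
    request = build_model_request(model_id, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return extract_request_schema(json.loads(response.read()))
```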

Request parameters

Required parameters

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| prompt | string | | The text prompt used to generate the video. |

Optional parameters

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| resolution | string | "720p" | Video resolution. 480p for faster generation, 720p for balance, 1080p for highest quality. One of: 480p, 720p, 1080p. |
| duration | integer | -1 | Duration of the video in seconds. Supports 4 to 15 seconds, or -1 to let the model decide based on the prompt. One of: -1, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15. |
| aspect_ratio | string | "adaptive" | The aspect ratio of the generated video. Use 16:9 for landscape, 9:16 for portrait/vertical, 1:1 for square, 21:9 for ultrawide cinematic, or adaptive to let the model decide. One of: adaptive, 21:9, 16:9, 4:3, 1:1, 3:4, 9:16. |
| generate_audio | boolean | false | Whether to generate synchronized audio for the video, including sound effects, ambient sounds, and lip-synced speech. |
| watermark | boolean | false | Whether to add an ‘AI generated’ watermark to the output. |
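
Because a request can block for several minutes, it may be worth checking optional parameters locally before sending. The sketch below encodes the allowed values listed in the table above; the function name `validate_options` is hypothetical, and the model's `json_request_schema` remains the authoritative definition if the two ever differ.

```python
# Allowed values copied from the optional parameters table.
ENUM_FIELDS = {
    "resolution": {"480p", "720p", "1080p"},
    "duration": {-1, *range(4, 16)},  # -1 lets the model pick the length
    "aspect_ratio": {"adaptive", "21:9", "16:9", "4:3", "1:1", "3:4", "9:16"},
}
BOOLEAN_FIELDS = {"generate_audio", "watermark"}


def validate_options(options):
    """Return a copy of options, raising ValueError on any invalid entry."""
    for name, value in options.items():
        if name in ENUM_FIELDS:
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or value not in ENUM_FIELDS[name]:
                allowed = sorted(ENUM_FIELDS[name], key=str)
                raise ValueError(f"{name}={value!r} is not one of {allowed}")
        elif name in BOOLEAN_FIELDS:
            if not isinstance(value, bool):
                raise ValueError(f"{name} must be true or false, got {value!r}")
        else:
            raise ValueError(f"unknown parameter: {name}")
    return dict(options)
```

For example, `validate_options({"resolution": "1080p", "duration": 8})` passes, while `validate_options({"aspect_ratio": "auto"})` raises because `auto` is not among the listed values.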