Try Seedance 2.0 - Image to Video in the Workbench
Run this model interactively, tune parameters, and compare outputs.
bytedance-seedance-2-0-image-to-video
ByteDance Seedance 2.0 image-to-video animates a starting frame from a text motion prompt, with optional end-frame control for transitions. It supports 480p, 720p, or 1080p output, durations of 4 to 15 seconds or automatic length from the prompt, multiple aspect ratios (including auto from the input image), and synchronized audio (effects, ambience, and lip-synced speech).
Example request
- Sync
- Async
- Async with SSE
The Sync request blocks until the video is ready (typically 5-15 minutes). Prefer Async or Async with SSE for anything beyond quick experimentation. See the video generation reference for more details.
- Minimal
- Basic parameters
- All parameters
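The three levels listed above could correspond to request bodies like the ones below. This is only a sketch: it uses the field names from the Request parameters tables, the image URLs are placeholders, and it does not show how the model ID or credentials are sent, since that is not described in this section.

```python
import json

# Placeholder image URLs (assumptions); replace with real, reachable images.
START_FRAME = "https://example.com/start.jpg"
END_FRAME = "https://example.com/end.jpg"

# Minimal: only the two required fields.
minimal = {
    "prompt": "The camera slowly pushes in as leaves drift past.",
    "input_image": START_FRAME,
}

# Basic: required fields plus a few output options.
basic = {
    **minimal,
    "resolution": "720p",
    "duration": "5",  # documented as a string, not an integer
    "aspect_ratio": "16:9",
}

# All: every field listed in the parameter tables.
all_parameters = {
    **basic,
    "tail_image_url": END_FRAME,  # optional end frame for a transition
    "generate_audio": True,
    "seed": 42,
}

print(json.dumps(all_parameters, indent=2))
```

Omitted optional fields fall back to the documented defaults (720p, auto duration, auto aspect ratio, audio on).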
Fetch model details
The models endpoint returns the full model object, including its json_request_schema.
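Once the model object has been fetched, its schema can be inspected programmatically. The sketch below assumes that json_request_schema follows the usual JSON Schema layout, with `properties` and `required` keys; that layout is an assumption, not something this page confirms. The `sample_model` dictionary is a hand-written stand-in rather than a real response, and `summarize_schema` is a hypothetical helper.

```python
def summarize_schema(model: dict) -> dict:
    """Split the schema's properties into required and optional field names."""
    schema = model.get("json_request_schema", {})
    properties = schema.get("properties", {})
    required = set(schema.get("required", []))
    return {
        "required": sorted(name for name in properties if name in required),
        "optional": sorted(name for name in properties if name not in required),
    }


# Hand-written stand-in for a model object (assumed shape, abbreviated).
sample_model = {
    "id": "bytedance-seedance-2-0-image-to-video",
    "json_request_schema": {
        "type": "object",
        "required": ["prompt", "input_image"],
        "properties": {
            "prompt": {"type": "string"},
            "input_image": {"type": "string", "format": "uri"},
            "resolution": {
                "type": "string",
                "enum": ["480p", "720p", "1080p"],
                "default": "720p",
            },
            "seed": {"type": "integer"},
        },
    },
}

summary = summarize_schema(sample_model)
print(summary)
```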
Request parameters
Required parameters
| Field | Type | Default | Description |
|---|---|---|---|
| prompt | string | — | The text prompt describing the desired motion and action for the video. |
| input_image | string | — | The URL of the starting frame image to animate. Supported formats: JPEG, PNG, WebP. Max 30 MB. Format: uri. |
Optional parameters
| Field | Type | Default | Description |
|---|---|---|---|
| tail_image_url | string | — | The URL of the image to use as the last frame of the video. When provided, the generated video will transition from the starting image to this ending image. Supported formats: JPEG, PNG, WebP. Max 30 MB. Format: uri. |
| resolution | string | "720p" | Video resolution: 480p for faster generation, 720p for balance, 1080p for highest quality. One of: 480p, 720p, 1080p. |
| duration | string | "auto" | Duration of the video in seconds. Supports 4 to 15 seconds, or auto to let the model decide based on the prompt. One of: auto, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15. |
| aspect_ratio | string | "auto" | The aspect ratio of the generated video. Use 16:9 for landscape, 9:16 for portrait/vertical, 1:1 for square, 21:9 for ultrawide cinematic, or auto to infer from the input image. One of: auto, 21:9, 16:9, 4:3, 1:1, 3:4, 9:16. |
| generate_audio | boolean | true | Whether to generate synchronized audio for the video, including sound effects, ambient sounds, and lip-synced speech. |
| seed | integer | — | Random seed for reproducibility. Note that results may still vary slightly even with the same seed. |
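Because a synchronous request can take minutes, it may be worth checking a request body locally before sending it. The sketch below encodes the allowed values from the tables above. `validate_params` is a hypothetical helper, not part of any SDK, and it does not check image format or the 30 MB size limit.

```python
# Allowed values copied from the parameter tables.
RESOLUTIONS = {"480p", "720p", "1080p"}
DURATIONS = {"auto"} | {str(n) for n in range(4, 16)}  # "4" through "15"
ASPECT_RATIOS = {"auto", "21:9", "16:9", "4:3", "1:1", "3:4", "9:16"}


def validate_params(params: dict) -> list[str]:
    """Return a list of problems found in params; an empty list means none."""
    errors = []

    # Required string fields.
    for field in ("prompt", "input_image"):
        value = params.get(field)
        if not isinstance(value, str) or not value:
            errors.append(f"{field} is required and must be a non-empty string")

    # Optional enumerated string fields.
    enums = {
        "resolution": RESOLUTIONS,
        "duration": DURATIONS,
        "aspect_ratio": ASPECT_RATIOS,
    }
    for field, allowed in enums.items():
        if field in params:
            value = params[field]
            if not isinstance(value, str) or value not in allowed:
                errors.append(f"{field} must be one of the documented string values")

    if "generate_audio" in params and not isinstance(params["generate_audio"], bool):
        errors.append("generate_audio must be a boolean")

    if "seed" in params:
        seed = params["seed"]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(seed, bool) or not isinstance(seed, int):
            errors.append("seed must be an integer")

    return errors


print(validate_params({"prompt": "A slow pan", "input_image": "https://example.com/a.png"}))
print(validate_params({"prompt": "A slow pan", "duration": 5}))
```

The second call reports two problems: the missing input_image, and a duration passed as the integer 5 rather than the string "5".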