wan-ai-wan2-1-t2v-14b-diffusers
Wan-AI/Wan2.1-T2V-14B-Diffusers is a 14B parameter Large Vision Model (LVM) designed for high-fidelity text-to-video and image-to-video generation, including support for readable text in English and Chinese within generated videos.
It excels in generating temporally consistent videos from detailed prompts, offers customizable aspect ratios, and maintains stability at both 480p and 720p resolutions across consumer-grade hardware.
Some other noteworthy features of Wan-AI/Wan2.1-T2V-14B-Diffusers include prompt enhancement for improved video quality and precision, inspiration mode for artistic visual enrichment, sound effects generation, and efficient video encoding via a variational autoencoder (VAE).
| Metric | Value |
|---|---|
| Parameter Count | 14 billion |
| Mixture of Experts | No |
| Context Length | Unknown |
| Multilingual | Yes |
| Quantized* | No |
Example request
- Sync
- Async
- Async with SSE
A Sync request blocks until the video is ready (typically 5-15 minutes). Prefer Async or Async with SSE for anything beyond quick experimentation. See the video generation reference for more details.
- Minimal
- All parameters
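As a rough illustration of how a minimal request differs from one that sets every parameter, the sketch below builds both JSON bodies in Python. It assumes the parameters are sent as a flat JSON object using the field names listed under Request parameters. The endpoint URL, authentication, and any model-identifier field are not covered here and are deliberately left out. The `negative_prompt` and `seed` values are illustrative only.

```python
import json

# Minimal body: only the required field.
minimal_body = {
    "prompt": "A serene lake with mountains in the background and an ox in the foreground.",
}

# All-parameters body: the required field plus every documented optional field.
# Numeric values mirror the documented defaults; negative_prompt and seed are
# illustrative choices, not defaults.
full_body = {
    **minimal_body,
    "negative_prompt": "blurry, low quality",
    "height": 480,
    "width": 832,
    "num_frames": 81,
    "num_inference_steps": 16,
    "guidance_scale": 5.0,
    "seed": 42,
}

# Serialize the way an HTTP client would before sending.
print(json.dumps(minimal_body, indent=2))
print(json.dumps(full_body, indent=2))
```

Setting `seed` explicitly is what makes a run reproducible; omitting it leaves the choice to the service.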
Fetch model details
The models endpoint returns the full model object, including its `json_request_schema`.
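Once a model object has been fetched, the schema can be inspected programmatically. The sketch below assumes, without confirmation from this page, that `json_request_schema` follows the usual JSON Schema layout with `properties` and `required` keys. The helper `summarize_schema` and the trimmed `example_model` object are hypothetical and only stand in for a real response.

```python
from typing import Any


def summarize_schema(model: dict[str, Any]) -> dict[str, Any]:
    """Return the required field names and the declared defaults.

    Assumes model["json_request_schema"] is JSON-Schema-like; missing
    keys are treated as empty rather than raising.
    """
    schema = model.get("json_request_schema") or {}
    properties = schema.get("properties", {})
    defaults = {
        name: spec["default"]
        for name, spec in properties.items()
        if isinstance(spec, dict) and "default" in spec
    }
    return {"required": list(schema.get("required", [])), "defaults": defaults}


# Hypothetical, trimmed-down model object for illustration only.
example_model = {
    "json_request_schema": {
        "required": ["prompt"],
        "properties": {
            "prompt": {"type": "string"},
            "height": {"type": "integer", "default": 480},
            "width": {"type": "integer", "default": 832},
        },
    }
}

summary = summarize_schema(example_model)
print(summary)
```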
Request parameters
Required parameters
| Field | Type | Default | Description |
|---|---|---|---|
| `prompt` | string | "A beautiful landscape painting of a serene lake with mountains in the background and an ox in the foreground." | Prompt for the generated video. |
Optional parameters
| Field | Type | Default | Description |
|---|---|---|---|
| `height` | integer | 480 | Height of the video in pixels. Range: 1 – 720. |
| `width` | integer | 832 | Width of the video in pixels. Range: 1 – 1280. |
| `negative_prompt` | string | " " | Negative prompt for the generated video. |
| `num_inference_steps` | integer | 16 | Number of diffusion steps to take. Range: 1 – 100. |
| `num_frames` | integer | 81 | Number of video frames to generate. Range: 1 – 120. |
| `guidance_scale` | number | 5.0 | Guidance scale for the generated video. Lower values can give more realistic videos. Range: 0 – 10. |
| `seed` | integer | — | Random seed. Set it for reproducible generation. |
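Because a generation can take several minutes, it may be worth checking parameters locally before submitting. The following is a minimal sketch that checks a request body against the types and ranges in the tables above. `validate_params` is a hypothetical helper, not part of any SDK, and the service may apply further rules (for example on resolution combinations) that are not captured here.

```python
from typing import Any

# Inclusive (min, max) ranges copied from the parameter tables.
RANGES = {
    "height": (1, 720),
    "width": (1, 1280),
    "num_inference_steps": (1, 100),
    "num_frames": (1, 120),
    "guidance_scale": (0, 10),
}
# Fields documented with type "integer".
INTEGER_FIELDS = ("height", "width", "num_inference_steps", "num_frames", "seed")


def validate_params(params: dict[str, Any]) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    if not isinstance(params.get("prompt"), str):
        errors.append("prompt is required and must be a string")
    for name in INTEGER_FIELDS:
        if name in params:
            value = params[name]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int):
                errors.append(f"{name} must be an integer")
    for name, (low, high) in RANGES.items():
        value = params.get(name)
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if is_number and not low <= value <= high:
            errors.append(f"{name} must be between {low} and {high}")
    return errors


print(validate_params({"prompt": "An ox by a lake", "height": 1080}))
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue in one pass.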