google-veo-3
google/veo-3 is a video generation model designed for text- and image-conditioned video creation with native, synchronized audio.
It excels at generating short, realistic clips with coherent motion, prompt-aligned cinematography, and integrated audio (dialogue, ambient sound, and effects). This makes it useful for concept visualization, social content, and pre-visualization for storytellers and filmmakers.
Other use cases include creating marketing and educational videos from scripts or reference images, and producing character-driven scenes that maintain visual and audio consistency across shots.
| Metric | Value |
|---|---|
| Parameter Count | Unknown |
| Mixture of Experts | Unknown |
| Context Length | Unknown |
| Multilingual | Yes |
| Quantized* | Unknown |
Example request
- Sync
- Async
- Async with SSE
A Sync request blocks until the video is ready (typically 5-15 minutes). Prefer Async or Async with SSE for anything beyond quick experimentation. See the video generation reference for more details.
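Because generation can take several minutes, an asynchronous client usually submits the job and then checks on it periodically. The sketch below shows only that waiting loop. It is not the service's actual client: the `fetch_status` callable, the `"status"` key, and the `"completed"` and `"failed"` values are hypothetical placeholders, since this page does not show the response format. The 20-minute default timeout is likewise an assumption, chosen to sit above the 5-15 minute range mentioned above.

```python
import time
from typing import Callable, Dict


def wait_for_video(
    fetch_status: Callable[[], Dict],
    poll_interval_s: float = 15.0,
    timeout_s: float = 20 * 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Poll a job until it reaches a terminal status or the timeout elapses.

    fetch_status is supplied by the caller and returns the job as a dict.
    The "status" key and its values are hypothetical, not taken from the API.
    """
    waited = 0.0
    while True:
        job = fetch_status()
        status = job.get("status")
        if status == "completed":
            return job
        if status == "failed":
            raise RuntimeError(f"video generation failed: {job}")
        if waited >= timeout_s:
            raise TimeoutError(f"job not finished after {timeout_s} seconds")
        # Wait before asking again so the service is not hammered.
        sleep(poll_interval_s)
        waited += poll_interval_s
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays. With Async with SSE, a loop like this would presumably be replaced by reading events from the stream.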
- Minimal
- Basic parameters
- All parameters
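The three example variants listed above might translate to input payloads like the following. This is a sketch only: the field names and allowed values come from the parameter tables on this page, but which fields count as "basic", the image URI, and how the payload is wrapped in the HTTP request are assumptions.

```python
import json

# Minimal: only the required field.
minimal = {"prompt": "A dog digging on the beach"}

# Basic parameters: the prompt plus a few common options
# (this particular selection is an assumption).
basic = {**minimal, "aspect_ratio": "16:9", "duration": 8}

# All parameters: every field documented on this page.
all_params = {
    **basic,
    "input_image": "https://example.com/reference.png",  # hypothetical URI
    "resolution": "1080p",
    "generate_audio": True,
}

for name, payload in [("minimal", minimal), ("basic", basic), ("all", all_params)]:
    print(name, json.dumps(payload))
```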
Fetch model details
The models endpoint returns the full model object, including its `json_request_schema`.
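Once the model object has been fetched, its schema can be used to discover which fields are required. The sketch below assumes that `json_request_schema` follows ordinary JSON Schema conventions (`properties` and `required` keys), which this page does not confirm. The `example_model` dict is a hypothetical, abbreviated stand-in built from the parameter tables on this page, not a real API response.

```python
from typing import Dict, List, Tuple


def split_fields(model: Dict) -> Tuple[List[str], List[str]]:
    """Return (required, optional) field names from a model object.

    Assumes model["json_request_schema"] is a JSON Schema object.
    """
    schema = model["json_request_schema"]
    properties = schema.get("properties", {})
    required = list(schema.get("required", []))
    optional = [name for name in properties if name not in required]
    return required, optional


# Hypothetical, abbreviated model object for illustration only.
example_model = {
    "json_request_schema": {
        "type": "object",
        "required": ["prompt"],
        "properties": {
            "prompt": {"type": "string"},
            "duration": {"type": "integer", "enum": [4, 6, 8], "default": 8},
            "generate_audio": {"type": "boolean", "default": False},
        },
    },
}

required, optional = split_fields(example_model)
print("required:", required)
print("optional:", optional)
```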
Request parameters
Required parameters
| Field | Type | Default | Description |
|---|---|---|---|
| prompt | string | "A dog digging on the beach" | Text description of what you want to generate, or the instruction on how to edit the given image. |
Optional parameters
| Field | Type | Default | Description |
|---|---|---|---|
| input_image | string | — | Image to use as reference. Must be jpeg, png, gif, or webp. Format: uri. |
| aspect_ratio | string | "16:9" | Video aspect ratio. One of: 9:16, 16:9. |
| duration | integer | 8 | Video duration in seconds. One of: 4, 6, 8. |
| resolution | string | "1080p" | Resolution of the generated video. One of: 1080p, 720p. |
| generate_audio | boolean | false | Generate audio with the video. |
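Since a request can take minutes to complete, it may be worth checking parameters locally before sending them. The helper below is a minimal sketch that mirrors the defaults and allowed values from the tables above. The function name `build_veo3_input` is hypothetical and not part of any SDK, and the server may apply additional rules not listed here.

```python
from typing import Any, Dict, Optional

# Allowed values copied from the parameter table above.
ASPECT_RATIOS = {"9:16", "16:9"}
DURATIONS = {4, 6, 8}
RESOLUTIONS = {"1080p", "720p"}


def build_veo3_input(
    prompt: str,
    input_image: Optional[str] = None,
    aspect_ratio: str = "16:9",
    duration: int = 8,
    resolution: str = "1080p",
    generate_audio: bool = False,
) -> Dict[str, Any]:
    """Validate parameters and return an input payload (hypothetical helper)."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ASPECT_RATIOS)}")
    if duration not in DURATIONS:
        raise ValueError(f"duration must be one of {sorted(DURATIONS)}")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"resolution must be one of {sorted(RESOLUTIONS)}")

    payload: Dict[str, Any] = {
        "prompt": prompt,
        "aspect_ratio": aspect_ratio,
        "duration": duration,
        "resolution": resolution,
        "generate_audio": bool(generate_audio),
    }
    # input_image has no default, so include it only when provided.
    if input_image is not None:
        payload["input_image"] = input_image
    return payload
```

For example, `build_veo3_input("A dog digging on the beach", duration=5)` raises a `ValueError`, because 5 is not among the documented durations.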