ltx-2-3-quality-image-to-video
LTX 2.3 Quality (Image to Video) is the high-quality preset of Lightricks LTX-2.3 on fal, animating a starting image into video with synchronized native audio guided by a text prompt. It runs a distilled DiT workflow with a quality preset control, supporting up to 481 frames at 1 to 60 FPS and flexible output resolutions.
The model conditions on the input image with an adjustable image strength, where higher values keep the first frame closer to the source and lower values give the model more freedom. Audio (sound effects, ambient noise, and dialogue) is generated alongside the visuals in a single pass and can be disabled to return a silent MP4. It is well suited for bringing still images to life with cinematic motion and matching sound.
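Clip length is not a parameter of its own; it follows from the frame count and the frame rate. The sketch below works through that arithmetic, assuming duration is simply frames divided by frames per second (the source does not state how the service computes or rounds duration). `clip_duration_seconds` is a hypothetical helper name, and the range checks use the limits documented under Request parameters.

```python
def clip_duration_seconds(num_frames: int, frames_per_second: float) -> float:
    """Approximate clip length, assuming duration = frames / fps."""
    if not 9 <= num_frames <= 481:
        raise ValueError("num_frames must be in the range 9 to 481")
    if not 1 <= frames_per_second <= 60:
        raise ValueError("frames_per_second must be in the range 1 to 60")
    return num_frames / frames_per_second


# Default settings: 121 frames at 24 FPS.
print(round(clip_duration_seconds(121, 24), 2))  # → 5.04
# Maximum frame count at the default frame rate.
print(round(clip_duration_seconds(481, 24), 2))  # → 20.04
```

Under this assumption, the defaults give a clip of roughly five seconds, and the 481-frame maximum gives about twenty seconds at 24 FPS.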
| Metric | Value |
|---|---|
| Parameter Count | 22 billion |
| Mixture of Experts | No |
| Context Length | Unknown |
| Multilingual | Unknown |
| Quantized* | Unknown |
Example request
A synchronous (Sync) request blocks until the video is ready (typically 5-15 minutes). Prefer Async or Async with SSE for anything beyond quick experimentation. See the video generation reference for more details.
Fetch model details
The models endpoint returns the full model object, including its json_request_schema.
Request parameters
Required parameters
| Field | Type | Default | Description |
|---|---|---|---|
| prompt | string | — | The prompt to guide the video generation. |
| input_image | string | — | The URL of the starting image. Format: uri. |
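
A minimal request body needs only the two required fields. The sketch below builds one as JSON, with some assumptions: `minimal_payload` is a hypothetical helper, the image URL is a placeholder, and the http(s) check is a guess at what "Format: uri" permits. How the body is sent (endpoint, authentication, any wrapping object) is not covered here.

```python
import json


def minimal_payload(prompt: str, input_image: str) -> dict:
    """Return a request body containing only the required fields."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    # Assumption: the starting image is referenced by an http(s) URL.
    if not input_image.startswith(("http://", "https://")):
        raise ValueError("input_image must be an http(s) URL")
    return {"prompt": prompt, "input_image": input_image}


body = json.dumps(
    minimal_payload(
        "A lighthouse at dusk, waves rolling in",
        "https://example.com/lighthouse.png",  # placeholder URL
    )
)
print(body)
```

Every optional parameter is left out here, so the documented defaults would presumably apply.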
Optional parameters
| Field | Type | Default | Description |
|---|---|---|---|
| num_frames | integer | 121 | The number of frames to generate. Range: 9 – 481. |
| resolution | string | "auto" | The size of the generated video. ‘auto’ derives the size from the input image aspect ratio. |
| frames_per_second | number | 24 | Frames per second of the generated video. Range: 1 – 60. |
| image_strength | number | 0.7 | Conditioning strength on the start image. 1.0 = exact first-frame match, lower = more freedom for the model. Range: 0 – 1. |
| generate_audio | boolean | true | Whether to include audio in the returned video. When disabled, the final MP4 is returned without an audio track. |
| video_quality | string | "high" | The quality preset of the generated video. One of: low, medium, high, maximum. |
| negative_prompt | string | "color distortion, overexposure, static, blurry details, subtitles, style, artwork, painting, frame, still, dim overall tone, worst quality, low quality, JPEG compression artifacts, ugly, mutilated, extra fingers, poorly drawn hands, poorly drawn face, deformed, disfigured, malformed limbs, fused fingers, motionless frame, cluttered background, three legs, crowded background, walking backwards" | The negative prompt to steer generation away from. |
| seed | integer | — | Random seed for reproducibility. If None, a random seed is chosen. |
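
Since a sync request can run for several minutes, it may be worth checking a payload against the documented ranges before sending it. The following is a minimal client-side sketch, not the service's own validation: `build_payload` is a hypothetical helper, the prompt and URL are placeholders, and the checks cover only the constraints listed in the tables above.

```python
from typing import Any

VIDEO_QUALITIES = ("low", "medium", "high", "maximum")
OPTIONAL_FIELDS = {
    "num_frames", "resolution", "frames_per_second", "image_strength",
    "generate_audio", "video_quality", "negative_prompt", "seed",
}


def build_payload(prompt: str, input_image: str, **options: Any) -> dict:
    """Check options against the documented ranges and return a request body."""
    unknown = set(options) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    if "num_frames" in options and not 9 <= options["num_frames"] <= 481:
        raise ValueError("num_frames must be in the range 9 to 481")
    if "frames_per_second" in options and not 1 <= options["frames_per_second"] <= 60:
        raise ValueError("frames_per_second must be in the range 1 to 60")
    if "image_strength" in options and not 0 <= options["image_strength"] <= 1:
        raise ValueError("image_strength must be in the range 0 to 1")
    if "video_quality" in options and options["video_quality"] not in VIDEO_QUALITIES:
        raise ValueError(f"video_quality must be one of {VIDEO_QUALITIES}")

    # Only explicitly set options are included; anything omitted is
    # presumably filled in from the documented defaults by the service.
    return {"prompt": prompt, "input_image": input_image, **options}


payload = build_payload(
    "Slow push-in on the subject, leaves rustling in the wind",
    "https://example.com/start-frame.png",  # placeholder URL
    num_frames=241,
    image_strength=0.9,   # keep the first frame close to the source image
    generate_audio=False,  # request a silent MP4
    seed=42,
)
print(payload["num_frames"], payload["generate_audio"])  # → 241 False
```

Rejecting unknown keys catches misspelled parameter names, which might otherwise be silently ignored or rejected only after the request is submitted.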