
Try LTX 2.3 Quality: Text to Video in the Workbench

Run this model interactively, tune parameters, and compare outputs.
Model ID: ltx-2-3-quality-text-to-video

LTX 2.3 Quality (Text to Video) is the high-quality preset of Lightricks LTX-2.3 on fal, generating video with synchronized native audio directly from a text prompt. It runs a distilled DiT workflow, supporting up to 481 frames at 1 to 60 FPS and flexible output resolutions. The model generates sound effects, ambient noise, and dialogue alongside the visuals in a single pass, with optional prompt expansion and a configurable quality preset (low, medium, high, maximum). Audio output can be disabled to return a silent MP4. It is well suited for cinematic clips, stylized scenes, and short-form content where audio-visual synchronization matters.
| Metric | Value |
| --- | --- |
| Parameter Count | 22 billion |
| Mixture of Experts | No |
| Context Length | Unknown |
| Multilingual | Unknown |
| Quantized* | Unknown |

*Quantization is specific to the inference provider and the model may be offered with different quantization levels by other providers.

Example request

Use the Workbench as a request builder: configure parameters for this model in the UI, then open the API tab to copy the exact cURL or Python call.
This synchronous request blocks until the video is ready (typically 5 to 15 minutes). Prefer Async or Async with SSE for anything beyond quick experimentation. See the video generation reference for more details.
curl -X POST https://hub.oxen.ai/api/ai/videos/generate \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OXEN_API_KEY" \
  -d '{
  "model": "ltx-2-3-quality-text-to-video",
  "prompt": "<prompt>"
}'
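
The same request can be assembled in Python. This is a minimal sketch using only the standard library, not an official client: `build_generate_request` and `generate_video` are hypothetical helper names, the 1800-second socket timeout is an assumed value chosen to sit above the typical wait, and the response format is not documented on this page, so the sketch only decodes the JSON body and returns it.

```python
import json
import os
import urllib.request

GENERATE_URL = "https://hub.oxen.ai/api/ai/videos/generate"
MODEL_ID = "ltx-2-3-quality-text-to-video"


def build_generate_request(prompt, api_key, **optional):
    """Build (but do not send) the POST request shown in the cURL example.

    Extra keyword arguments are merged into the JSON payload, e.g.
    num_frames=121 or generate_audio=False.
    """
    payload = {"model": MODEL_ID, "prompt": prompt, **optional}
    return urllib.request.Request(
        GENERATE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def generate_video(prompt, **optional):
    """Send the request and return the decoded JSON response body.

    The call blocks until the video is ready, so the socket timeout
    (an assumed value) is set well above the typical wait.
    """
    request = build_generate_request(prompt, os.environ["OXEN_API_KEY"], **optional)
    with urllib.request.urlopen(request, timeout=1800) as response:
        return json.load(response)
```

Calling `generate_video("<prompt>", generate_audio=False)` would request a silent MP4. Keeping request construction separate from sending makes the payload easy to inspect before spending a long-running generation on it.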

Fetch model details

The models endpoint returns the full model object, including its json_request_schema.
curl -H "Authorization: Bearer $OXEN_API_KEY" https://hub.oxen.ai/api/ai/models/ltx-2-3-quality-text-to-video
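
A Python equivalent of the lookup above, again as a hedged standard-library sketch. `build_model_request` and `fetch_model` are hypothetical helper names and the 30-second timeout is an assumption.

```python
import json
import os
import urllib.request

MODELS_URL = "https://hub.oxen.ai/api/ai/models"


def build_model_request(model_id, api_key):
    """Build (but do not send) the GET request for a model's details."""
    return urllib.request.Request(
        f"{MODELS_URL}/{model_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def fetch_model(model_id):
    """Send the request and return the decoded JSON response body."""
    request = build_model_request(model_id, os.environ["OXEN_API_KEY"])
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

The returned object should include `json_request_schema`, but this page does not show whether that key sits at the top level or under a wrapper key, so inspect the dictionary from `fetch_model("ltx-2-3-quality-text-to-video")` before relying on a specific path.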

Request parameters

Required parameters

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| prompt | string | | The prompt to generate the video from. |

Optional parameters

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| num_frames | integer | 121 | The number of frames to generate. Range: 9 – 481. |
| resolution | string | "landscape_16_9" | The size of the generated video. |
| frames_per_second | number | 24 | Frames per second of the generated video. Range: 1 – 60. |
| generate_audio | boolean | true | Whether to include audio in the returned video. When disabled, the final MP4 is returned without an audio track. |
| video_quality | string | "high" | The quality preset of the generated video. One of: low, medium, high, maximum. |
| negative_prompt | string | "color distortion, overexposure, static, blurry details, subtitles, style, artwork, painting, frame, still, dim overall tone, worst quality, low quality, JPEG compression artifacts, ugly, mutilated, extra fingers, poorly drawn hands, poorly drawn face, deformed, disfigured, malformed limbs, fused fingers, motionless frame, cluttered background, three legs, crowded background, walking backwards" | The negative prompt to steer generation away from. |
| seed | integer | | Random seed for reproducibility. If None, a random seed is chosen. |
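
Because a synchronous request can take minutes, it may be worth checking a payload against the documented ranges before sending it. The sketch below is a client-side convenience only, not the server's validation logic, which may apply further rules. `validate_params` and `clip_duration_seconds` are hypothetical helper names, and the duration formula assumes the clip length is simply the frame count divided by the frame rate.

```python
VIDEO_QUALITIES = {"low", "medium", "high", "maximum"}


def validate_params(params):
    """Return a list of problems found in a request payload (empty if none).

    Only the constraints stated in the parameter tables are checked;
    missing optional fields fall back to their documented defaults.
    """
    problems = []
    if not isinstance(params.get("prompt"), str):
        problems.append("prompt is required and must be a string")
    num_frames = params.get("num_frames", 121)
    if not isinstance(num_frames, int) or not 9 <= num_frames <= 481:
        problems.append("num_frames must be an integer from 9 to 481")
    fps = params.get("frames_per_second", 24)
    if not isinstance(fps, (int, float)) or not 1 <= fps <= 60:
        problems.append("frames_per_second must be a number from 1 to 60")
    if params.get("video_quality", "high") not in VIDEO_QUALITIES:
        problems.append("video_quality must be one of: low, medium, high, maximum")
    return problems


def clip_duration_seconds(num_frames=121, frames_per_second=24):
    """Approximate clip length, assuming duration = frames / FPS."""
    return num_frames / frames_per_second
```

Under that assumption, the defaults (121 frames at 24 FPS) give a clip of roughly 5 seconds, and the 481-frame maximum at 24 FPS gives roughly 20 seconds.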