Try Qwen Image - 2512 in the Workbench

Run this model interactively, tune parameters, and compare outputs.
Model ID: qwen-image-2512

Qwen Image 2512 is a large vision model. It excels at text-to-image generation, with improved realism in human portraits, finer natural textures, and stronger text rendering, particularly for Chinese characters. Other noteworthy use cases include instruction-based image editing and generating structured visuals such as posters or UI mockups.

| Metric | Value |
| --- | --- |
| Parameter Count | 20 billion |
| Mixture of Experts | Unknown |
| Context Length | Unknown |
| Multilingual | Yes |
| Quantized* | Unknown |

*Quantization is specific to the inference provider, and the model may be offered with different quantization levels by other providers.

Example request

Use the Workbench as a request builder: configure parameters for this model in the UI, then open the API tab to copy the exact cURL or Python call.
See the image generation reference for more details.
curl -X POST https://hub.oxen.ai/api/ai/images/generate \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OXEN_API_KEY" \
  -d '{
  "model": "qwen-image-2512",
  "prompt": "A bald eagle sitting on a vast frozen lake, centered in the shot facing the camera. The eagle is in it'\''s natural habitat and being photographed from a medium distance. The background is a vast lake surrounded by a forest going up a hill. Photorealistic - it is high enough quality that it could be used for a National Geographic cover, but is just a stand alone photo without any graphics."
}'
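
For readers who prefer Python, the cURL call above can be approximated with the standard library alone. This is a minimal sketch, not the call the Workbench generates: the helper names `build_image_request` and `generate_image` are hypothetical, and the code assumes the endpoint answers with a JSON body, whose layout is not described in this section.

```python
import json
import urllib.request

# URL and headers are taken from the cURL example; everything else is illustrative.
API_URL = "https://hub.oxen.ai/api/ai/images/generate"


def build_image_request(prompt, api_key, model="qwen-image-2512", **options):
    """Build (but do not send) the POST request for image generation.

    Extra keyword arguments (e.g. aspect_ratio="16:9", seed=7) are merged
    into the JSON payload next to "model" and "prompt".
    """
    payload = {"model": model, "prompt": prompt, **options}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def generate_image(prompt, api_key, **options):
    """Send the request and return the decoded response.

    Assumption: the response body is JSON. Its fields are not documented
    here, so the decoded object is returned unchanged.
    """
    request = build_image_request(prompt, api_key, **options)
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.load(response)


# Example usage (performs a real network call, so it is left commented out):
# import os
# result = generate_image("A bald eagle on a frozen lake", os.environ["OXEN_API_KEY"])
# print(result)
```

Separating request construction from sending makes it possible to inspect the payload before spending an API call.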

Fetch model details

The models endpoint returns the full model object, including its json_request_schema.
curl -H "Authorization: Bearer $OXEN_API_KEY" https://hub.oxen.ai/api/ai/models/qwen-image-2512
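
The same lookup can be sketched in Python. The text only says that the model object includes a `json_request_schema`; where that key sits in the response is an assumption, so the hypothetical helper `find_request_schema` below searches the decoded JSON recursively instead of hard-coding a path. `fetch_model` is likewise an illustrative name.

```python
import json
import urllib.request

# URL pattern taken from the cURL example above.
MODELS_URL = "https://hub.oxen.ai/api/ai/models/{model_id}"


def fetch_model(model_id, api_key):
    """GET the model object and return it as decoded JSON (assumed format)."""
    request = urllib.request.Request(
        MODELS_URL.format(model_id=model_id),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def find_request_schema(obj):
    """Depth-first search for the first "json_request_schema" key.

    Returns the value stored under that key, or None if it is absent.
    The nesting of the real response is not documented in this section,
    which is why the search does not assume a fixed location.
    """
    if isinstance(obj, dict):
        if "json_request_schema" in obj:
            return obj["json_request_schema"]
        children = list(obj.values())
    elif isinstance(obj, list):
        children = obj
    else:
        return None
    for child in children:
        found = find_request_schema(child)
        if found is not None:
            return found
    return None


# Example usage (real network call, left commented out):
# import os
# model = fetch_model("qwen-image-2512", os.environ["OXEN_API_KEY"])
# print(find_request_schema(model))
```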

Request parameters

Required parameters

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| prompt | string | "A bald eagle sitting on a vast frozen lake, centered in the shot facing the camera. The eagle is in it's natural habitat and being photographed from a medium distance. The background is a vast lake surrounded by a forest going up a hill. Photorealistic - it is high enough quality that it could be used for a National Geographic cover, but is just a stand alone photo without any graphics." | Prompt for generated image |

Optional parameters

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| negative_prompt | string | " " | Negative prompt for the generated image. |
| aspect_ratio | string | "16:9" | Aspect ratio for the generated image. One of: 1:1, 16:9, 9:16, 4:3, 3:4, 3:2, 2:3. |
| image_size | string | "optimize_for_quality" | Image size for the generated image. One of: optimize_for_quality, optimize_for_speed. |
| num_inference_steps | integer | 30 | Number of denoising steps. The recommended range is 28-50; fewer steps run faster but produce lower-quality outputs. Range: 1 – 50. |
| guidance | number | 3 | Guidance for the generated image. Lower values can give more realistic images. Good values to try are 2, 2.5, 3, and 3.5. Range: 0 – 10. |
| seed | integer | | Random seed. Set for reproducible generation. |
| output_format | string | "webp" | Format of the output images. One of: webp, jpg, png. |
| output_quality | integer | 80 | Quality when saving the output images, from 0 (lowest) to 100 (best). Not relevant for .png outputs. Range: 0 – 100. |
| disable_safety_checker | boolean | false | Disable safety checker for generated images. |
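
The allowed values and ranges listed above can be checked on the client before a request is sent. The following is a rough sketch that transcribes the constraints from the optional-parameters table; the function name `validate_options` is hypothetical, and the server remains the authority on what it accepts (the `json_request_schema` returned by the models endpoint is the definitive source).

```python
# Constraints copied from the optional-parameters table.
ENUMS = {
    "aspect_ratio": {"1:1", "16:9", "9:16", "4:3", "3:4", "3:2", "2:3"},
    "image_size": {"optimize_for_quality", "optimize_for_speed"},
    "output_format": {"webp", "jpg", "png"},
}
RANGES = {
    "num_inference_steps": (1, 50),
    "guidance": (0, 10),
    "output_quality": (0, 100),
}
INTEGER_FIELDS = {"num_inference_steps", "output_quality", "seed"}


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_options(options):
    """Return a list of problems found in `options`; empty means none found.

    Fields that are not in the table are ignored here and left for the
    server to accept or reject.
    """
    problems = []
    for name, value in options.items():
        if name in ENUMS and value not in ENUMS[name]:
            allowed = ", ".join(sorted(ENUMS[name]))
            problems.append(f"{name} must be one of: {allowed}")
        if name in INTEGER_FIELDS:
            if not isinstance(value, int) or isinstance(value, bool):
                problems.append(f"{name} must be an integer")
                continue
        if name in RANGES:
            low, high = RANGES[name]
            if not _is_number(value) or not low <= value <= high:
                problems.append(f"{name} must be between {low} and {high}")
    return problems


# Example usage:
# validate_options({"aspect_ratio": "5:4", "guidance": 2.5})
# reports one problem, for aspect_ratio.
```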