Uno
Playground
Try it on WavespeedAI! UNO transforms input images into new visuals guided by text prompts, blending reference images with your creative direction for precise, style-aware edits. Ready-to-use REST inference API, high performance, no cold starts, affordable pricing.
Features
UNO – Universal In-Context Diffusion Transformer
UNO is a subject-driven image generation model from ByteDance Research. It takes a small set of reference images plus a text prompt and synthesizes new scenes where the same subjects re-appear with high identity consistency and strong style control. It works for both single-subject and multi-subject prompts.
What UNO is good at
- Subject-consistent generation – Keep the same person, character, or product recognizable across new scenes and poses.
- Single → multi-subject scenes – Start from one subject or combine several references into a coherent group image.
- Layout & style control – Use the prompt and image_size to steer framing, setting, and visual mood while preserving identity.
- Flexible aspect ratios – Supports portrait, landscape, and square formats suitable for thumbnails, posts, key art, and ads.
Input Parameters
images (required)
1–5 reference images of your subject(s). These define identity, clothing, and overall look.
- Use multiple angles or expressions for better robustness.
- You can mix people, products, or characters, as long as the prompt makes their roles clear.
prompt (required)
Text description of the scene you want to generate, for example:
- “Santa Claus is standing in front of the Christmas tree.”
- “Two cartoon astronauts posing on the moon, product bottle in the center.”
UNO will combine the prompt with your references to place the subjects into the requested scene.
image_size
Controls aspect ratio and framing:
- square_hd – high-res square
- square – standard square
- portrait_4_3, portrait_16_9
- landscape_4_3, landscape_16_9
Choose based on where the image will be used (feed post, story, banner, thumbnail, etc.).
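As a rough guide, the choice can be expressed as a small lookup. The placement names and the mapping itself are illustrative assumptions; only the image_size values come from this page:

```python
# Hypothetical helper: suggest an image_size for a given placement.
# The placement keys and this mapping are illustrative assumptions;
# the values are the image_size options accepted by the UNO endpoint.
PLACEMENT_TO_IMAGE_SIZE = {
    "feed_post": "square_hd",
    "story": "portrait_16_9",
    "banner": "landscape_16_9",
    "thumbnail": "landscape_4_3",
}

def suggest_image_size(placement: str) -> str:
    """Return a suggested image_size, falling back to the API default."""
    return PLACEMENT_TO_IMAGE_SIZE.get(placement, "square_hd")
```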
seed
Randomness control:
- Empty / unset → a random seed each time.
- Any integer → reproducible output for the same settings.
num_images
Number of images to generate per run (e.g., 1–4). Higher values give more options at once.
num_inference_steps
Number of diffusion steps (e.g., around 20–30 by default):
- Fewer steps → faster, slightly less detailed.
- More steps → slower, more refined and stable.
guidance_scale
Classifier-free guidance strength:
- Lower values → more creative, looser interpretation of the prompt.
- Higher values → closer adherence to the prompt and reference identity.
output_format
File format of the generated images:
- jpeg
- png
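Before submitting, a request body can be sanity-checked against the documented ranges. The field names, defaults, and limits below come from this page's parameter table; the validator itself is a sketch, not part of any official SDK:

```python
# Sketch of client-side validation for a UNO request body.
# Field names, defaults, and ranges follow this page's parameter table;
# the validation logic itself is illustrative, not an official SDK.
VALID_IMAGE_SIZES = {
    "square_hd", "square",
    "portrait_4_3", "portrait_16_9",
    "landscape_4_3", "landscape_16_9",
}

def validate_uno_request(body: dict) -> list[str]:
    """Return a list of problems; an empty list means the body looks valid."""
    errors = []
    if not 1 <= len(body.get("images", [])) <= 5:
        errors.append("images: provide 1-5 reference image URLs")
    if not body.get("prompt"):
        errors.append("prompt: required")
    if body.get("image_size", "square_hd") not in VALID_IMAGE_SIZES:
        errors.append("image_size: unsupported value")
    if not 1 <= body.get("num_images", 1) <= 4:
        errors.append("num_images: must be 1-4")
    if not 1 <= body.get("num_inference_steps", 28) <= 50:
        errors.append("num_inference_steps: must be 1-50")
    if not 1 <= body.get("guidance_scale", 3.5) <= 20:
        errors.append("guidance_scale: must be 1-20")
    if body.get("output_format", "jpeg") not in {"jpeg", "png"}:
        errors.append("output_format: must be jpeg or png")
    return errors
```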
Designed For
- Character & IP creators – Keep mascots or VTuber avatars on-model across many scenes.
- Product & e-commerce teams – Generate consistent hero shots and lifestyle scenes for the same item.
- Brand & marketing – Multi-subject key art where specific people or products must stay recognizable.
- Concept artists – Rapidly explore compositions using a small library of reference looks.
How to Use
- Upload 1–5 images of your subject(s).
- Choose an image_size that matches your target placement (square, portrait, or landscape).
- Write a clear prompt describing the scene, style, and relationships between subjects.
- Optionally set seed, num_images, num_inference_steps, guidance_scale, and output_format.
- Run the model, review the generated images, and iterate by tweaking prompt or references to refine identity and style.
Pricing
- Just $0.05 per image.
- Total price is 0.05 * num_images.
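The pricing rule above can be expressed directly:

```python
# Pricing from this page: $0.05 per generated image.
PRICE_PER_IMAGE_USD = 0.05

def total_price(num_images: int) -> float:
    """Total cost in USD for one run of the UNO endpoint."""
    return round(PRICE_PER_IMAGE_USD * num_images, 2)
```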
Authentication
For authentication details, please refer to the Authentication Guide.
API Endpoints
Submit Task & Query Result
# Submit the task
curl --location --request POST "https://api.wavespeed.ai/api/v3/wavespeed-ai/uno" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
"images": ["https://example.com/subject.jpg"],
"prompt": "Santa Claus is standing in front of the Christmas tree.",
"image_size": "square_hd",
"num_images": 1,
"num_inference_steps": 28,
"guidance_scale": 3.5,
"output_format": "jpeg"
}'
# Get the result
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
Parameters
Task Submission Parameters
Request Parameters
| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
| images | array | Yes | [] | 1 ~ 5 items | URLs of the reference images that define your subject(s). |
| image_size | string | No | square_hd | square_hd, square, portrait_4_3, portrait_16_9, landscape_4_3, landscape_16_9 | The aspect ratio of the generated media. |
| prompt | string | Yes | - | - | The positive prompt for the generation. |
| seed | integer | No | - | -1 ~ 2147483647 | The random seed to use for the generation. |
| num_images | integer | No | 1 | 1 ~ 4 | The number of images to generate. |
| num_inference_steps | integer | No | 28 | 1 ~ 50 | The number of inference steps to perform. |
| guidance_scale | number | No | 3.5 | 1 ~ 20 | The guidance scale to use for the generation. |
| output_format | string | No | jpeg | jpeg, png | The format of the output image. |
Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data.id | string | Unique identifier for the prediction (task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
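Given the response shape above, extracting usable outputs from a decoded JSON reply is straightforward. This is a sketch over a plain dict; skipping entries flagged by `has_nsfw_contents` is an illustrative policy choice, not a documented requirement:

```python
# Sketch: pull output URLs out of a decoded UNO prediction response.
# Status values and field names follow this page's response table;
# filtering on has_nsfw_contents is an illustrative policy choice.
def usable_outputs(response: dict) -> list:
    """Return output URLs from a completed prediction, skipping any
    entry flagged by NSFW detection. Raises on a failed task."""
    data = response["data"]
    if data["status"] == "failed":
        raise RuntimeError(data["error"] or "prediction failed")
    if data["status"] != "completed":
        return []  # created / processing: outputs are not ready yet
    flags = data.get("has_nsfw_contents") or [False] * len(data["outputs"])
    return [url for url, nsfw in zip(data["outputs"], flags) if not nsfw]
```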