Uno

An AI model that transforms input images into new ones based on text prompts, blending reference visuals with your creative directions.

Features

UNO – Universal In‑Context Diffusion Transformer 📸

A subject-driven image synthesis model developed by ByteDance Research. It uses diffusion transformers to generate single-subject and multi-subject images with high subject consistency and controllability.

Implementation ✨

This model leverages a two-stage progressive cross‑modal alignment strategy, combined with Universal Rotary Position Embedding (UnoPE):

  1. Stage I: Fine-tune a pretrained T2I (text-to-image) model using generated single-subject in-context data.
  2. Stage II: Further train on multi-subject paired data to support scenes with multiple specified subjects.

Highlights:

  • Built on Diffusion Transformers (DiT) with a FLUX.1-dev backbone
  • UnoPE maintains subject identity and reduces confusion across multiple subjects
  • Input: 1–4 reference images + text prompt
  • Output: synthesized image reflecting consistent subject(s) in context

Key Features

  • High-consistency, multi-subject generation: preserves unique subject traits across images
  • 🔁 Single → multi subject scaling via staged training
  • 🔧 Controllable layout and reference identity handling
  • 📐 Handles varying aspect ratios and resolutions (512–704px+)

Example Use Cases 🌟

  • Generating images of the same person in different settings
  • Placing multiple consistent products or characters in a single scene
  • Virtual try-on and identity-preserving e-commerce renders

Authentication

For authentication details, please refer to the Authentication Guide.

API Endpoints

Submit Task & Query Result


# Submit the task
curl --location --request POST "https://api.wavespeed.ai/api/v3/wavespeed-ai/uno" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
    "images": [
        "https://d3gnftk2yhz9lr.wavespeed.ai/media/f82c31a9020c49459ec9e57f26d0a22f/images/1747919962008035794_JwNJFBxu.png",
        "https://d1q70pf5vjeyhc.wavespeed.ai/media/images/1753248290087456291_L3upsplh.png",
        "https://d1q70pf5vjeyhc.wavespeed.ai/media/images/1753248294014875604_lAPMIGEA.png"
    ],
    "image_size": "square_hd",
    "prompt": "A woman wears the dress and holds a bag, in the flowers.",
    "num_images": 1,
    "num_inference_steps": 28,
    "guidance_scale": 3.5,
    "output_format": "jpeg",
    "enable_safety_checker": true
}'

# Get the result (set requestId to the data.id value returned by the submit call)
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
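The same submit-then-query flow can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names (build_submit_request, build_result_request, send, submit_and_wait), the one-second polling interval, and the five-minute timeout are assumptions made for illustration. The URLs, headers, and the data.id, data.status, data.outputs, and data.error fields are the ones described on this page.

```python
import json
import time
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"


def build_submit_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build the POST request that submits a UNO task."""
    return urllib.request.Request(
        url=f"{API_BASE}/wavespeed-ai/uno",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def build_result_request(api_key: str, request_id: str) -> urllib.request.Request:
    """Build the GET request that queries the result of a submitted task."""
    return urllib.request.Request(
        url=f"{API_BASE}/predictions/{request_id}/result",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def send(request: urllib.request.Request) -> dict:
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def submit_and_wait(api_key, payload, poll_interval=1.0, max_wait=300.0):
    """Submit a task, then poll until it completes, fails, or times out.

    poll_interval and max_wait are illustrative values, not documented limits.
    """
    task = send(build_submit_request(api_key, payload))
    request_id = task["data"]["id"]  # task ID returned by the submit call
    deadline = time.monotonic() + max_wait
    while time.monotonic() < deadline:
        data = send(build_result_request(api_key, request_id))["data"]
        if data["status"] == "completed":
            return data["outputs"]
        if data["status"] == "failed":
            raise RuntimeError(data.get("error") or "task failed")
        time.sleep(poll_interval)  # status is still created or processing
    raise TimeoutError(f"task {request_id} did not finish within {max_wait} seconds")
```

Calling submit_and_wait(os.environ["WAVESPEED_API_KEY"], payload) with the JSON body from the curl example would return the list of output URLs once the task status is completed. Keeping request construction separate from sending makes the URL and header logic easy to check without network access.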

Parameters

Task Submission Parameters

Request Parameters

| Parameter | Type | Required | Default | Range | Description |
| --- | --- | --- | --- | --- | --- |
| images | array | Yes | https://d3gnftk2yhz9lr.wavespeed.ai/media/f82c31a9020c49459ec9e57f26d0a22f/images/1747919962008035794_JwNJFBxu.png, https://d1q70pf5vjeyhc.wavespeed.ai/media/images/1753248290087456291_L3upsplh.png, https://d1q70pf5vjeyhc.wavespeed.ai/media/images/1753248294014875604_lAPMIGEA.png | - | URL of images to use while generating the image. |
| image_size | string | No | square_hd | - | The aspect ratio of the generated image. |
| prompt | string | Yes | - | - | The prompt to generate an image from. |
| seed | integer | No | - | -1 ~ 2147483647 | The random seed to use for the generation. |
| num_images | integer | No | 1 | 1 ~ 4 | The number of images to generate. |
| num_inference_steps | integer | No | 28 | 1 ~ 50 | The number of inference steps to perform. |
| guidance_scale | number | No | 3.5 | 1 ~ 20 | The guidance scale to use for the generation. |
| output_format | string | No | jpeg | - | The format of the output image. |
| enable_safety_checker | boolean | No | true | - | If set to true, the safety checker will be enabled. |
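Checking a request body locally against the documented defaults and ranges can catch mistakes before a task is submitted. The following is a minimal sketch under stated assumptions: build_payload and _check_range are hypothetical helpers, both ends of each range are treated as inclusive, and the limit of 1 to 4 reference images is taken from the Highlights section rather than from the parameter table. The service itself may validate differently.

```python
# Defaults and ranges copied from the Request Parameters table.
# Treating both ends of each range as inclusive is an assumption.
DEFAULTS = {
    "image_size": "square_hd",
    "num_images": 1,
    "num_inference_steps": 28,
    "guidance_scale": 3.5,
    "output_format": "jpeg",
    "enable_safety_checker": True,
}
INTEGER_RANGES = {
    "seed": (-1, 2147483647),
    "num_images": (1, 4),
    "num_inference_steps": (1, 50),
}
NUMBER_RANGES = {"guidance_scale": (1, 20)}


def _check_range(name, value, low, high, integer_only):
    # bool is a subclass of int in Python, so reject it explicitly.
    allowed = int if integer_only else (int, float)
    if isinstance(value, bool) or not isinstance(value, allowed):
        raise ValueError(f"{name} has the wrong type: {value!r}")
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")


def build_payload(images, prompt, **options):
    """Return a request body for the UNO endpoint (hypothetical helper)."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt is required")
    images = list(images)
    # The Highlights section lists 1 to 4 reference images as input.
    if not 1 <= len(images) <= 4:
        raise ValueError("images must contain 1 to 4 URLs")
    unknown = set(options) - set(DEFAULTS) - {"seed"}
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    # Explicit options override the documented defaults.
    payload = {"images": images, "prompt": prompt, **DEFAULTS, **options}
    for name, (low, high) in INTEGER_RANGES.items():
        if name in payload:
            _check_range(name, payload[name], low, high, integer_only=True)
    for name, (low, high) in NUMBER_RANGES.items():
        if name in payload:
            _check_range(name, payload[name], low, high, integer_only=False)
    return payload
```

For example, build_payload(["https://example.com/ref.png"], "A woman in the flowers.", num_images=2) returns a body with num_images set to 2 and every other optional field at its documented default, while num_inference_steps=51 raises ValueError. The seed field is left out unless the caller passes one, since the table lists no default for it.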

Response Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data.id | string | Unique identifier for the prediction, Task Id |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
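The response fields above map naturally onto a small typed record. The sketch below is illustrative only: the Prediction class and parse_prediction function are hypothetical names, treating any code other than 200 as a failure is an assumption based on the "200 for success" example, and the tolerance for missing optional fields is a defensive choice rather than documented behavior.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Prediction:
    """Typed view of the data object described in the response table."""
    id: str
    model: str
    status: str
    outputs: List[str]
    has_nsfw_contents: List[bool]
    created_at: str
    error: str
    result_url: Optional[str]
    inference_ms: Optional[int]


def parse_prediction(body: dict) -> Prediction:
    """Convert a decoded JSON response body into a Prediction."""
    # Assumption: any code other than 200 signals a failed request.
    if body.get("code") != 200:
        raise RuntimeError(
            f"request failed: {body.get('code')} {body.get('message')}"
        )
    data = body["data"]
    return Prediction(
        id=data["id"],
        model=data.get("model", ""),
        status=data["status"],
        outputs=list(data.get("outputs") or []),
        has_nsfw_contents=list(data.get("has_nsfw_contents") or []),
        created_at=data.get("created_at", ""),
        error=data.get("error") or "",
        # Nested objects may be absent, so fall back to empty dicts.
        result_url=(data.get("urls") or {}).get("get"),
        inference_ms=(data.get("timings") or {}).get("inference"),
    )
```

Right after submission the outputs list is expected to be empty, because the table states that outputs are only populated once the status is completed. The id field of the parsed record is the value to pass when querying the result.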

Result Query Parameters

Result Request Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| id | string | Yes | - | Task ID |

Result Response Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data | object | The prediction data object containing all details |
| data.id | string | Unique identifier for the prediction, the ID of the prediction to get |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
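When reading a result, two fields usually drive client logic: data.status decides whether to keep waiting, and data.has_nsfw_contents decides which outputs to use. A minimal sketch follows; is_finished and unflagged_outputs are hypothetical helpers, and it assumes that the flags line up with data.outputs by index and that an output without a matching flag counts as not flagged.

```python
# The two statuses after which the task will not change any further.
TERMINAL_STATUSES = {"completed", "failed"}


def is_finished(data: dict) -> bool:
    """Return True once the task has reached completed or failed."""
    return data.get("status") in TERMINAL_STATUSES


def unflagged_outputs(data: dict) -> list:
    """Return output URLs whose NSFW flag is not set.

    Assumption: has_nsfw_contents[i] describes outputs[i], and an output
    with no matching flag is treated as not flagged.
    """
    if data.get("status") != "completed":
        return []  # outputs are empty until the task completes
    outputs = data.get("outputs") or []
    flags = data.get("has_nsfw_contents") or []
    return [
        url
        for index, url in enumerate(outputs)
        if not (index < len(flags) and flags[index])
    ]
```

A polling loop would call is_finished on each result and stop as soon as it returns True, then inspect data.error for failed tasks or pass the data object to unflagged_outputs for completed ones.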
© 2025 WaveSpeedAI. All rights reserved.