Uno

Uno AI transforms input images into new visuals guided by text prompts, blending reference images with your creative directions for precise, style-aware edits. It is served through a ready-to-use REST inference API with no cold starts and affordable pricing.

Features

UNO – Universal In‑Context Diffusion Transformer 📸

A powerful subject-driven image synthesis model (developed by ByteDance Research) enabling both single-subject and multi-subject image generation with high consistency and controllability using diffusion transformers.

Implementation ✨

This model leverages a two-stage progressive cross‑modal alignment strategy, combined with Universal Rotary Position Embedding (UnoPE):

  1. Stage I: Fine-tune a pretrained T2I (text-to-image) model using generated single-subject in-context data.
  2. Stage II: Further train on multi-subject paired data to support scenes with multiple specified subjects.

Highlights:

  • Built on Diffusion Transformers (DiT) with a FLUX.1-dev backbone
  • UnoPE maintains subject identity and reduces confusion across multiple subjects
  • Input: 1–4 reference images + text prompt
  • Output: synthesized image reflecting consistent subject(s) in context

Key Features

  • High-consistency, multi-subject generation: preserves unique subject traits across images
  • 🔁 Single → multi subject scaling via staged training
  • 🔧 Controllable layout and reference identity handling
  • 📐 Handles varying aspect ratios and resolutions (512–704px+)

Prediction Examples 🌟

  • Generating images of the same person in different settings
  • Placing multiple consistent products or characters in a single scene
  • Virtual try-on and identity-preserving e-commerce renders

Authentication

For authentication details, please refer to the Authentication Guide.

API Endpoints

Submit Task & Query Result


# Submit the task ("prompt" and "images" are required; the values shown are placeholders)
curl --location --request POST "https://api.wavespeed.ai/api/v3/wavespeed-ai/uno" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
    "prompt": "<your text prompt>",
    "images": ["<reference image URL>"],
    "image_size": "square_hd",
    "num_images": 1,
    "num_inference_steps": 28,
    "guidance_scale": 3.5,
    "output_format": "jpeg"
}'

# Get the result (requestId is the data.id value returned by the submit call)
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
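
Because the result endpoint reports a status of created, processing, completed, or failed, a client usually has to query it repeatedly. The following is a minimal polling sketch in POSIX shell, not an official client: the function names, the 2-second interval, and the 60-attempt limit are illustrative assumptions, and the status is extracted with grep and sed rather than a real JSON parser.

```shell
# Sketch only: poll the result endpoint until the task reaches a final status.
# Assumes WAVESPEED_API_KEY is exported. Function names, the interval, and the
# attempt limit are hypothetical choices, not part of the WaveSpeedAI API.

# Print the first "status" string value found in a JSON response read from stdin.
extract_status() {
  grep -o '"status"[[:space:]]*:[[:space:]]*"[^"]*"' | head -n 1 \
    | sed 's/.*:[[:space:]]*"\(.*\)"/\1/'
}

# Succeed when the status is one of the two final states listed in the docs.
is_terminal_status() {
  case "$1" in
    completed|failed) return 0 ;;
    *) return 1 ;;
  esac
}

# poll_result REQUEST_ID [INTERVAL_SECONDS] [MAX_ATTEMPTS]
# Prints the final response body; returns 0 only when the status is "completed".
poll_result() {
  _id="$1"; _interval="${2:-2}"; _max="${3:-60}"; _n=0
  while [ "$_n" -lt "$_max" ]; do
    _resp="$(curl --silent --show-error --location \
      "https://api.wavespeed.ai/api/v3/predictions/${_id}/result" \
      --header "Authorization: Bearer ${WAVESPEED_API_KEY}")" || return 1
    _status="$(printf '%s' "$_resp" | extract_status)"
    if is_terminal_status "$_status"; then
      printf '%s\n' "$_resp"
      if [ "$_status" = "completed" ]; then return 0; else return 1; fi
    fi
    sleep "$_interval"
    _n=$((_n + 1))
  done
  echo "timed out waiting for ${_id}" >&2
  return 1
}
```

With the functions loaded, `poll_result "$requestId"` blocks until the task finishes or the attempt limit is reached. A dedicated JSON tool such as jq would be more robust than the text matching used here.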

Parameters

Task Submission Parameters

Request Parameters

Parameter | Type | Required | Default | Range | Description
--- | --- | --- | --- | --- | ---
images | array | Yes | [] | - | URLs of the images to use while generating the image.
image_size | string | No | square_hd | square_hd, square, portrait_4_3, portrait_16_9, landscape_4_3, landscape_16_9 | The aspect ratio of the generated image.
prompt | string | Yes | - | - | The positive prompt for the generation.
seed | integer | No | - | -1 ~ 2147483647 | The random seed to use for the generation.
num_images | integer | No | 1 | 1 ~ 4 | The number of images to generate.
num_inference_steps | integer | No | 28 | 1 ~ 50 | The number of inference steps to perform.
guidance_scale | number | No | 3.5 | 1 ~ 20 | The guidance scale to use for the generation.
output_format | string | No | jpeg | jpeg, png | The format of the output image.
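
The ranges listed above can be checked on the client before a task is submitted. The sketch below is one way to do that in POSIX shell. The helper names are hypothetical, and treating each "a ~ b" range as inclusive at both ends is an assumption, since the table does not say.

```shell
# Sketch only: client-side checks mirroring the request parameter table.
# Helper names are hypothetical; ranges are assumed to be inclusive.

# in_int_range VALUE MIN MAX: succeed when VALUE is an integer in [MIN, MAX].
in_int_range() {
  case "$1" in
    ''|-|*[!0-9-]*|?*-*) return 1 ;;  # reject empty, non-numeric, or misplaced '-'
  esac
  [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# in_number_range VALUE MIN MAX: same check for decimal numbers, done in awk
# because POSIX shell arithmetic is integer-only.
in_number_range() {
  awk -v v="$1" -v lo="$2" -v hi="$3" 'BEGIN {
    if (v !~ /^-?[0-9]+(\.[0-9]+)?$/) exit 1
    exit !(v + 0 >= lo + 0 && v + 0 <= hi + 0)
  }'
}

# validate_uno_params NUM_IMAGES STEPS GUIDANCE SEED OUTPUT_FORMAT IMAGE_SIZE
validate_uno_params() {
  in_int_range "$1" 1 4 || { echo "num_images must be 1 ~ 4" >&2; return 1; }
  in_int_range "$2" 1 50 || { echo "num_inference_steps must be 1 ~ 50" >&2; return 1; }
  in_number_range "$3" 1 20 || { echo "guidance_scale must be 1 ~ 20" >&2; return 1; }
  in_int_range "$4" -1 2147483647 || { echo "seed must be -1 ~ 2147483647" >&2; return 1; }
  case "$5" in
    jpeg|png) ;;
    *) echo "output_format must be jpeg or png" >&2; return 1 ;;
  esac
  case "$6" in
    square_hd|square|portrait_4_3|portrait_16_9|landscape_4_3|landscape_16_9) ;;
    *) echo "unsupported image_size: $6" >&2; return 1 ;;
  esac
}
```

For example, `validate_uno_params 1 28 3.5 -1 jpeg square_hd` succeeds with the documented defaults, while a num_images of 5 fails with a message on stderr. The server remains the authority on what it accepts. This only catches obvious mistakes early.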

Response Parameters

Parameter | Type | Description
--- | --- | ---
code | integer | HTTP status code (e.g., 200 for success)
message | string | Status message (e.g., “success”)
data.id | string | Unique identifier for the prediction (the task ID)
data.model | string | Model ID used for the prediction
data.outputs | array | Array of URLs to the generated content (empty when status is not completed)
data.urls | object | Object containing related API endpoints
data.urls.get | string | URL to retrieve the prediction result
data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output
data.status | string | Status of the task: created, processing, completed, or failed
data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”)
data.error | string | Error message (empty if no error occurred)
data.timings | object | Object containing timing details
data.timings.inference | integer | Inference time in milliseconds
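
Once a task has completed, the generated image URLs are in data.outputs. The sketch below pulls them out of a response body in POSIX shell. The function name is hypothetical, and the text matching assumes the array holds plain quoted URLs without escaped characters, so a JSON-aware tool is preferable when one is available.

```shell
# Sketch only: list the URLs in the "outputs" array of a response read from stdin.
# The function name is hypothetical. Assumes URLs contain no escaped quotes.
extract_outputs() {
  tr -d '\n' \
    | grep -o '"outputs"[[:space:]]*:[[:space:]]*\[[^]]*]' \
    | grep -o '"[^"]*"' \
    | sed -e '1d' -e 's/^"//' -e 's/"$//'   # drop the "outputs" key, strip quotes
}
```

Piping a completed response through `extract_outputs` prints one URL per line, and prints nothing while the array is still empty. Where jq is installed, `jq -r '.data.outputs[]'` performs the same extraction with proper JSON parsing.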

Result Request Parameters

© 2025 WaveSpeedAI. All rights reserved.