Wan 2.1 Text To Image

Wan 2.1 Text-to-Image generates ultra-realistic photographic images by adapting the Wan 2.1 video model for state-of-the-art still-image fidelity. It is available as a ready-to-use REST inference API with no cold starts and affordable per-image pricing.

Features

wan-2.1/text-to-image

This model is part of the Wan 2.1 foundation model suite, an AI system for video and image generation. It focuses on text-to-image synthesis: turning detailed written prompts into vivid, high-resolution visuals with cinematic precision.

🌟 Key Features

🎨 SOTA Image Quality: Built on Wan 2.1’s next-generation video foundation, this model produces exceptional still-frame quality with realistic lighting, texture, and depth.
🧠 Multilingual Understanding: Supports both Chinese and English prompts, ensuring accurate and context-rich image generation across languages.
⚙️ Fine Control with Parameters: Adjustable inputs such as strength, width, and height provide creators with direct control over composition and style.
🪄 Powerful Visual Consistency: Based on Wan-VAE, enabling coherent detail, color fidelity, and stylistic alignment across resolutions.
💰 Lightweight and Efficient: High-quality generation at a base cost of just $0.02 per image, ideal for scalable creative workflows.

⚙️ Parameters

Parameter	Description
prompt*	Text description of the image to be generated (supports CN/EN).
image	(Optional) Upload a reference image for guided generation.
strength	Controls how far the reference image is transformed (0–1).
size (width / height)	Custom output resolution, given as width*height; the recommended maximum aspect ratio is 2:1.
seed	Fix for reproducibility or randomize for variation.
output_format	Choose from `jpeg`, `png`, or `webp`.
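
The size parameter is sent as a single string such as "1024*1024". The sketch below shows one way to build that string from a width and height while checking the recommended 2:1 ratio. The helper name `format_size` is hypothetical, and treating the recommended ratio as a hard limit is an assumption made here for illustration, not documented API behavior.

```python
def format_size(width: int, height: int, max_ratio: float = 2.0) -> str:
    """Return a "width*height" string, rejecting ratios beyond max_ratio.

    Assumption: the 2:1 ratio is only a recommendation in the docs;
    this helper chooses to enforce it on the client side.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Compare the longer side against the shorter side.
    ratio = max(width, height) / min(width, height)
    if ratio > max_ratio:
        raise ValueError(
            f"aspect ratio {ratio:.2f}:1 exceeds the recommended {max_ratio:g}:1"
        )
    return f"{width}*{height}"


print(format_size(1024, 512))  # prints 1024*512
```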

💡 Example Prompt

Envision an ethereal and highly decorative portrait of an androgynous Elven Monarch, seated upon a throne carved from living iridescent wood within a moonlit glade. Intricate Art Nouveau details, luminous textures, soft-focus background, cinematic lighting.

💰 Pricing

Metric	Price
Per image generated	$0.02 / image
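
For budgeting, the per-image price can be turned into a quick batch estimate. This is a minimal sketch that assumes every generated image is billed at the flat $0.02 rate listed above; `estimate_cost` is a hypothetical helper, and any billing rules beyond the table (for example, for failed tasks) are not covered here.

```python
from decimal import Decimal

# Flat rate from the pricing table, in USD.
PRICE_PER_IMAGE = Decimal("0.02")


def estimate_cost(num_images: int) -> Decimal:
    """Estimate the USD cost of generating num_images images."""
    if num_images < 0:
        raise ValueError("num_images must not be negative")
    # Decimal avoids binary floating-point rounding in currency math.
    return PRICE_PER_IMAGE * num_images


print(estimate_cost(250))  # prints 5.00
```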

🎯 Use Cases

Concept Art & Illustration — Generate fantasy, sci-fi, or cinematic character art.
Visual Design & Branding — Create unique imagery for marketing, web, or product visuals.
Research & Visualization — Produce clear, detailed concept visuals from descriptive text.
Previsualization — Generate cinematic stills for film, animation, or game design workflows.

Authentication

For authentication details, please refer to the Authentication Guide.

API Endpoints

Submit Task & Query Result


# Submit the task
curl --location --request POST "https://api.wavespeed.ai/api/v3/wavespeed-ai/wan-2.1/text-to-image" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
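--data-raw '{
    "prompt": "An ethereal portrait of an androgynous Elven Monarch seated on a throne of living iridescent wood in a moonlit glade",
    "strength": 0.6,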
    "size": "1024*1024",
    "seed": -1,
    "output_format": "jpeg",
    "enable_base64_output": false,
    "enable_sync_mode": false
}'

# Get the result
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
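
The same two calls can be sketched in Python using only the standard library. The URLs and headers mirror the curl commands above; the function names (`build_submit_request`, `build_result_request`, `send`) and the 30-second timeout are illustrative choices, not part of the API.

```python
import json
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"
MODEL_PATH = "wavespeed-ai/wan-2.1/text-to-image"


def build_submit_request(prompt: str, api_key: str, **options) -> urllib.request.Request:
    """Build the POST request that submits a generation task."""
    payload = {"prompt": prompt, **options}
    return urllib.request.Request(
        f"{API_BASE}/{MODEL_PATH}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def build_result_request(task_id: str, api_key: str) -> urllib.request.Request:
    """Build the GET request that queries a task's result."""
    return urllib.request.Request(
        f"{API_BASE}/predictions/{task_id}/result",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def send(request: urllib.request.Request) -> dict:
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Usage (performs real network calls, so it is left commented out):
#   import os
#   key = os.environ["WAVESPEED_API_KEY"]
#   submitted = send(build_submit_request("A moonlit glade", key, size="1024*1024"))
#   result = send(build_result_request(submitted["data"]["id"], key))
```

Keeping request construction separate from sending makes the payload and headers easy to inspect before anything goes over the network.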

Parameters

Task Submission Parameters

Request Parameters

Parameter	Type	Required	Default	Range	Description
prompt	string	Yes		-	The positive prompt for the generation.
image	string	No		-	Reference image to generate from (optional).
strength	number	No	0.6	0.00 ~ 1.00	The extent to which the reference image is transformed.
size	string	No	1024*1024	256 ~ 1536 per dimension	The size of the generated image in pixels (width*height).
seed	integer	No	-1	-1 ~ 2147483647	The random seed to use for the generation. -1 means a random seed will be used.
output_format	string	No	jpeg	jpeg, png, webp	The format of the output image.
enable_base64_output	boolean	No	false	-	If enabled, the output is returned as a Base64-encoded string instead of a URL. This property is only available through the API.
enable_sync_mode	boolean	No	false	-	If set to true, the request waits until the result is generated and uploaded, then returns it directly in the response. This property is only available through the API.

Response Parameters

Parameter	Type	Description
code	integer	HTTP status code (e.g., 200 for success)
message	string	Status message (e.g., “success”)
data.id	string	Unique identifier for the prediction (the task ID)
data.model	string	Model ID used for the prediction
data.outputs	array	Array of URLs to the generated content (empty when status is not `completed`)
data.urls	object	Object containing related API endpoints
data.urls.get	string	URL to retrieve the prediction result
data.status	string	Status of the task: `created`, `processing`, `completed`, or `failed`
data.created_at	string	ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”)
data.error	string	Error message (empty if no error occurred)
data.timings	object	Object containing timing details
data.timings.inference	integer	Inference time in milliseconds
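
A submission response with these fields might be unpacked as follows. This is a sketch under stated assumptions: `parse_submission` is a hypothetical helper, the sample values (including the `abc123` ID and the model string) are made up for illustration, and treating any `code` other than 200 as a failure is a simplification.

```python
def parse_submission(response: dict) -> tuple[str, str]:
    """Return (task_id, result_url) from a task submission response."""
    if response.get("code") != 200:
        raise RuntimeError(f"submission failed: {response.get('message')}")
    data = response["data"]
    if data.get("error"):
        raise RuntimeError(f"task error: {data['error']}")
    return data["id"], data["urls"]["get"]


# Illustrative response only; every value here is invented.
sample = {
    "code": 200,
    "message": "success",
    "data": {
        "id": "abc123",
        "model": "wavespeed-ai/wan-2.1/text-to-image",
        "outputs": [],
        "urls": {"get": "https://api.wavespeed.ai/api/v3/predictions/abc123/result"},
        "status": "created",
        "created_at": "2023-04-01T12:34:56.789Z",
        "error": "",
        "timings": {"inference": 0},
    },
}

task_id, result_url = parse_submission(sample)
print(task_id)  # prints abc123
```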

Result Request Parameters

Parameter	Type	Required	Default	Description
id	string	Yes	-	Task ID

Result Response Parameters

Parameter	Type	Description
code	integer	HTTP status code (e.g., 200 for success)
message	string	Status message (e.g., “success”)
data	object	The prediction data object containing all details
data.id	string	Unique identifier for the prediction (the task ID that was queried)
data.model	string	Model ID used for the prediction
data.outputs	array	Array of URLs to the generated content (empty when status is not `completed`)
data.urls	object	Object containing related API endpoints
data.urls.get	string	URL to retrieve the prediction result
data.status	string	Status of the task: `created`, `processing`, `completed`, or `failed`
data.created_at	string	ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”)
data.error	string	Error message (empty if no error occurred)
data.timings	object	Object containing timing details
data.timings.inference	integer	Inference time in milliseconds
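
Because a task moves through `created` and `processing` before reaching `completed` or `failed`, a client typically polls the result endpoint until it sees a final status. The sketch below takes the fetch step as a callable so it does not depend on any particular HTTP library; the function name, the one-second interval, and the 60-attempt cap are assumptions, not documented recommendations.

```python
import time
from typing import Callable


def wait_for_outputs(
    fetch_result: Callable[[], dict],
    interval_s: float = 1.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> list:
    """Poll until the task completes and return data.outputs.

    fetch_result is any callable that returns the decoded JSON body
    of the result endpoint for one task.
    """
    for _ in range(max_attempts):
        data = fetch_result()["data"]
        status = data["status"]
        if status == "completed":
            return data["outputs"]
        if status == "failed":
            raise RuntimeError(data.get("error") or "task failed")
        # Still "created" or "processing": wait before asking again.
        sleep(interval_s)
    raise TimeoutError(f"task did not finish after {max_attempts} attempts")
```

Passing `sleep` as a parameter keeps the loop easy to exercise with canned responses and no real delay.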
