Wan 2.1 Text To Image
Wan 2.1 Text-to-Image delivers ultra-realistic photographic images by adapting the Wan 2.1 video model for SOTA visual fidelity. It is available on WavespeedAI as a ready-to-use REST inference API with no cold starts and affordable pricing.
Features
wan-2.1/text-to-image
Wan 2.1 Text-to-Image is part of the Wan 2.1 foundation model suite, an AI system developed for video and image generation. This model focuses on text-to-image synthesis: it transforms detailed written prompts into vivid, high-resolution visuals with cinematic precision.
🌟 Key Features
- 🎨 SOTA Image Quality: Built on Wan 2.1’s next-generation video foundation, this model produces exceptional still-frame quality with realistic lighting, texture, and depth.
- 🧠 Multilingual Understanding: Supports both Chinese and English prompts, ensuring accurate and context-rich image generation across languages.
- ⚙️ Fine Control with Parameters: Adjustable inputs such as strength, width, and height give creators direct control over composition and style.
- 🪄 Powerful Visual Consistency: Based on Wan-VAE, enabling coherent detail, color fidelity, and stylistic alignment across resolutions.
- 💰 Lightweight and Efficient: High-quality generation at a base cost of just $0.02 per image, ideal for scalable creative workflows.
⚙️ Parameters
| Parameter | Description |
|---|---|
| prompt* | Text description of the image to be generated (supports CN/EN). |
| image | (Optional) Upload a reference image for guided generation. |
| strength | Controls how strongly the reference image is transformed (0–1). |
| size (width / height) | Define custom output resolution; max recommended ratio 2:1. |
| seed | Fix for reproducibility or randomize for variation. |
| output_format | Choose from jpeg, png, or webp. |
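As a rough illustration of the size parameter, the sketch below formats a width and height into the width*height string the API expects. The helper names are hypothetical. The 256 to 1536 pixel bounds are taken from the request parameter table further down this page, and because the 2:1 ratio is only a recommendation, it is reported rather than enforced.

```python
def format_size(width: int, height: int, min_dim: int = 256, max_dim: int = 1536) -> str:
    """Return a 'width*height' string, rejecting out-of-range dimensions."""
    for name, value in (("width", width), ("height", height)):
        if not min_dim <= value <= max_dim:
            raise ValueError(f"{name} must be between {min_dim} and {max_dim}, got {value}")
    return f"{width}*{height}"


def within_recommended_ratio(width: int, height: int, max_ratio: float = 2.0) -> bool:
    """True if the longer side is at most max_ratio times the shorter side."""
    return max(width, height) / min(width, height) <= max_ratio
```

For example, format_size(1024, 768) yields a value that can be passed as size, while within_recommended_ratio(1536, 512) flags a 3:1 canvas as outside the recommended range.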
💡 Example Prompt
Envision an ethereal and highly decorative portrait of an androgynous Elven Monarch, seated upon a throne carved from living iridescent wood within a moonlit glade. Intricate Art Nouveau details, luminous textures, soft-focus background, cinematic lighting.
💰 Pricing
| Metric | Price |
|---|---|
| Per image generated | $0.02 / image |
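For budgeting, the per-image price can be multiplied out directly. The sketch below assumes a flat $0.02 per generated image with no volume tiers or other adjustments, which this page does not confirm. The function name is hypothetical.

```python
from decimal import Decimal

# Assumption: flat per-image price taken from the pricing table above.
PRICE_PER_IMAGE = Decimal("0.02")


def estimate_cost(num_images: int) -> Decimal:
    """Estimated cost in USD for generating num_images images."""
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    return PRICE_PER_IMAGE * num_images
```

Decimal is used instead of float so that totals do not pick up binary rounding noise.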
🎯 Use Cases
- Concept Art & Illustration — Generate fantasy, sci-fi, or cinematic character art.
- Visual Design & Branding — Create unique imagery for marketing, web, or product visuals.
- Research & Visualization — Produce clear, detailed concept visuals from descriptive text.
- Previsualization — Generate cinematic stills for film, animation, or game design workflows.
Authentication
For authentication details, please refer to the Authentication Guide.
API Endpoints
Submit Task & Query Result
# Submit the task
curl --location --request POST "https://api.wavespeed.ai/api/v3/wavespeed-ai/wan-2.1/text-to-image" \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
  --data-raw '{
    "prompt": "An ethereal, highly decorative portrait of an androgynous Elven Monarch seated on a throne of living iridescent wood in a moonlit glade",
    "strength": 0.6,
    "size": "1024*1024",
    "seed": -1,
    "output_format": "jpeg",
    "enable_base64_output": false,
    "enable_sync_mode": false
  }'
# Get the result
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
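The same submit-then-query flow can be sketched in Python with only the standard library. This is a minimal illustration, not an official client: the function names, the polling interval, and the retry limit are assumptions, as is the idea that the data.id field of the submit response is the value to use for requestId in the result URL.

```python
import json
import time
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"
MODEL_PATH = "wavespeed-ai/wan-2.1/text-to-image"


def build_submit_request(api_key: str, payload: dict) -> urllib.request.Request:
    """POST request mirroring the 'Submit the task' curl command."""
    return urllib.request.Request(
        f"{API_BASE}/{MODEL_PATH}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def build_result_request(api_key: str, request_id: str) -> urllib.request.Request:
    """GET request mirroring the 'Get the result' curl command."""
    return urllib.request.Request(
        f"{API_BASE}/predictions/{request_id}/result",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def generate(api_key: str, payload: dict, poll_seconds: float = 1.0, max_polls: int = 120) -> list:
    """Submit a task, then poll until it is completed or failed."""
    with urllib.request.urlopen(build_submit_request(api_key, payload)) as resp:
        # Assumption: data.id from the submit response is the request ID.
        request_id = json.load(resp)["data"]["id"]
    for _ in range(max_polls):
        with urllib.request.urlopen(build_result_request(api_key, request_id)) as resp:
            data = json.load(resp)["data"]
        if data["status"] == "completed":
            return data["outputs"]
        if data["status"] == "failed":
            raise RuntimeError(data.get("error") or "generation failed")
        time.sleep(poll_seconds)  # still created or processing
    raise TimeoutError(f"prediction {request_id} did not finish in time")
```

A call such as generate(os.environ["WAVESPEED_API_KEY"], {"prompt": "a red fox in snow"}) would then return the list of output URLs once the task completes. The request builders are kept separate from the network calls so they can be inspected without sending anything.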
Parameters
Task Submission Parameters
Request Parameters
| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
| prompt | string | Yes | - | - | The positive prompt for the generation. |
| image | string | No | - | - | The image to generate an image from (optional). |
| strength | number | No | 0.6 | 0.00 ~ 1.00 | How strongly the reference image is transformed. |
| size | string | No | 1024*1024 | 256 ~ 1536 per dimension | The size of the generated media in pixels (width*height). |
| seed | integer | No | -1 | -1 ~ 2147483647 | The random seed to use for the generation. -1 means a random seed will be used. |
| output_format | string | No | jpeg | jpeg, png, webp | The format of the output image. |
| enable_base64_output | boolean | No | false | - | If enabled, the output will be encoded into a BASE64 string instead of a URL. This property is only available through the API. |
| enable_sync_mode | boolean | No | false | - | If set to true, the function will wait for the result to be generated and uploaded before returning the response. It allows you to get the result directly in the response. This property is only available through the API. |
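The defaults and ranges in this table can be checked on the client before a request is sent. The sketch below is one possible way to do that. The function name and the choice to raise ValueError are assumptions, and the size string is passed through unchecked here.

```python
from typing import Optional

ALLOWED_FORMATS = ("jpeg", "png", "webp")
MAX_SEED = 2147483647


def build_payload(
    prompt: str,
    image: Optional[str] = None,
    strength: float = 0.6,
    size: str = "1024*1024",
    seed: int = -1,
    output_format: str = "jpeg",
    enable_base64_output: bool = False,
    enable_sync_mode: bool = False,
) -> dict:
    """Build a request body using the documented defaults and ranges."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    if not -1 <= seed <= MAX_SEED:
        raise ValueError(f"seed must be between -1 and {MAX_SEED}")
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"output_format must be one of {ALLOWED_FORMATS}")
    payload = {
        "prompt": prompt,
        "strength": strength,
        "size": size,  # 'width*height' string, not validated in this sketch
        "seed": seed,
        "output_format": output_format,
        "enable_base64_output": enable_base64_output,
        "enable_sync_mode": enable_sync_mode,
    }
    if image is not None:
        payload["image"] = image  # optional reference image
    return payload
```

The optional image key is omitted entirely when no reference image is given, rather than being sent as null.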
Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data.id | string | Unique identifier for the prediction (task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
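To show how these fields might be consumed, the sketch below maps a decoded JSON response onto a small dataclass. The class and function names are hypothetical, and missing optional fields are treated leniently, which is an assumption rather than documented behavior.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Prediction:
    id: str
    status: str
    outputs: List[str]
    error: str
    inference_ms: Optional[int]


def parse_prediction(body: dict) -> Prediction:
    """Extract the commonly used fields from a decoded response body."""
    data = body["data"]
    timings = data.get("timings") or {}
    return Prediction(
        id=data["id"],
        status=data["status"],
        outputs=list(data.get("outputs") or []),  # empty until completed
        error=data.get("error") or "",
        inference_ms=timings.get("inference"),
    )


def is_finished(prediction: Prediction) -> bool:
    """True once the task has reached a terminal status."""
    return prediction.status in ("completed", "failed")
```

A caller would typically keep polling while is_finished returns False, then read outputs on completed or error on failed.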