
Wan 2.6 Text to Image


Wan 2.6 Text-to-Image generates high-quality images from natural-language prompts with strong prompt adherence and clean composition. It supports multiple aspect ratios and size control, seed-based reproducibility, and flexible styles (photorealistic to illustrative) for ads, product shots, and social visuals. It is built for stable production use, with a ready-to-use REST API, no cold starts, and predictable pricing.

Input
width × height: 1024 × 1024 px (range: 768 - 1440)
enable_prompt_expansion: if set to true, the prompt optimizer is enabled.


$0.03 per run · ~33 runs per $1

Examples

An extreme close-up documentary shot of a human face in brutal Arctic cold, eyelashes completely frozen and coated in thick ice crystals, frozen breath crystallizing in the air, skin slightly red from negative 50°C temperatures, hyper-realistic cinematic lighting, shallow depth of field, every frost particle sharply detailed, realistic cold blue color tones, shot on an ARRI Alexa 65 with a macro lens, natural film grain, Netflix-style documentary realism.


a small girl with black twin-tail hair, sitting with her legs drawn together in front of her, smoking a cigarette, angel wings attached to her back, gently fluttering, flat solid gray background, no gradient, uniform monochrome, 3D pixel art style, voxel art, blocky geometry, anime-style character design, stylized proportions, minimal facial detail, low-resolution yet three-dimensional pixels, minimalistic composition, quiet and subdued mood, slightly surreal atmosphere, cinematic framing, soft but gloomy lighting --ar 58:77 --video 1


Jumping wolf motif that is one colour. The wolf is in similar style as Jankovics Marcell's Fehérlófia. As the wolf body looks like as flames. the wolf, standing in a snowy mountain landscape, minimalist ink sketch style, black and white only, sharp eyes, calm but tense posture, hand-drawn animation look, no fur details, abstract form, high contrast, rough texture --ar 1:1


dark fantasy 1980s DVD screengrab of a crusader raising his sword in a traditional early middle ages church ar 3:2 --ar 1:1


A modern tea shop interior, warm afternoon light, minimalist wood design, cinematic photography, medium shot, shallow depth of field, 35mm look, clean lines, natural shadows, soft highlights, cozy seating, neatly arranged tea bar, high detail

Negative prompt: blurry, low-res, watermark, text, logo, cluttered background, overexposed, underexposed, distortion, fisheye, noise


A mix collage with rapper, diamond, concert, neons, scratch paper, lyrics on paper, racing cars, money, and girls with a futuristic vibe



README

Wan 2.6 Text-to-Image

Wan 2.6 Text-to-Image (/wan-2.6/text-to-image) is Alibaba's text-to-image generation model for creating high-quality visuals from a single natural-language prompt. It is built for practical creative workflows (concept art, product visuals, portraits, and stylized imagery) where you want strong prompt adherence plus flexible custom sizing.

Why it stands out

  • Fast, one-shot text-to-image generation: Generate an image in a single run for quick ideation and production workflows.

  • Custom width × height output: Set width and height directly (within the endpoint’s limits) to match banners, thumbnails, posters, or social formats.

  • Prompt expansion for better results: Enable prompt expansion to automatically enrich short prompts with useful detail for more coherent compositions.

  • Seeded iteration: Use a fixed seed to refine style and layout with more repeatable variations.

Parameters

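| Parameter | Description |
| --- | --- |
| prompt (required) | Text description of the image you want to generate. |
| width | Output width (within allowed limits). |
| height | Output height (within allowed limits). |
| enable_prompt_expansion | Toggle prompt expansion to enrich short prompts. |
| seed | Set a fixed seed for more repeatable iterations (-1 for random). |

In the API examples on this page, width and height are sent together as a single size string, for example "1024*1024".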
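
A request body can be assembled from these parameters with a small helper. The sketch below is illustrative only: `build_payload` is a hypothetical name, the 768 to 1440 bounds are taken from the range shown in the playground form (treating them as per-side limits is an assumption), and the `size` string format follows the API examples on this page.

```python
import json

# Bounds from the playground form ("Range: 768 - 1440"). Treating them as
# per-side pixel limits for both width and height is an assumption.
MIN_SIDE = 768
MAX_SIDE = 1440


def build_payload(prompt, width=1024, height=1024,
                  enable_prompt_expansion=False, seed=-1):
    """Return a request body dict for the endpoint (hypothetical helper)."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    for name, value in (("width", width), ("height", height)):
        if not MIN_SIDE <= value <= MAX_SIDE:
            raise ValueError(f"{name}={value} is outside {MIN_SIDE}-{MAX_SIDE}")
    return {
        "prompt": prompt.strip(),
        # The API examples on this page send the size as "<width>*<height>".
        "size": f"{width}*{height}",
        "enable_prompt_expansion": bool(enable_prompt_expansion),
        "seed": int(seed),  # -1 asks for a random seed
    }


body = build_payload("A cinematic shot of a city at sunset", 1280, 768, seed=42)
print(json.dumps(body, indent=2))
```

Validating locally before submitting avoids paying a round trip for a request that the endpoint would reject; the real limits are whatever the API reference states.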

How to use

  1. Write a clear prompt (subject + setting + style).
  2. Choose width and height that match your target aspect ratio.
  3. Turn on enable_prompt_expansion if your prompt is short or under-specified.
  4. Set a seed if you want repeatable iterations (keep the same seed while you tweak the prompt).
  5. Click Run, review the result, and iterate.
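
For step 2, a target aspect ratio can be converted to a concrete width and height. This is a minimal sketch, assuming the 768 to 1440 range shown in the playground form applies to each side; `size_for_aspect` is a hypothetical helper, not part of any SDK.

```python
MIN_SIDE = 768   # lower bound shown in the playground form (assumed per side)
MAX_SIDE = 1440  # upper bound shown in the playground form (assumed per side)


def size_for_aspect(ratio_w, ratio_h, long_side=MAX_SIDE):
    """Return (width, height) for ratio_w:ratio_h with the longer side = long_side."""
    if ratio_w <= 0 or ratio_h <= 0:
        raise ValueError("ratio terms must be positive")
    if not MIN_SIDE <= long_side <= MAX_SIDE:
        raise ValueError(f"long_side must be within {MIN_SIDE}-{MAX_SIDE}")
    if ratio_w >= ratio_h:
        width = long_side
        height = round(long_side * ratio_h / ratio_w)
    else:
        height = long_side
        width = round(long_side * ratio_w / ratio_h)
    short_side = min(width, height)
    if short_side < MIN_SIDE:
        # Very wide or very tall ratios cannot fit inside the allowed range.
        raise ValueError(
            f"{ratio_w}:{ratio_h} needs a short side of {short_side} px, "
            f"below the {MIN_SIDE} px minimum"
        )
    return width, height


print(size_for_aspect(16, 9))        # (1440, 810)
print(size_for_aspect(3, 4, 1024))   # (768, 1024)
```

Under these assumed bounds, a ratio such as 21:9 raises an error because its short side would drop below 768 px even at the maximum long side.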

Prompt tips

  • Start with subject + environment + style: “A modern tea shop interior, warm afternoon light, minimalist wood design, cinematic photography.”
  • Add camera / composition when framing matters: “wide shot, shallow depth of field, 35mm film look.”
  • Keep instructions positive and specific (what you want to see, not what you fear).
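
The subject + environment + style pattern from these tips can be kept consistent across a batch by composing prompts from parts. The helper below is a hypothetical sketch; the function name and the comma-joined format are illustrative choices, not something the model requires.

```python
def compose_prompt(subject, environment=None, style=None, camera=None):
    """Join the non-empty parts into one comma-separated prompt string."""
    if not subject or not subject.strip():
        raise ValueError("a subject is required")
    parts = [subject, environment, style, camera]
    # Drop missing parts and trailing punctuation so the joins stay clean.
    cleaned = [p.strip().rstrip(",.") for p in parts if p and p.strip()]
    return ", ".join(cleaned)


prompt = compose_prompt(
    "A modern tea shop interior",
    environment="warm afternoon light",
    style="minimalist wood design, cinematic photography",
    camera="wide shot, shallow depth of field, 35mm film look",
)
print(prompt)
```

Keeping the parts separate makes it easy to change one element (for example the camera framing) while holding the seed and the rest of the prompt fixed.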

Pricing

  • $0.03 per generated image
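
As a quick budgeting aid, the per-image price translates into batch cost and images per budget as follows. This sketch assumes a flat $0.03 per image; the FAQ on this page notes that the final charge can scale with the parameters you set, so treat the result as an estimate. The function names are illustrative.

```python
from decimal import Decimal

PRICE_PER_IMAGE = Decimal("0.03")  # USD, from the pricing line above


def cost_for(images):
    """Estimated cost in USD for a number of generated images."""
    return PRICE_PER_IMAGE * images


def images_for(budget_usd):
    """Whole images that fit into a budget at the flat per-image price."""
    return int(Decimal(str(budget_usd)) // PRICE_PER_IMAGE)


print(cost_for(250))    # 7.50
print(images_for(1))    # 33
```

`Decimal` is used instead of floats so that repeated multiplication by 0.03 does not accumulate rounding error.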

Notes

  • Output sizing is limited by the endpoint’s current constraints (for example, width/height bounds and aspect-ratio limits). If a size fails, reduce resolution or choose a more standard aspect ratio.
  • Enabling prompt expansion can improve quality for short prompts, but may add a little latency.
  • Returned image URLs may be time-limited—save outputs if you need long-term storage.
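
Because returned URLs may expire, it can help to copy each output to local storage as soon as a run completes. The following is a minimal standard-library sketch; `save_output`, the `outputs` directory, and the `output.png` fallback name are all hypothetical choices.

```python
import shutil
import urllib.parse
import urllib.request
from pathlib import Path


def save_output(url, dest_dir="outputs", filename=None, timeout=60):
    """Download url into dest_dir and return the local Path."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Fall back to the last path segment of the URL when no name is given.
    url_name = Path(urllib.parse.urlparse(url).path).name
    target = dest / (filename or url_name or "output.png")
    with urllib.request.urlopen(url, timeout=timeout) as response, \
            open(target, "wb") as fh:
        shutil.copyfileobj(response, fh)  # stream to disk in chunks
    return target
```

A call such as `save_output(output_url)` would then store the file under `outputs/`; for long-term retention, the same function could write to a mounted bucket or be followed by an upload step.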

Related Models

  • Wan 2.5 Text-to-Image — A proven Wan text-to-image model for reliable, cost-stable AI image generation with a similar prompt-first workflow.
  • Seedream V4 Text-to-Image — A style-consistent text-to-image generator for posters, campaigns, and high-volume brand-friendly illustration batches.
  • FLUX.2 Turbo Edit — A fast natural-language image editing model for precise image-to-image transformations, brand color control, and iterative creative revisions.
  • Google Nano Banana Pro Edit — High-fidelity prompt-based image editing for composition-preserving changes, product visuals, and reliable on-image text handling.
Access: This website uses AI models provided by third parties.

Wan 2.6 Text To Image API — Quick start

Grab a WaveSpeedAI API key, then call POST https://api.wavespeed.ai/api/v3/alibaba/wan-2.6/text-to-image with your input as JSON. The endpoint returns a prediction id; poll the prediction endpoint until status flips to completed, then read the output URL from data.outputs[0]. The examples below show this flow for Wan 2.6 Text To Image.

HTTP example
# Submit the prediction
curl -X POST "https://api.wavespeed.ai/api/v3/alibaba/wan-2.6/text-to-image" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY" \
  -d '{
    "prompt": "A cinematic shot of a city at sunset, soft golden light",
    "size": "1024*1024",
    "enable_prompt_expansion": false,
    "seed": -1
}'

# Response includes a prediction id. Poll for the result:
curl -X GET "https://api.wavespeed.ai/api/v3/predictions/{request_id}/result" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY"

# When status is "completed", read the output from data.outputs[0].

Node.js example
// npm install wavespeed
const WaveSpeed = require('wavespeed');

const client = new WaveSpeed(); // reads WAVESPEED_API_KEY from env

// CommonJS modules do not allow top-level await, so run the call inside an async function.
async function main() {
  const result = await client.run("alibaba/wan-2.6/text-to-image", {
    prompt: "A cinematic shot of a city at sunset, soft golden light",
    size: "1024*1024",
    enable_prompt_expansion: false,
    seed: -1
  });

  console.log(result.outputs[0]); // → URL of the generated output
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});

Python example
# pip install wavespeed
import wavespeed

output = wavespeed.run(
    "alibaba/wan-2.6/text-to-image",
    {
        "prompt": "A cinematic shot of a city at sunset, soft golden light",
        "size": "1024*1024",
        "enable_prompt_expansion": False,  # Python booleans are capitalized
        "seed": -1,
    },
)

print(output["outputs"][0])  # → URL of the generated output
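
The submit-then-poll flow from the quick start can also be sketched with the Python standard library alone, which makes the polling logic explicit. The endpoint URLs, the "completed" status, and `data.outputs[0]` come from this page; the location of the prediction id (`data.id`), the position of `status` inside `data`, the "failed" status name, and the `error` field are assumptions, so check them against the API reference before relying on this.

```python
import json
import time
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"
MODEL_PATH = "alibaba/wan-2.6/text-to-image"


def http_json(method, url, api_key, body=None, timeout=30):
    """Send a JSON request with a Bearer token and return the decoded response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(url, data=data, method=method)
    request.add_header("Authorization", f"Bearer {api_key}")
    if data is not None:
        request.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def wait_for_output(fetch_result, interval=1.0, max_wait=120.0, sleep=time.sleep):
    """Call fetch_result() until status is "completed"; return data.outputs[0]."""
    waited = 0.0
    while True:
        data = fetch_result().get("data", {})
        status = data.get("status")  # assumed to live under "data"
        if status == "completed":
            return data["outputs"][0]
        if status == "failed":  # assumed failure status name
            raise RuntimeError(f"prediction failed: {data.get('error')}")
        if waited >= max_wait:
            raise TimeoutError(f"no result after {max_wait} seconds")
        sleep(interval)
        waited += interval


def generate(payload, api_key):
    """Submit a prediction, then poll its result endpoint for the output URL."""
    submitted = http_json("POST", f"{API_BASE}/{MODEL_PATH}", api_key, payload)
    request_id = submitted["data"]["id"]  # assumed location of the prediction id
    result_url = f"{API_BASE}/predictions/{request_id}/result"
    return wait_for_output(lambda: http_json("GET", result_url, api_key))
```

Calling `generate({"prompt": "...", "size": "1024*1024", "enable_prompt_expansion": False, "seed": -1}, api_key)` with a real key would return the output URL. The `fetch_result` and `sleep` arguments are injectable so the polling loop can be exercised without network access.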

Wan 2.6 Text To Image API — Frequently asked questions

What is the Wan 2.6 Text To Image API?

Wan 2.6 Text To Image is an Alibaba model for image generation, exposed as a REST API on WaveSpeedAI. It generates high-quality images from natural-language prompts with strong prompt adherence and clean composition. It supports multiple aspect ratios and size control, seed-based reproducibility, and flexible styles (photorealistic to illustrative) for ads, product shots, and social visuals. It is built for stable production use, with a ready-to-use REST API, no cold starts, and predictable pricing. You can call it programmatically or try it from the playground above.

How do I call the Wan 2.6 Text To Image API?

POST your input parameters to the model's REST endpoint (shown in the API tab of this playground) with your WaveSpeedAI API key in the Authorization header. Submission returns a prediction ID; poll the prediction endpoint until status flips to "completed", then read the output URL from the result. The playground generates a ready-to-paste code sample in Python, JavaScript, or cURL for whatever inputs you've set. Full request/response shape is documented at https://wavespeed.ai/docs/docs-api/alibaba/alibaba-wan-2.6-text-to-image.

How much does Wan 2.6 Text To Image cost per run?

Wan 2.6 Text To Image starts at $0.030 per run. That figure is the base price; the final charge scales with the parameters you set in the form (output size, length, count, references, or any other options this model exposes), so a higher-quality or larger output costs more than a minimal one. The exact cost for your current input is shown live next to the Generate button before you submit, and the actual per-call charge is recorded on the prediction afterwards.

What inputs does Wan 2.6 Text To Image accept?

Key inputs: `prompt`, `size`, `seed`, `enable_prompt_expansion`. The full JSON schema (types, defaults, allowed values) is rendered above the Generate button and mirrored in the API reference at https://wavespeed.ai/docs/docs-api/alibaba/alibaba-wan-2.6-text-to-image.

How long does Wan 2.6 Text To Image take to generate?

Average end-to-end generation time on WaveSpeedAI is around 9 seconds per request, measured across recent runs. Queue time scales with global demand; live status is visible in the prediction record.

Can I use Wan 2.6 Text To Image outputs commercially?

Commercial usage rights depend on the model's license, set by its provider (Alibaba). The license summary appears on the model card above; see WaveSpeedAI's Terms of Service for platform-level conditions.