Alibaba WAN 2.7

alibaba/wan-2.7/text-to-image

Alibaba Wan 2.7 Text-to-Image generates images from text prompts, with a thinking mode that reasons about the prompt before generating to improve image quality. It is served through a ready-to-use REST inference API with no cold starts.


Your request will cost $0.03 per run.

For $1 you can run this model approximately 33 times.

Examples

a group of animals standing in line to buy coffee, side view, anthropomorphic animals, a dog, a cat, a raccoon and a rabbit waiting in a queue, holding coffee cups, modern coffee shop counter, barista in background, casual daily scene, natural behavior, soft morning light, realistic environment, cinematic composition, 35mm photography, shallow depth of field, warm tones, high detail, ultra realistic
Close-up portrait of a model whose face is partially covered in flowing liquid metal or an iridescent, second-skin-like substance. She has otherworldly, light purple eyes and stares directly into the camera. The background is completely blurred out, leaving only a soft halo of light. The lighting is even and ethereal, as if from a bioluminescent source. Inspired by the style of Nick Knight, the image emphasizes surreal textures and subtle color gradients, exceptionally sharp, with breathtaking detail, 16K.
A mix collage with rapper, diamond, concert, neons, scratch paper, lyrics on paper, racing cars, money, and girls with a futuristic vibe
A fair-skinned model with classical beauty, lounging on a velvet chaise lounge, surrounded by old books and withered roses. She is wearing a baroque-style lace gown, her expression is languid and contemplative. The scene is a dim, old library, with a single stream of Rembrandt-style light from a side window illuminating her face and figure. Composition inspired by a John William Waterhouse painting, rich in narrative. The overall tones are deep and heavy, with strong chiaroscuro, creating an oil painting texture and detail.

README

Alibaba Wan 2.7 Text-to-Image

Alibaba Wan 2.7 Text-to-Image (alibaba/wan-2.7/text-to-image) is Alibaba's latest text-to-image generation model with built-in thinking mode for enhanced reasoning and higher-quality outputs. It's designed for creative and production workflows—concept art, product visuals, portraits, and stylized imagery—where you want strong prompt adherence, flexible sizing, and smarter composition powered by chain-of-thought reasoning.

Why it stands out

  • Thinking mode for smarter generation: Built-in thinking mode lets the model reason about prompt intent before generating, producing more coherent compositions and better prompt adherence.

  • Fast, one-shot text-to-image generation: Generate an image in a single run for quick ideation and production workflows.

  • Custom size output: Set the output size directly (512–4096 per dimension) to match banners, thumbnails, posters, or social formats. Total pixels must be between 768×768 and 2048×2048, with an aspect ratio between 1:8 and 8:1.

  • Seeded iteration: Use a fixed seed to refine style and layout with more repeatable variations.
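The three size limits interact: a size can pass the per-dimension range and still fail the total-pixel rule (512 by 512 is one such case). Below is a minimal Python sketch of the check, assuming all limits are inclusive. `check_size` is a hypothetical helper written for illustration, not part of any official SDK.

```python
def check_size(width: int, height: int) -> list[str]:
    """Return a list of violated size rules; an empty list means the size looks valid."""
    problems = []

    # Rule 1: each dimension must be within 512-4096 (assumed inclusive).
    for name, value in (("width", width), ("height", height)):
        if not 512 <= value <= 4096:
            problems.append(f"{name}={value} is outside 512-4096")

    # Rule 2: total pixels must be between 768x768 and 2048x2048.
    total = width * height
    if not 768 * 768 <= total <= 2048 * 2048:
        problems.append(f"total pixels {total} is outside {768 * 768}-{2048 * 2048}")

    # Rule 3: aspect ratio between 1:8 and 8:1, checked with integers
    # so that no floating-point rounding is involved.
    if max(width, height) > 8 * min(width, height):
        problems.append(f"aspect ratio {width}:{height} is outside 1:8-8:1")

    return problems


print(check_size(1536, 1024))  # valid: no problems reported
print(check_size(512, 512))    # in range per dimension, but too few total pixels
print(check_size(4096, 4096))  # in range per dimension, but too many total pixels
```

Running a check like this before submitting a request avoids paying the round trip for a size the service would reject.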

Parameters

Parameter | Description
prompt (required) | Text description of the image you want to generate.
size | Output size in pixels, written as width*height. Range: 512–4096 per dimension. Default: 1024*1024. Total pixels: 768×768 to 2048×2048. Aspect ratio: 1:8 to 8:1.
thinking_mode | Enable thinking mode for enhanced reasoning and better image quality (default: true). Increases generation time.
seed | Set a fixed seed for more repeatable iterations (-1 for random).
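As a rough illustration, the parameters above could be assembled into a JSON request body like this. This is a sketch under assumptions: it supposes the API accepts a JSON object keyed by the parameter names listed here, with `size` serialized as a `width*height` string and `-1` as the default seed. The endpoint URL and authentication are not described on this page, so they are left out. `build_request` is a hypothetical helper.

```python
import json


def build_request(prompt: str, width: int = 1024, height: int = 1024,
                  thinking_mode: bool = True, seed: int = -1) -> dict:
    """Assemble the documented parameters into a dict (assumed JSON body shape)."""
    if not prompt.strip():
        raise ValueError("prompt is required and must not be empty")
    return {
        "prompt": prompt,
        "size": f"{width}*{height}",     # assumed "width*height" string format
        "thinking_mode": thinking_mode,  # documented default: true
        "seed": seed,                    # -1 asks for a random seed
    }


body = build_request(
    "A modern tea shop interior, warm afternoon light, minimalist wood design",
    width=1536,
    height=1024,
    seed=42,
)
print(json.dumps(body, indent=2))
```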

How to use

  1. Write a clear prompt (subject + setting + style).
  2. Choose a size that matches your target aspect ratio (e.g. 1024*1024, 1024*1536, 1536*1024).
  3. Leave thinking_mode enabled (default) for best quality, or disable it for faster generation.
  4. Set a seed if you want repeatable iterations (keep the same seed while you tweak the prompt).
  5. Click Run, review the result, and iterate.
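For scripted use, the seeded-iteration step above can be sketched as a list of parameter sets that share one seed and size while only the prompt changes. `plan_iterations` is a hypothetical helper and the dict keys simply mirror the parameter names on this page. Actually submitting each run is left out because the request details are not covered here.

```python
def plan_iterations(prompts, seed, size="1024*1024", thinking_mode=True):
    """Build one parameter set per prompt, all sharing the same seed and size."""
    if seed < 0:
        raise ValueError("use a non-negative seed; -1 requests a random seed")
    return [
        {"prompt": p, "size": size, "thinking_mode": thinking_mode, "seed": seed}
        for p in prompts
    ]


runs = plan_iterations(
    [
        "A modern tea shop interior, warm afternoon light",
        "A modern tea shop interior, warm afternoon light, wide shot",
        "A modern tea shop interior, warm afternoon light, wide shot, 35mm film look",
    ],
    seed=1234,
    size="1536*1024",
)
for run in runs:
    print(run["seed"], run["size"], run["prompt"])
```

Holding the seed and size constant makes it easier to attribute a change in the output to the prompt edit rather than to random variation.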

Prompt tips

  • Start with subject + environment + style: "A modern tea shop interior, warm afternoon light, minimalist wood design, cinematic photography."
  • Add camera / composition when framing matters: "wide shot, shallow depth of field, 35mm film look."
  • Keep instructions positive and specific: describe what you want to see rather than what you want to avoid.
  • With thinking mode enabled, the model handles short or ambiguous prompts better—but detailed prompts still yield the best results.

Pricing

  • $0.03 per generated image
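A quick way to budget a batch at this rate is sketched below. It assumes a flat $0.03 for every generated image, with no volume discounts or other charges. `Decimal` is used so that cents do not pick up floating-point error, and both function names are hypothetical.

```python
from decimal import Decimal

PRICE_PER_IMAGE = Decimal("0.03")  # USD, from the pricing above


def batch_cost(image_count: int) -> Decimal:
    """Total cost in USD for a number of generated images."""
    return PRICE_PER_IMAGE * image_count


def images_for_budget(budget_usd: str) -> int:
    """Whole number of images a budget covers (rounded down)."""
    return int(Decimal(budget_usd) // PRICE_PER_IMAGE)


print(batch_cost(100))
print(images_for_budget("1"))
```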

Notes

  • Output size is 512–4096 pixels per dimension. Total pixels must be between 768×768 and 2048×2048, with aspect ratio between 1:8 and 8:1.
  • Thinking mode is enabled by default and improves quality, but adds some latency. Disable it if speed is the priority.
  • Returned image URLs may be time-limited—save outputs if you need long-term storage.
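Because returned URLs may expire, one option is to copy each image to local storage as soon as a run finishes. The sketch below uses only the Python standard library. It assumes the result gives you a direct image URL, and `save_output` is a hypothetical helper. The file name is taken from the URL path, so any query string (such as a signature) is ignored.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def save_output(url: str, dest_dir: str = "outputs") -> Path:
    """Download one image URL into dest_dir and return the local file path."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # Use the last path segment as the file name; fall back to a generic name.
    name = Path(urlparse(url).path).name or "image.png"
    target = dest / name

    # Stream the response to disk instead of loading it all into memory.
    with urllib.request.urlopen(url, timeout=60) as response, open(target, "wb") as fh:
        shutil.copyfileobj(response, fh)
    return target
```

Call it with the URL returned by a run, for example `save_output(image_url)`, where `image_url` stands for whatever URL your result contains. Note that two URLs ending in the same file name would overwrite each other in this simple version.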
