wavespeed-ai/qwen-image/text-to-image

Qwen-Image — a 20B MMDiT model for next-gen text-to-image generation.

If set to true, the function will wait for the result to be generated and uploaded before returning the response. It allows you to get the result directly in the response. This property is only available through the API.
If enabled, the output will be encoded into a BASE64 string instead of a URL. This property is only available through the API.
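Because the output can arrive either as a URL or as a BASE64 string, client code has to tell the two apart. The following is a minimal sketch, not the platform's client: the helper name `decode_output` is hypothetical, and the optional `data:` prefix handling is an assumption about how the string might be wrapped.

```python
import base64
from typing import Optional


def decode_output(value: str) -> Optional[bytes]:
    """Return image bytes for a BASE64 output, or None if the output is a URL."""
    if value.startswith(("http://", "https://")):
        return None  # URL output: the caller downloads it separately
    # Assumption: the string may carry a data-URI prefix like "data:image/png;base64,"
    if value.startswith("data:") and "," in value:
        value = value.split(",", 1)[1]
    return base64.b64decode(value)
```

A caller could write the returned bytes to a file, or fall back to downloading when the function returns `None`.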

Your request will cost $0.02 per run.

For $1 you can run this model approximately 50 times.
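The arithmetic behind that estimate can be sketched as follows. Only the $0.02 price comes from this page; the function names are illustrative, and `Decimal` is used so the division does not pick up floating-point rounding.

```python
from decimal import Decimal

PRICE_PER_RUN = Decimal("0.02")  # USD per run, as quoted above


def runs_for_budget(budget_usd: str) -> int:
    """Number of whole runs a budget covers at the quoted price."""
    return int(Decimal(budget_usd) // PRICE_PER_RUN)


def cost_of_runs(runs: int) -> Decimal:
    """Total cost in USD for a given number of runs."""
    return PRICE_PER_RUN * runs


print(runs_for_budget("1"))  # 50
```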

Examples

A beautiful Chinese woman wearing a "WaveSpeedAI" T-shirt is smiling at the camera with a black marker. Behind her, a glass panel reads in handwriting, "Meet Qwen Image - a powerful image foundation model capable of complex text rendering and precise image editing."
Bookstore window display. A sign displays "New Arrivals This Week". Below, a shelf tag with the text "Best-Selling Novels Here". To the side, a colorful poster advertises "Author Meet And Greet on Saturday" with a central portrait of the author. There are four books on the bookshelf, namely "The light between worlds" "When stars are scattered" "The silent patient" "The night circus"
A man in a suit is standing in front of the window, looking at the bright moon outside the window. The man is holding a yellowed paper with handwritten words on it: "A lantern moon climbs through the silver night, Unfurling quiet dreams across the sky, Each star a whispered promise wrapped in light, That dawn will bloom, though darkness wanders by." There is a cute cat on the windowsill.
A Victorian noble lady with an elegant updo and a gentle gaze, wearing a deep red velvet dress, sitting in an ornate library. Warm candlelight illuminates her face and the surrounding bookshelves. In the style of John Singer Sargent, classic oil painting, expressive brushstrokes, masterpiece, rich textures.
A movie poster. The first row is the movie title, which reads "Imagination Unleashed". The second row is the movie subtitle, which reads "Enter a world beyond your imagination". The third row reads "Cast: Qwen-Image". The fourth row reads "Director: The Collective Imagination of Humanity". The central visual features a sleek, futuristic computer from which radiant colors, whimsical creatures, and dynamic, swirling patterns explosively emerge, filling the composition with energy, motion, and surreal creativity. The background transitions from dark, cosmic tones into a luminous, dreamlike expanse, evoking a digital fantasy realm. At the bottom edge, the text "Launching in the Cloud, August 2025" appears in bold, modern sans-serif font with a glowing, slightly transparent effect, evoking a high-tech, cinematic aesthetic. The overall style blends sci-fi surrealism with graphic design flair—sharp contrasts, vivid color grading, and layered visual depth—reminiscent of visionary concept art and digital matte painting, 32K resolution, ultra-detailed.
Real style, three different looking puppies have a camera in front of them and the puppies look at it curiously. Elevated view
A female athlete with defined muscles and a tight ponytail, preparing for a run. She is wearing a black sports top and leggings, her gaze focused and determined. The background is a city running track at dawn with a light mist on the ground. Dynamic action shot, strong rim lighting outlining her silhouette, powerful and energetic, high contrast.
A girl with little freckles and messy red hair sitting on a rooftop during sunset, denim jacket slightly worn, holding a Polaroid camera, city skyline glowing in soft hues behind her
An elven queen with long silver hair and glowing blue eyes, wearing a magnificent white gown adorned with jewels. She stands in an ancient, mystical forest surrounded by luminous plants and mist. Moonlight filtering through the canopy, creating magical light and shadows. Fantasy art, epic, intricate details, masterpiece, digital painting.

README

Qwen-Image (Text-to-Image)

Qwen-Image is a 20B MMDiT-based text-to-image generation model, especially strong at native text rendering in both English and Chinese. It is a powerful creative tool for posters, comics, and visual storytelling, while also excelling at general image generation from photorealism to anime.

Why it looks great

  • SOTA text rendering: Rivals GPT-4o for English text and is best-in-class for Chinese text.
  • In-pixel text generation: Text is fully integrated into the image (no overlays).
  • Bilingual typography: Handles diverse fonts, styles, and complex layouts.
  • General image capability: Excels across styles—photorealistic, anime, impressionist, minimalist.

Limits and Performance

  • Max resolution per job: 1536 × 1536 pixels
  • Custom size: manually set width & height
  • Output formats: JPEG / PNG / WEBP
  • Processing speed: ~5–8 seconds per image (depends on size & queue)
  • Input prompt: supports detailed, multi-line descriptions
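The limits above can be checked before submitting a job. This is a minimal sketch under stated assumptions: the page gives only the 1536-pixel ceiling and the three formats, so the lower bound of 1 pixel and the lowercase format spellings are guesses, and `validate_settings` is a hypothetical helper.

```python
from typing import List

MAX_SIDE = 1536  # per-side ceiling from the limits above
ALLOWED_FORMATS = {"jpeg", "png", "webp"}  # assumed lowercase spellings


def validate_settings(width: int, height: int, output_format: str) -> List[str]:
    """Return a list of problems; an empty list means the settings look acceptable."""
    problems = []
    for name, value in (("width", width), ("height", height)):
        # Assumption: any positive size up to the ceiling is accepted.
        if not 1 <= value <= MAX_SIDE:
            problems.append(f"{name} must be between 1 and {MAX_SIDE}, got {value}")
    if output_format.lower() not in ALLOWED_FORMATS:
        problems.append(f"unsupported output format: {output_format}")
    return problems
```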

Price

$0.02 per image.

How to Use

  1. Write a prompt describing the image (can include embedded text).
  2. Adjust size (width & height, up to 1536×1536).
  3. Set a seed for reproducibility.
  4. Choose output_format.
  5. Run the job and download the generated image.
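Steps 1 to 4 amount to assembling a set of job parameters. The sketch below only builds and serializes such a set; it does not call any API. The field names mirror the labels used on this page (`width`, `height`, `seed`, `output_format`), but the actual request schema, the 1024 defaults, and the `build_request` helper are assumptions for illustration.

```python
import json
from typing import Any, Dict, Optional


def build_request(
    prompt: str,
    width: int = 1024,   # assumed default, not taken from the page
    height: int = 1024,  # assumed default, not taken from the page
    seed: Optional[int] = None,
    output_format: str = "jpeg",
) -> Dict[str, Any]:
    """Collect the settings from steps 1-4 into one dictionary."""
    payload: Dict[str, Any] = {
        "prompt": prompt,
        "width": width,
        "height": height,
        "output_format": output_format,
    }
    if seed is not None:
        payload["seed"] = seed  # fixed seed for reproducibility (step 3)
    return payload


request = build_request('A cafe sign that reads "Open Daily"', seed=42, output_format="png")
print(json.dumps(request, indent=2))
```

Leaving `seed` out of the dictionary when it is not set keeps the choice of a random seed with the service rather than the client.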

Pro tips for best quality

  • For poster design, explicitly describe font style, placement, and mood.
  • For bilingual text, specify both Chinese and English in the prompt.
  • Reuse the same seed while making small prompt changes to regenerate similar layouts with slight variations.
  • Avoid extreme aspect ratios; a balanced width-to-height ratio gives the best typography results.
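The poster and bilingual tips above can be sketched as a small prompt builder that states the wording, font style, and placement of every piece of text. The function name and sentence template are illustrative choices, not a format the model requires.

```python
from typing import List, Tuple


def compose_poster_prompt(
    scene: str,
    text_blocks: List[Tuple[str, str, str]],
    mood: str,
) -> str:
    """Build a prompt from a scene, (text, font style, placement) triples, and a mood."""
    parts = [scene.rstrip(".") + "."]
    for text, font_style, placement in text_blocks:
        # Quote the exact wording so the text to render is unambiguous.
        parts.append(f'{placement}, the text "{text}" appears in {font_style}.')
    parts.append(f"Overall mood: {mood}.")
    return " ".join(parts)


prompt = compose_poster_prompt(
    "A tea shop poster",
    [
        ("Spring Tea Festival", "bold serif lettering", "At the top"),
        ("春茶节", "brush calligraphy", "Below the title"),
    ],
    "warm and festive",
)
print(prompt)
```

Keeping the text blocks as data makes it easy to change one line of wording while reusing the same seed, which fits the tip about regenerating similar layouts.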