Qwen Image Text to Image

Qwen-Image is a 20B MMDiT text-to-image model that generates images from text prompts. It is available through a ready-to-use REST inference API with no cold starts and affordable pricing.

Input

  • width, height: 1024 × 1024 px in the playground; each side must fall in the range 256 - 1536.
  • enable_sync_mode: if set to true, the request waits for the result to be generated and uploaded before returning, so the result arrives directly in the response. Available only through the API.
  • enable_base64_output: if enabled, the output is encoded as a BASE64 string instead of a URL. Available only through the API.
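
With enable_base64_output turned on, the output arrives as a BASE64 string rather than a URL. The sketch below shows one way to turn such a string back into image bytes. It assumes the string is either bare BASE64 or a data URI with a ";base64," prefix (this page does not state the exact shape), and decode_output is a hypothetical helper name, not part of any WaveSpeedAI SDK.

```python
import base64


def decode_output(output: str) -> bytes:
    """Decode a BASE64 output string into raw image bytes.

    Assumption: the value is bare BASE64 or a data URI such as
    "data:image/jpeg;base64,...". Neither form is confirmed by this page.
    """
    if output.startswith("data:"):
        # Keep only the payload after the first comma of the data URI.
        _, _, output = output.partition(",")
    # validate=True rejects characters outside the BASE64 alphabet.
    return base64.b64decode(output, validate=True)


if __name__ == "__main__":
    sample = base64.b64encode(b"fake image bytes").decode("ascii")
    print(decode_output(sample))
    print(decode_output("data:image/jpeg;base64," + sample))
```

The returned bytes can then be written to disk with open(path, "wb").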

$0.02 per run · ~50 runs per $1

Examples

A beautiful Chinese woman wearing a "WaveSpeedAI" T-shirt is smiling at the camera with a black marker. Behind her, a glass panel reads in handwriting, "Meet Qwen Image - a powerful image foundation model capable of complex text rendering and precise image editing."

Bookstore window display. A sign displays "New Arrivals This Week". Below, a shelf tag with the text "Best-Selling Novels Here". To the side, a colorful poster advertises "Author Meet And Greet on Saturday" with a central portrait of the author. There are four books on the bookshelf, namely "The light between worlds" "When stars are scattered" "The slient patient" "The night circus"

A man in a suit is standing in front of the window, looking at the bright moon outside the window. The man is holding a yellowed paper with handwritten words on it: "A lantern moon climbs through the silver night, Unfurling quiet dreams across the sky, Each star a whispered promise wrapped in light, That dawn will bloom, though darkness wanders by." There is a cute cat on the windowsill.

A Victorian noble lady with an elegant updo and a gentle gaze, wearing a deep red velvet dress, sitting in an ornate library. Warm candlelight illuminates her face and the surrounding bookshelves. In the style of John Singer Sargent, classic oil painting, expressive brushstrokes, masterpiece, rich textures.

A movie poster. The first row is the movie title, which reads "Imagination Unleashed". The second row is the movie subtitle, which reads "Enter a world beyond your imagination". The third row reads "Cast: Qwen-Image". The fourth row reads "Director: The Collective Imagination of Humanity". The central visual features a sleek, futuristic computer from which radiant colors, whimsical creatures, and dynamic, swirling patterns explosively emerge, filling the composition with energy, motion, and surreal creativity. The background transitions from dark, cosmic tones into a luminous, dreamlike expanse, evoking a digital fantasy realm. At the bottom edge, the text "Launching in the Cloud, August 2025" appears in bold, modern sans-serif font with a glowing, slightly transparent effect, evoking a high-tech, cinematic aesthetic. The overall style blends sci-fi surrealism with graphic design flair—sharp contrasts, vivid color grading, and layered visual depth—reminiscent of visionary concept art and digital matte painting, 32K resolution, ultra-detailed.

Real style, three different looking puppies have a camera in front of them and the puppies look at it curiously. Elevated view

A female athlete with defined muscles and a tight ponytail, preparing for a run. She is wearing a black sports top and leggings, her gaze focused and determined. The background is a city running track at dawn with a light mist on the ground. Dynamic action shot, strong rim lighting outlining her silhouette, powerful and energetic, high contrast.

A girl with little freckles and messy red hair sitting on a rooftop during sunset, denim jacket slightly worn, holding a Polaroid camera, city skyline glowing in soft hues behind her

An elven queen with long silver hair and glowing blue eyes, wearing a magnificent white gown adorned with jewels. She stands in an ancient, mystical forest surrounded by luminous plants and mist. Moonlight filtering through the canopy, creating magical light and shadows. Fantasy art, epic, intricate details, masterpiece, digital painting.

README

Qwen-Image (Text-to-Image)

Qwen-Image is a 20B MMDiT-based text-to-image generation model, especially strong at native text rendering in both English and Chinese. It is a powerful creative tool for posters, comics, and visual storytelling, while also excelling at general image generation from photorealism to anime.

Why it looks great

  • SOTA text rendering: Rivals GPT-4o in English and is best-in-class for Chinese.
  • In-pixel text generation: Text is fully integrated into the image (no overlays).
  • Bilingual typography: Handles diverse fonts, styles, and complex layouts.
  • General image capability: Excels across styles—photorealistic, anime, impressionist, minimalist.

Limits and Performance

  • Max resolution per job: up to 1536 × 1536 pixels
  • Custom size: manually set width & height
  • Output formats: JPEG / PNG / WEBP
  • Processing speed: ~5–8 seconds per image (depends on size & queue)
  • Input prompt: supports detailed, multi-line descriptions
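
The size limits above can be checked before a request is sent. The following is a minimal sketch that assumes the 256 - 1536 range applies to each side independently and that the size string uses the "width*height" form seen in the request examples; format_size is a hypothetical helper name.

```python
MIN_SIDE = 256   # lower bound of the range shown in the playground
MAX_SIDE = 1536  # upper bound (max resolution 1536 x 1536)


def format_size(width: int, height: int) -> str:
    """Return a size string such as "1024*1024", or raise if out of range."""
    for name, value in (("width", width), ("height", height)):
        if not MIN_SIDE <= value <= MAX_SIDE:
            raise ValueError(f"{name}={value} is outside {MIN_SIDE}-{MAX_SIDE}")
    return f"{width}*{height}"


print(format_size(1024, 1024))  # prints 1024*1024
```

Failing early on the client side avoids spending a request on a size the service would reject.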

Price

$0.02 per image.
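
For budgeting, the arithmetic behind the per-image price can be sketched as follows. It assumes every image is billed at the flat $0.02 base price; the function names are illustrative, and Decimal is used to avoid floating-point rounding on currency values.

```python
from decimal import Decimal

PRICE_PER_IMAGE = Decimal("0.02")  # base price quoted on this page, in USD


def images_for_budget(budget_usd: str) -> int:
    """Number of whole images a budget covers at the base price."""
    return int(Decimal(budget_usd) // PRICE_PER_IMAGE)


def cost_of(images: int) -> Decimal:
    """Total cost in USD for a number of images at the base price."""
    return PRICE_PER_IMAGE * images


print(images_for_budget("1"))  # prints 50
print(cost_of(250))            # prints 5.00
```

The first result matches the "~50 runs per $1" figure shown next to the playground price.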

How to Use

  1. Write a prompt describing the image (can include embedded text).
  2. Adjust size (width & height, up to 1536×1536).
  3. Set a seed for reproducibility.
  4. Choose output_format.
  5. Run the job and download the generated image.
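
The steps above map onto a JSON request body. The sketch below assembles one under a few assumptions: the lowercase values "png" and "webp" are inferred from the listed output formats (only "jpeg" appears in the request examples), seed defaults to -1 as in those examples, and build_payload is a hypothetical helper name.

```python
import json

# Lowercase names are an assumption; only "jpeg" appears in the examples.
ALLOWED_FORMATS = {"jpeg", "png", "webp"}


def build_payload(prompt, width=1024, height=1024, seed=-1, output_format="jpeg"):
    """Build the request body from the inputs described in the steps."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported output_format: {output_format!r}")
    return {
        "prompt": prompt,
        "size": f"{width}*{height}",  # same "width*height" form as the examples
        "seed": seed,
        "output_format": output_format,
        "enable_sync_mode": False,
        "enable_base64_output": False,
    }


payload = build_payload('A cafe sign that reads "Open Daily"', 768, 1024, seed=42)
print(json.dumps(payload, indent=2))
```

Passing a fixed seed, as in the usage line, corresponds to step 3.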

Pro tips for best quality

  • For poster design, explicitly describe font style, placement, and mood.
  • For bilingual text, specify both Chinese and English in the prompt.
  • Use consistent seeds to regenerate similar layouts with slight variations.
  • Keep height:width ratio balanced for best typography results.
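
The first two tips can be applied mechanically when composing a prompt. This is only an illustrative sketch: poster_prompt is a hypothetical helper, and the sentence pattern it produces is one possible phrasing, not a format the model requires.

```python
def poster_prompt(subject, texts, font_style, mood):
    """Compose a poster prompt.

    texts is a list of (placement, text) pairs, so every piece of embedded
    text (English or Chinese) is stated together with where it appears.
    """
    parts = [subject.strip().rstrip(".") + "."]
    for placement, text in texts:
        parts.append(f'{placement} reads "{text}" in {font_style}.')
    parts.append(f"Overall mood: {mood}.")
    return " ".join(parts)


print(
    poster_prompt(
        "A tea shop poster",
        [("The title at the top", "Morning Tea"), ("The subtitle below it", "早茶")],
        "bold brush calligraphy",
        "warm and calm",
    )
)
```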

Qwen Image Text To Image API — Quick start

Grab a WaveSpeedAI API key, then call POST https://api.wavespeed.ai/api/v3/wavespeed-ai/qwen-image/text-to-image with your input as JSON. The endpoint returns a prediction id; poll the prediction endpoint until status flips to completed, then read the output URL from data.outputs[0]. Examples for Qwen Image Text To Image below.

HTTP example

# Submit the prediction
curl -X POST "https://api.wavespeed.ai/api/v3/wavespeed-ai/qwen-image/text-to-image" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY" \
  -d '{
    "prompt": "A cinematic shot of a city at sunset, soft golden light",
    "size": "1024*1024",
    "seed": -1,
    "output_format": "jpeg",
    "enable_sync_mode": false,
    "enable_base64_output": false
}'

# Response includes a prediction id. Poll for the result:
curl -X GET "https://api.wavespeed.ai/api/v3/predictions/{request_id}/result" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY"

# When status is "completed", read the output from data.outputs[0].

Node.js example

// npm install wavespeed
const WaveSpeed = require('wavespeed');

const client = new WaveSpeed(); // reads WAVESPEED_API_KEY from env

// await is not allowed at the top level of a CommonJS module, so wrap the call.
async function main() {
  const result = await client.run("wavespeed-ai/qwen-image/text-to-image", {
    "prompt": "A cinematic shot of a city at sunset, soft golden light",
    "size": "1024*1024",
    "seed": -1,
    "output_format": "jpeg",
    "enable_sync_mode": false,
    "enable_base64_output": false
  });

  console.log(result.outputs[0]); // → URL of the generated output
}

main().catch(console.error);

Python example

# pip install wavespeed
import wavespeed

output = wavespeed.run(
    "wavespeed-ai/qwen-image/text-to-image",
    {
        "prompt": "A cinematic shot of a city at sunset, soft golden light",
        "size": "1024*1024",
        "seed": -1,
        "output_format": "jpeg",
        # Python booleans are capitalized; lowercase false raises NameError.
        "enable_sync_mode": False,
        "enable_base64_output": False,
    },
)

print(output["outputs"][0])  # → URL of the generated output

Qwen Image Text To Image API — Frequently asked questions

What is the Qwen Image Text To Image API?

Qwen Image Text To Image is a WaveSpeedAI model for image generation, exposed as a REST API on WaveSpeedAI. It is Qwen-Image, a 20B MMDiT text-to-image model that generates images from text prompts, offered as a ready-to-use inference API with no cold starts and affordable pricing. You can call it programmatically or try it from the playground above.

How do I call the Qwen Image Text To Image API?

POST your input parameters to the model's REST endpoint (shown in the API tab of this playground) with your WaveSpeedAI API key in the Authorization header. Submission returns a prediction ID; poll the prediction endpoint until status flips to "completed", then read the output URL from the result. The playground generates a ready-to-paste code sample in Python, JavaScript, or cURL for whatever inputs you've set. Full request/response shape is documented at https://wavespeed.ai/docs/docs-api/wavespeed-ai/qwen-image-text-to-image.
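
The submit-then-poll flow described here can be sketched with the Python standard library alone. The result URL and the "completed" status come from the quick start; the location of the status field (data.status), the "failed" status name, and the function names are assumptions, so check them against the API reference before relying on this.

```python
import json
import os
import time
import urllib.request

RESULT_URL = "https://api.wavespeed.ai/api/v3/predictions/{request_id}/result"


def fetch_result(request_id: str) -> dict:
    """GET the prediction result once and return the parsed JSON body."""
    req = urllib.request.Request(
        RESULT_URL.format(request_id=request_id),
        headers={"Authorization": f"Bearer {os.environ['WAVESPEED_API_KEY']}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_output(request_id, fetch=fetch_result, interval=1.0, timeout=120.0):
    """Poll until the prediction completes, then return data.outputs[0]."""
    deadline = time.monotonic() + timeout
    while True:
        data = fetch(request_id)["data"]  # assumed: status lives under "data"
        if data["status"] == "completed":
            return data["outputs"][0]
        if data["status"] == "failed":  # assumed name of the failure status
            raise RuntimeError(f"prediction {request_id} failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"prediction {request_id} not done after {timeout}s")
        time.sleep(interval)
```

The fetch argument is injectable so the loop can be exercised with a stub instead of the network; in normal use, call wait_for_output with the prediction id returned by the POST request.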

How much does Qwen Image Text To Image cost per run?

Qwen Image Text To Image starts at $0.020 per run. That figure is the base price: the final charge can scale with the parameters you set in the form, such as output size, so a larger output may cost more than a minimal one. The exact cost for your current input is shown live next to the Generate button before you submit, and the actual per-call charge is recorded on the prediction afterwards.

What inputs does Qwen Image Text To Image accept?

Key inputs: `prompt`, `size`, `seed`, `enable_base64_output`, `enable_sync_mode`, `output_format`. The full JSON schema (types, defaults, allowed values) is rendered above the Generate button and mirrored in the API reference at https://wavespeed.ai/docs/docs-api/wavespeed-ai/qwen-image-text-to-image.

How long does Qwen Image Text To Image take to generate?

Average end-to-end generation time on WaveSpeedAI is around 24 seconds per request, measured across recent runs. This figure includes queue time, which scales with global demand, so it can exceed the ~5–8 seconds of processing quoted in the README. Live status is visible in the prediction record.

Can I use Qwen Image Text To Image outputs commercially?

Commercial usage rights depend on the model's license, set by its provider (WaveSpeedAI). The license summary appears on the model card above; see WaveSpeedAI's Terms of Service for platform-level conditions.