Qwen-Image (Text-to-Image)
Qwen-Image is a 20B MMDiT-based text-to-image generation model, especially strong at native text rendering in both English and Chinese. It is a powerful creative tool for posters, comics, and visual storytelling, while also excelling at general image generation from photorealism to anime.
Why it looks great
- SOTA text rendering: Rivals GPT-4o in English and is best-in-class for Chinese.
- In-pixel text generation: Text is fully integrated into the image (no overlays).
- Bilingual typography: Handles diverse fonts, styles, and complex layouts.
- General image capability: Excels across styles—photorealistic, anime, impressionist, minimalist.
Limits and Performance
- Max resolution per job: 1536 × 1536 pixels
- Custom size: manually set width & height
- Output formats: JPEG / PNG / WEBP
- Processing speed: ~5–8 seconds per image (depends on size & queue)
- Input prompt: supports detailed, multi-line descriptions
Price
$0.02 per image.
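For rough budgeting, the per-image price and the quoted 5–8 second processing time can be turned into a batch estimate. This is a minimal sketch: the function name `estimate_batch` is hypothetical, and it assumes images are generated one after another with no queue delay, which this page does not guarantee.

```python
PRICE_CENTS_PER_IMAGE = 2      # $0.02 per image, from the price above
SECONDS_PER_IMAGE = (5, 8)     # approximate range quoted under Limits and Performance


def estimate_batch(num_images: int) -> dict:
    """Estimate cost and sequential processing time for a batch of images."""
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    low, high = SECONDS_PER_IMAGE
    return {
        # Work in whole cents first to avoid accumulating float error.
        "cost_usd": num_images * PRICE_CENTS_PER_IMAGE / 100,
        "min_seconds": num_images * low,
        "max_seconds": num_images * high,
    }


print(estimate_batch(50))
# {'cost_usd': 1.0, 'min_seconds': 250, 'max_seconds': 400}
```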
How to Use
- Write a prompt describing the image (can include embedded text).
- Adjust size (width & height, up to 1536×1536).
- Set a seed for reproducibility.
- Choose output_format.
- Run the job and download the generated image.
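The steps above can be sketched as a small helper that checks the settings against the listed limits before a job is submitted. This is an illustration only: `build_job_request` is a hypothetical name, the dictionary keys mirror the parameter names on this page but are not a confirmed API schema, and the 1024 × 1024 default and lowercase format values are assumptions. No network call is made.

```python
MAX_SIDE = 1536                             # per-job limit listed on this page
OUTPUT_FORMATS = {"jpeg", "png", "webp"}    # formats listed on this page (lowercase is an assumption)


def build_job_request(prompt, width=1024, height=1024, seed=None, output_format="png"):
    """Validate the settings and return a dict describing the job (hypothetical schema)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    for name, side in (("width", width), ("height", height)):
        if not 1 <= side <= MAX_SIDE:
            raise ValueError(f"{name} must be between 1 and {MAX_SIDE}")
    fmt = output_format.lower()
    if fmt not in OUTPUT_FORMATS:
        raise ValueError(f"output_format must be one of {sorted(OUTPUT_FORMATS)}")
    request = {"prompt": prompt, "width": width, "height": height, "output_format": fmt}
    if seed is not None:
        request["seed"] = seed              # include a seed only when reproducibility is wanted
    return request


print(build_job_request('A jazz festival poster with the title "Blue Night"', 1536, 1024, seed=42))
```

Validating locally catches an oversized canvas or an unsupported format before a paid job is started.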
Pro tips for best quality
- For poster design, explicitly describe font style, placement, and mood.
- For bilingual text, specify both Chinese and English in the prompt.
- Reuse the same seed while making small prompt changes to regenerate similar layouts with slight variations.
- Keep the width-to-height ratio balanced (avoid extreme aspect ratios) for best typography results.
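The tips above can be applied with a small prompt template. This is a sketch under stated assumptions: `poster_prompt` and `aspect_ratio` are hypothetical helpers, the sentence wording is just one way to spell out font style, placement, mood, and both languages, and no specific "balanced" threshold is given on this page, so the ratio is only reported, not enforced.

```python
def poster_prompt(scene, english_text, chinese_text, font_style, placement, mood):
    """Compose a prompt that names font style, placement, mood, and both texts."""
    return (
        f"{scene}. "
        f'Title text "{english_text}" with Chinese text "{chinese_text}" beneath it, '
        f"set in {font_style}, placed {placement}. "
        f"Overall mood: {mood}."
    )


def aspect_ratio(width, height):
    """Return the long-side to short-side ratio (1.0 means square)."""
    return max(width, height) / min(width, height)


prompt = poster_prompt(
    scene="A tea house at dusk, watercolor illustration",
    english_text="Evening Tea",
    chinese_text="晚茶",
    font_style="an elegant brush-script typeface",
    placement="centered in the upper third",
    mood="calm and warm",
)
print(prompt)
print(aspect_ratio(1536, 1024))   # 1.5
```

Keeping the seed fixed and changing only one template field at a time makes it easier to see which part of the prompt affected the layout.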