Qwen-Image LoRA | 20B MMDiT Text To Image With LoRA Support | WaveSpeedAI

wavespeed-ai/qwen-image/text-to-image-lora

Qwen-Image LoRA is a 20B MMDiT text-to-image model with LoRA support for fast customization and refined image generation. It is available through a ready-to-use REST inference API with no cold starts and affordable pricing.

API-only options

  • If set to true, the request waits for the result to be generated and uploaded before returning, so the result is available directly in the response.
  • If enabled, the output is encoded as a BASE64 string instead of a URL.




Examples

Valentin in a natural daylight selfie at a cafe entrance. He looks seriously into the camera, wearing a black coat or jacket and wireless earbud. Background includes wooden frames, warm pendant lights, and urban cafe details. With text "WaveSpeedAI"
realism, a female inventor with auburn hair in an intricate updo and goggles on her head, her eyes full of intellect. She wears a leather corset and a multi-layered skirt, standing in her workshop. The room is filled with brass gears, complex clockwork devices, and glowing vacuum tubes. Warm light from gas lamps illuminates the scene. Steampunk style, highly detailed, retro-futurism, masterpiece.
A glamorous woman with a sharp bob haircut and dark lipstick. She is dressed in a stunning black and gold sequined flapper dress with long pearls. She leans against a gilded Art Deco bar, with a jazz band softly blurred in the background. Sophisticated, low-key lighting creates a luxurious and intimate mood, Great Gatsby era, glamorous, geometric patterns.
A majestic Afrofuturist queen with magnificent braided hair adorned with golden rings and cybernetic circuits. She wears a vibrant robe that merges traditional Kente patterns with glowing energy lines. The background is a futuristic African metropolis with unique architecture and flying vehicles. Vibrant, vivid colors, sci-fi art, character portrait.
A close-up portrait of a stylish woman with wavy, dark brown long hair and a warm smile, wearing a beige cashmere sweater. The background is a blurred city street with a soft bokeh effect. Natural afternoon light, cinematic, photorealistic, high detail, 8K.
realism, a young woman sitting alone in a laundromat at midnight, wearing headphones, staring at the rotating dryer drum, neon reflections on the glass, a subtle expression of nostalgia on her face
realism, a woman with healthy tanned skin and natural long curly hair, with a few wildflowers woven into it. She wears a fringed, off-white linen dress and sits barefoot in a golden field at sunset, holding a guitar. The lighting is warm and soft, creating a free-spirited and romantic atmosphere, photorealistic, golden hour lighting.
real life anime, a woman with curly hair tied loosely, wearing a paint-stained oversized white shirt, barefoot, standing in a spacious industrial loft with large windows and exposed brick walls. She’s holding a large brush, working on a colorful abstract canvas. Natural light pouring in, art supplies scattered around, expressive, richly detailed scene.
realism, a woman like a mermaid, with flowing, long, blue hair and shimmering scales. She swims gracefully in clear tropical waters filled with coral and strange marine life. Sunlight penetrates the water's surface, creating moving beams of light that illuminate the entire scene—dreamy, vibrant, light and shadow effects, underwater photography, highly detailed.
realism, a young scholar with glasses, wearing a tweed blazer, sits in a grand, ancient library. Sunlight streams through a massive arched window, illuminating dust motes dancing in the air. An open book rests on her lap as she looks up thoughtfully. Warm and cozy atmosphere, light academia aesthetic, narrative lighting, photorealistic.
A resilient female survivor with wind-swept short hair and a determined gaze. She wears patched-up leather gear and tactical equipment, holding a modified staff. She stands on a hill overlooking the ruins of a city at dusk, against a dramatic orange sky. Cinematic, post-apocalyptic style, realism, atmospheric lighting, wide-angle shot.

README

Qwen-Image-LoRA

Qwen-Image-LoRA extends the base 20B MMDiT text-to-image model by allowing users to plug in custom LoRA weights (.safetensors) for fine-tuned control over style, characters, or artistic domains. This makes it a versatile tool for creators who want both world-class text rendering and personalized generation.

Why it looks great

  • LoRA integration: Import external .safetensors LoRA weights and control blending strength via scale.
  • SOTA text rendering: Rivals GPT-4o in English and is best-in-class for Chinese typography.
  • In-pixel text generation: Text is seamlessly integrated into images (no overlays).
  • Bilingual support: Handles Chinese & English with diverse fonts and complex layouts.
  • General image excellence: Photorealistic, anime, impressionist, or minimalist styles—all supported.

Limits and Performance

  • Max resolution per job: 1024 × 1024 pixels
  • LoRA path: provide <owner>/<model-name> or an external .safetensors URL
  • LoRA scale: adjustable strength (default = 1.0)
  • Output formats: JPEG / PNG / WEBP
  • Processing speed: ~6–10 seconds per image
  • Input prompt: supports multi-line descriptive text
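
The limits above can be checked locally before a job is submitted. The following is a minimal sketch, not part of any official client: the function names and the exact rules for what counts as a valid LoRA path are assumptions derived only from the list above.

```python
from urllib.parse import urlparse

MAX_SIDE = 1024                            # max width/height from the limits above
ALLOWED_FORMATS = {"jpeg", "png", "webp"}  # output formats from the limits above


def is_valid_lora_path(path: str) -> bool:
    """Accept '<owner>/<model-name>' or an http(s) URL ending in .safetensors.

    Hypothetical rule set: the service may accept or reject more than this.
    """
    parsed = urlparse(path)
    if parsed.scheme in ("http", "https"):
        return bool(parsed.netloc) and parsed.path.lower().endswith(".safetensors")
    parts = path.split("/")
    return len(parts) == 2 and all(parts)


def check_settings(width, height, output_format, lora_paths):
    """Return a list of problems; an empty list means the settings fit the limits."""
    problems = []
    if not (0 < width <= MAX_SIDE and 0 < height <= MAX_SIDE):
        problems.append(f"size {width}x{height} exceeds {MAX_SIDE}x{MAX_SIDE}")
    if output_format.lower() not in ALLOWED_FORMATS:
        problems.append(f"unsupported output format: {output_format}")
    for path in lora_paths:
        if not is_valid_lora_path(path):
            problems.append(f"unrecognized LoRA path: {path}")
    return problems


print(check_settings(1024, 768, "png", ["owner/model-name"]))
print(check_settings(2048, 512, "gif", ["not-a-path"]))
```

Returning a list of problems rather than raising on the first one makes it easier to show every issue to the user at once.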

Pricing

  • $0.025 per image
  • Each image is billed individually.
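
Because each image is billed individually at a flat rate, batch cost is a simple multiplication. A small sketch of that arithmetic follows; the helper names are illustrative, and `Decimal` is used only to avoid floating-point rounding on currency values.

```python
from decimal import Decimal

PRICE_PER_IMAGE = Decimal("0.025")  # USD per image, from the pricing above


def estimate_cost(num_images: int) -> Decimal:
    """Total cost in USD for a batch, with each image billed individually."""
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    return PRICE_PER_IMAGE * num_images


def images_for_budget(budget_usd: str) -> int:
    """How many whole images a budget covers at the flat per-image price."""
    return int(Decimal(budget_usd) // PRICE_PER_IMAGE)


print(estimate_cost(12))
print(images_for_budget("1"))
```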

How to Use

  1. Enter a prompt (supports detailed narrative & embedded text).

  2. Set size (width & height, up to 1024×1024).

  3. Add one or more LoRAs:

    • Paste the path/URL of the LoRA .safetensors file.
    • Adjust the scale (e.g., 0.5 for subtle effect, 1.0 for full strength).
  4. (Optional) Set seed for reproducibility (-1 = random).

  5. Choose output format (JPEG / PNG / WEBP).

  6. Run → preview results → iterate with different LoRA scales.
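
For API use, the steps above amount to assembling one set of settings per run. The sketch below only builds and prints such a payload; it does not call the service. Every key name (`prompt`, `width`, `height`, `loras`, `path`, `scale`, `seed`, `output_format`) and the default format are assumptions made for illustration, so check the API reference for the actual request schema.

```python
import json


def build_request(prompt, width=1024, height=1024, loras=(), seed=-1,
                  output_format="jpeg"):
    """Collect the settings from the steps above into a JSON-serializable dict.

    All key names are hypothetical placeholders, not the confirmed API schema.
    """
    return {
        "prompt": prompt,                       # step 1
        "width": width,                         # step 2
        "height": height,
        "loras": [{"path": path, "scale": scale}  # step 3: one entry per LoRA
                  for path, scale in loras],
        "seed": seed,                           # step 4: -1 means random
        "output_format": output_format,         # step 5
    }


payload = build_request(
    prompt='A selfie at a cafe entrance in natural daylight. With text "WaveSpeedAI"',
    loras=[("<owner>/<model-name>", 0.5)],      # placeholder path, subtle effect
    seed=42,
)
print(json.dumps(payload, indent=2))
```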

Pro tips for best quality

  • Use specific LoRAs for characters, art styles, or IP consistency.
  • Combine multiple LoRAs for hybrid results (e.g., anime + steampunk).
  • Adjust scale carefully: too high a value may distort the image, while too low a value may leave the LoRA's effect barely visible.
  • Lock the seed to maintain subject consistency when swapping LoRAs.
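
One way to apply the seed and scale tips together is to hold the prompt and seed fixed and vary only the LoRA scale, so that differences between outputs come from the scale alone. This is a sketch under assumptions: the dict keys are hypothetical, and the scale values are just example points between subtle and full strength.

```python
def scale_sweep(base_settings, lora_path, scales=(0.5, 0.75, 1.0), seed=1234):
    """Return one settings dict per scale, all sharing the same fixed seed.

    Key names ('seed', 'loras', 'path', 'scale') are illustrative placeholders.
    """
    if seed < 0:
        raise ValueError("use a fixed non-negative seed; -1 means random")
    runs = []
    for scale in scales:
        run = dict(base_settings)   # copy so the base settings stay untouched
        run["seed"] = seed
        run["loras"] = [{"path": lora_path, "scale": scale}]
        runs.append(run)
    return runs


runs = scale_sweep(
    {"prompt": "a female inventor in her workshop, steampunk style"},
    "<owner>/<model-name>",         # placeholder LoRA path
)
for run in runs:
    print(run["seed"], run["loras"][0]["scale"])
```

The same pattern works for swapping LoRAs: keep `seed` and the prompt constant and change only `lora_path`.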
