Qwen Image Text to Image LoRA | Custom LoRA Image API

Qwen-Image LoRA is a 20B MMDiT next-generation text-to-image model with LoRA support for fast customization and refined image generation. It is served through a ready-to-use REST inference API with no cold starts and affordable pricing.

lora-support

Input

prompt*

size

width

height

1024 × 1024 px

Range: 256 - 1536

loras

path

scale

seed

output_format

enable_sync_mode

If set to true, the function will wait for the result to be generated and uploaded before returning the response. It allows you to get the result directly in the response. This property is only available through the API.

enable_base64_output

If enabled, the output will be encoded into a BASE64 string instead of a URL. This property is only available through the API.
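
When enable_base64_output is set, the image arrives as a BASE64 string rather than a URL. A minimal Python sketch of turning such a string back into image bytes follows. The helper name and the handling of an optional data-URI prefix are assumptions made for illustration, not documented behavior of the API.

```python
import base64


def decode_base64_image(encoded: str) -> bytes:
    """Decode a BASE64 image string into raw bytes.

    Assumption: the string may or may not carry a "data:...;base64," prefix,
    so the prefix is stripped when present.
    """
    if encoded.startswith("data:") and "," in encoded:
        encoded = encoded.split(",", 1)[1]
    return base64.b64decode(encoded)


# Stand-in for a real response value: the 8-byte PNG signature, BASE64-encoded.
sample = base64.b64encode(b"\x89PNG\r\n\x1a\n").decode("ascii")
image_bytes = decode_base64_image(sample)
```

The resulting bytes can be written to a file opened in binary mode, using the extension that matches the chosen output_format.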

Enable Safety Checker

$0.025 per run · ~40 runs per $1

Examples

Valentin in a natural daylight selfie at a cafe entrance. He looks seriously into the camera, wearing a black coat or jacket and wireless earbud. Background includes wooden frames, warm pendant lights, and urban cafe details. With text "WaveSpeedAI"

realism, a female inventor with auburn hair in an intricate updo and goggles on her head, her eyes full of intellect. She wears a leather corset and a multi-layered skirt, standing in her workshop. The room is filled with brass gears, complex clockwork devices, and glowing vacuum tubes. Warm light from gas lamps illuminates the scene. Steampunk style, highly detailed, retro-futurism, masterpiece.

A glamorous woman with a sharp bob haircut and dark lipstick. She is dressed in a stunning black and gold sequined flapper dress with long pearls. She leans against a gilded Art Deco bar, with a jazz band softly blurred in the background. Sophisticated, low-key lighting creates a luxurious and intimate mood, Great Gatsby era, glamorous, geometric patterns.

A majestic Afrofuturist queen with magnificent braided hair adorned with golden rings and cybernetic circuits. She wears a vibrant robe that merges traditional Kente patterns with glowing energy lines. The background is a futuristic African metropolis with unique architecture and flying vehicles. Vibrant, vivid colors, sci-fi art, character portrait.

A close-up portrait of a stylish woman with wavy, dark brown long hair and a warm smile, wearing a beige cashmere sweater. The background is a blurred city street with a soft bokeh effect. Natural afternoon light, cinematic, photorealistic, high detail, 8K.

realism, a young woman sitting alone in a laundromat at midnight, wearing headphones, staring at the rotating dryer drum, neon reflections on the glass, a subtle expression of nostalgia on her face

realism, a woman with healthy tanned skin and natural long curly hair, with a few wildflowers woven into it. She wears a fringed, off-white linen dress and sits barefoot in a golden field at sunset, holding a guitar. The lighting is warm and soft, creating a free-spirited and romantic atmosphere, photorealistic, golden hour lighting.

real life anime, a woman with curly hair tied loosely, wearing a paint-stained oversized white shirt, barefoot, standing in a spacious industrial loft with large windows and exposed brick walls. She’s holding a large brush, working on a colorful abstract canvas. Natural light pouring in, art supplies scattered around, expressive, richly detailed scene.

realism, a woman like a mermaid, with flowing, long, blue hair and shimmering scales. She swims gracefully in clear tropical waters filled with coral and strange marine life. Sunlight penetrates the water's surface, creating moving beams of light that illuminate the entire scene—dreamy, vibrant, light and shadow effects, underwater photography, highly detailed.

realism, a young scholar with glasses, wearing a tweed blazer, sits in a grand, ancient library. Sunlight streams through a massive arched window, illuminating dust motes dancing in the air. An open book rests on her lap as she looks up thoughtfully. Warm and cozy atmosphere, light academia aesthetic, narrative lighting, photorealistic.

A resilient female survivor with wind-swept short hair and a determined gaze. She wears patched-up leather gear and tactical equipment, holding a modified staff. She stands on a hill overlooking the ruins of a city at dusk, against a dramatic orange sky. Cinematic, post-apocalyptic style, realism, atmospheric lighting, wide-angle shot.

README

Qwen-Image-LoRA

Qwen-Image-LoRA extends the base 20B MMDiT text-to-image model by allowing users to plug in custom LoRA weights (.safetensors) for fine-tuned control over style, characters, or artistic domains. This makes it a versatile tool for creators who want both world-class text rendering and personalized generation.

Why it looks great

LoRA integration: Import external .safetensors LoRA weights and control blending strength via scale.
SOTA text rendering: Rivals GPT-4o in English and is best-in-class for Chinese typography.
In-pixel text generation: Text is seamlessly integrated into images (no overlays).
Bilingual support: Handles Chinese & English with diverse fonts and complex layouts.
General image excellence: Photorealistic, anime, impressionist, or minimalist styles—all supported.

Limits and Performance

Max resolution per job: up to 1024 × 1024 pixels
LoRA path: provide <owner>/<model-name> or an external .safetensors URL
LoRA scale: adjustable strength (default = 1.0)
Output formats: JPEG / PNG / WEBP
Processing speed: ~6–10 seconds per image
Input prompt: supports multi-line descriptive text
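
The limits above can be checked on the client before a job is submitted. The following Python sketch is one way to do that. The function name, the lowercase format identifiers, and the lower bound of 256 (taken from the range shown on the input form) are assumptions rather than documented API rules.

```python
ALLOWED_FORMATS = {"jpeg", "png", "webp"}  # from "Output formats"; lowercase spelling is assumed
MAX_SIDE = 1024  # from "Max resolution per job"
MIN_SIDE = 256   # assumed lower bound, taken from the input form's range


def validate_job(width: int, height: int, output_format: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means the job looks valid."""
    problems = []
    for name, value in (("width", width), ("height", height)):
        if not MIN_SIDE <= value <= MAX_SIDE:
            problems.append(f"{name}={value} is outside {MIN_SIDE}-{MAX_SIDE}")
    if output_format.lower() not in ALLOWED_FORMATS:
        problems.append(f"unsupported output_format: {output_format}")
    return problems
```

Returning a list of problems instead of raising on the first one lets a caller report every issue with a request at once.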

Pricing

$0.025 per image
Each image is billed individually.
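
Because billing is per image, the cost of a batch is simple arithmetic. The Python sketch below uses Decimal to avoid floating-point rounding; the function names are illustrative only.

```python
from decimal import Decimal

PRICE_PER_IMAGE = Decimal("0.025")  # USD per image, as listed above


def batch_cost(num_images: int) -> Decimal:
    """Total cost in USD for a number of generated images."""
    return PRICE_PER_IMAGE * num_images


def images_for_budget(budget_usd: str) -> int:
    """How many whole images a budget (given as a decimal string) covers."""
    return int(Decimal(budget_usd) // PRICE_PER_IMAGE)
```

For example, batch_cost(40) comes to exactly 1 USD, which matches the "~40 runs per $1" figure.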

How to Use

Enter a prompt (supports detailed narrative & embedded text).
Set size (width & height, up to 1024×1024).
Add one or more LoRAs:

Paste the path/URL of the LoRA .safetensors file.
Adjust the scale (e.g., 0.5 for subtle effect, 1.0 for full strength).

(Optional) Set seed for reproducibility (-1 = random).
Choose output format (JPEG / PNG / WEBP).
Run → preview results → iterate with different LoRA scales.
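
The steps above map onto a request body roughly as in the Python sketch below. The field names (prompt, size, loras, path, scale, seed, output_format) come from the input form on this page. The "width*height" size string, the shape of the loras list, the lowercase format value, and the LoRA path used in the example are assumptions for illustration. The endpoint and authentication are deliberately left out because this page does not state them.

```python
import json


def build_payload(prompt, width=1024, height=1024, loras=None, seed=-1, output_format="jpeg"):
    """Assemble a request body from the inputs described in "How to Use".

    loras is a list of (path, scale) pairs; seed=-1 requests a random seed.
    """
    return {
        "prompt": prompt,
        "size": f"{width}*{height}",  # assumed encoding of width and height
        "loras": [{"path": path, "scale": scale} for path, scale in (loras or [])],
        "seed": seed,
        "output_format": output_format,
    }


payload = build_payload(
    'A cafe entrance at daylight. With text "WaveSpeedAI"',
    loras=[("example-owner/example-lora", 0.8)],  # hypothetical LoRA path
    seed=42,
)
body = json.dumps(payload)  # JSON string ready to send as the request body
```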

Pro tips for best quality

Use specific LoRAs for characters, art styles, or IP consistency.
Combine multiple LoRAs for hybrid results (e.g., anime + steampunk).
Adjust scale carefully: a value that is too high may distort the image, while one that is too low may make the LoRA's effect fade.
Lock the seed to maintain subject consistency when swapping LoRAs.
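
One way to apply the last two tips together is to generate a series of jobs that differ only in LoRA scale while the seed stays fixed. The Python sketch below builds such a series. The function name, the job dictionary layout, and the LoRA path in the example are assumptions, and submitting the jobs is left to the caller.

```python
def scale_sweep(prompt, lora_path, scales, seed=1234):
    """Build one job per scale value, all sharing the same prompt and seed."""
    if seed < 0:
        # -1 means "random" on this page, which would defeat the comparison.
        raise ValueError("use a fixed, non-negative seed for a scale sweep")
    return [
        {
            "prompt": prompt,
            "loras": [{"path": lora_path, "scale": scale}],
            "seed": seed,
        }
        for scale in scales
    ]


jobs = scale_sweep(
    "realism, a young scholar in a grand, ancient library",
    "example-owner/example-style-lora",  # hypothetical LoRA path
    scales=[0.5, 0.75, 1.0],
)
```

With the seed held constant, differences between the resulting images can be attributed to the scale rather than to random variation.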

Note

Train your LoRA with wavespeed-ai/qwen-image-lora-trainer to make sure it can be used with this model, or use a LoRA built for the corresponding model from an official platform (Civitai or Hugging Face).
