Jib Mix Qwen Image Text to Image | High-Quality Text-to-Image API

Home/Explore/WaveSpeed/Jib Mix Qwen Image/Text To Image

wavespeed-ai /

Jib Mix Qwen is a next-gen Text-to-Image model optimized for producing natural, pretty faces with improved Asian facial rendering. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

text-to-image

Input

prompt*

size

width

height

1024 × 1024 px

Range: 256 - 1536

seed

output_format

enable_sync_mode

If set to true, the function will wait for the result to be generated and uploaded before returning the response. It allows you to get the result directly in the response. This property is only available through the API.

enable_base64_output

If enabled, the output will be encoded into a BASE64 string instead of a URL. This property is only available through the API.

Enable Safety Checker

Idle

An Impressionist oil painting of a woman sitting in a sun-dappled garden, holding a parasol, in the style of Claude Monet. Visible, broken brushstrokes. The focus is on the fleeting play of light and shadow on her white dress and the vibrant, dappled colors of the flowers. Soft focus, bright, airy atmosphere.

$0.02per run·~50 / $1

An Impressionist oil painting of a woman sitting in a sun-dappled garden, holding a parasol, in the style of Claude Monet. Visible, broken brushstrokes. The focus is on the fleeting play of light and shadow on her white dress and the vibrant, dappled colors of the flowers. Soft focus, bright, airy atmosphere.

A surreal oil painting of a woman’s face emerging from a swirl of colors, expressive brushstrokes, high contrast lighting, inspired by Van Gogh and Klimt, artistic depth.

A beautiful anime girl sitting under cherry blossoms, soft glowing light, expressive eyes, delicate hair strands, cinematic composition, Makoto Shinkai style, pastel tones, high detail.

A highly detailed 3D character render, stylized realism. A young female adventurer with large, expressive teal eyes and freckles. Messy copper hair. Wearing worn leather armor. Soft, flattering studio lighting. Flawless skin shading (subsurface scattering). Trending on Artstation, Unreal Engine 5, ZBrush, Pixar style.

A highly detailed Steampunk portrait of a female airship captain. Wearing an ornate Victorian dress combined with a leather corset and utility belt. Intricate brass goggles pushed up on her tophat. A complex clockwork mechanical arm. Background of gears and steam pipes. Warm sepia tones, imaginative, complex machinery.

A 1940s black and white film noir portrait. A mysterious woman (femme fatale) wearing a stylish hat with a veil. Her face is half-hidden in deep shadow. Dramatic "Venetian blind" shadows fall across her face and the wall behind her. Smoking a cigarette, smoke curling in the air. High contrast, grainy film texture, mysterious, chiaroscuro lighting.

An ultra-realistic, gritty portrait of a cyberpunk hacker. Her face is illuminated by the glowing blue and pink neon signs of a futuristic city street at night. Rain-slicked trench coat, intricate cybernetic implants on her cheek, intense, focused eyes. High contrast, sharp focus, digital art, cinematic atmosphere, style of Blade Runner.

An ethereal watercolor portrait of a red-haired woman, eyes closed, face tilted up. Delicate green ivy and tiny mushrooms are woven into her braided hair. Soft, earthy pastels (moss green, terracotta), translucent washes, fine ink outlines. A peaceful, listening expression, as if hearing the forest. Isolated on a white background, whimsical Art Nouveau illustration.

A minimalist single-line (one-line) drawing of a person's face in profile. The entire portrait is created with one continuous, flowing black line on a plain white background. Elegant, simple, clean. Captures the essence of the form with no shading. Style of a modern tattoo design or Matisse's line drawings.

A Baroque-style oil painting portrait of a nobleman with a thoughtful expression. In the style of Caravaggio. Dramatic chiaroscuro, with a single light source illuminating him against a pitch-black background (Tenebrism). Rich texture in his velvet robes and intricate lace collar. Deep, rich colors, masterful brushstrokes, profound and theatrical.

An Art Deco portrait of a glamorous 1920s woman. Sharp, geometric bob haircut. Wearing an extravagant, sequined dress and a pearl headpiece. Posing confidently against a background of stylized, metallic gold geometric patterns (like the Chrysler Building). Bold lines, lavish ornamentation, strong symmetry, sophisticated and modern.

A delicate watercolor portrait of a young woman in profile, her long silver hair flowing and blending with wisps of translucent clouds. Soft moonlight illuminates her face, adorned with tiny, glowing crescent moon symbols. Soft pastel blues and lavenders, elegant flowing lines, a serene and mystical expression. Clean white background, in the Art Nouveau style of Alphonse Mucha.

Related Models

z-image/turbo-inpaint

image-to-image

qwen3-tts/voice-design

text-to-audio

qwen3-tts/text-to-speech

text-to-audio

qwen3-tts/voice-clone

audio-to-audio

qwen-image-max/edit

image-to-image

qwen-image-max/text-to-image

text-to-image

README

Jib-Mix-Qwen-Image (Text-to-Image)

Jib-Mix-Qwen-Image is a finely tuned text-to-image generation model based on Qwen-Image 20B (MMDiT), optimized through the Jib-Mix portrait enhancement pipeline. It specializes in realistic human faces, cinematic lighting, and vivid artistic styles, delivering professional-grade visuals from simple text prompts — no LoRA setup needed.

Why it looks great

Jib-Mix fine-tuning – Enhances facial structure, skin texture, and lighting realism, especially for close-ups and half-body portraits.
Cinematic diffusion engine – Captures lifelike depth, atmosphere, and tone with consistent color harmony.
Exceptional text rendering – Handles both Chinese and English typography natively, blending text naturally into the image.
Broad style coverage – From photorealism to anime, oil painting, 3D, or stylized artwork—one model, infinite versatility.
Identity consistency – Generates characters with coherent facial details and stable expressions across prompts.

Limits and Performance

Max resolution per job: up to 1536 × 1536 pixels
Output formats: JPEG / PNG / WEBP
Processing speed: ~5–8 seconds per image (depending on prompt complexity)
Prompt input: supports detailed, multi-line bilingual descriptions

Pricing

$0.02 per image Each image is billed individually.

How to Use

Enter a prompt describing your desired image (Chinese or English).
Set image size (width × height, up to 1536×1536).
(Optional) Set a seed for reproducibility (-1 = random).
Choose output format (JPEG / PNG / WEBP).
Generate → preview → iterate with refined prompts.

Pro tips for best quality

Be specific — describe lighting, pose, emotion, and background for more control.
For portraits, include keywords like cinematic lighting, soft focus, 8K detail, professional photo.
Fix seed to maintain subject consistency across multiple outputs.
Experiment with styles (e.g., realistic, anime, oil painting, CG render) to explore model versatility.

Note

For best realism, ensure prompts describe camera angle, lighting, and environment — the model responds strongly to cinematic cues.

Accessibility:This website uses AI models provided by third parties.

ExamplesView all