Introducing Z-Image Base on WaveSpeedAI

Introducing Z-Image Base: The Ultimate Text-to-Image Foundation Model for Creative Control

The text-to-image AI landscape has a powerful new contender. Z-Image Base, the 6-billion-parameter foundation model from Alibaba’s Tongyi Lab (Tongyi-MAI), is now available on WaveSpeedAI. Unlike its distilled sibling Z-Image Turbo, this full-featured model offers complete CFG (Classifier-Free Guidance) support and negative prompting, giving creators the precise control they need for professional-grade image generation.

What is Z-Image Base?

Z-Image Base is the non-distilled foundation version of Alibaba’s groundbreaking Z-Image model family. While Z-Image Turbo trades user control for blistering speed through distillation, Z-Image Base preserves the full generative capabilities that make fine-grained creative control possible.

Built on the innovative S3-DiT (Single-Stream Diffusion Transformer) architecture, Z-Image Base processes text and image tokens in a unified sequence rather than using separate streams. This architectural approach improves parameter utilization and simplifies cross-modal alignment, resulting in exceptional prompt adherence and photorealistic output quality.

The model family made waves in the AI community immediately upon release, surpassing 500,000 downloads within 24 hours and quickly topping the Hugging Face trending list. Z-Image also ranked as the #1 open-source model on the Artificial Analysis Text-to-Image Leaderboard, a remarkable achievement for a 6-billion-parameter model competing against systems many times its size.

Key Features

Full CFG Support and Negative Prompting

Unlike distilled models that “bake in” guidance during training, Z-Image Base provides complete classifier-free guidance control. This means you can:

Use negative prompts to explicitly exclude unwanted elements like “blurry, distorted, low quality”
Adjust guidance scale to balance prompt adherence with creative variation
Achieve precise control over the generation process that distilled models simply cannot offer
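To make the guidance scale less abstract, here is a minimal sketch of the standard classifier-free guidance blend, written in plain Python on toy three-element lists. It illustrates the general technique only; it is not taken from Z-Image's code, and the function name cfg_combine is hypothetical. When a negative prompt is supplied, its prediction typically takes the place of the unconditional one.

```python
from typing import List


def cfg_combine(cond: List[float], uncond: List[float], guidance_scale: float) -> List[float]:
    """Blend two noise predictions: uncond + scale * (cond - uncond)."""
    return [u + guidance_scale * (c - u) for c, u in zip(cond, uncond)]


# Toy stand-ins for noise predictions; real ones are large tensors.
eps_prompt = [1.0, 2.0, 3.0]    # prediction conditioned on the prompt
eps_negative = [0.5, 2.0, 4.0]  # prediction conditioned on the negative (or empty) prompt

# A scale of 1.0 returns the prompt prediction unchanged.
no_guidance = cfg_combine(eps_prompt, eps_negative, 1.0)

# Larger scales push the result further away from the negative prediction.
strong = cfg_combine(eps_prompt, eps_negative, 4.0)

print(no_guidance)
print(strong)
```

Distilled models usually fix this blend during training, which is why the scale and the negative prompt are not freely adjustable there.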

Reference Image Guidance

Provide an optional reference image to influence your generated output’s composition, style, or subject matter. The strength parameter (0-1) lets you fine-tune exactly how much the reference influences the result:

Lower values (0.2-0.4): Output closely follows the reference
Medium values (0.5-0.7): Balanced blend of reference and prompt
Higher values (0.8-1.0): Prompt dominates, reference serves as loose inspiration
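One way to build intuition for these ranges: in many image-to-image diffusion pipelines, strength decides how much of the denoising schedule is actually run on a noised copy of the reference. The sketch below assumes that convention. Whether WaveSpeedAI's Z-Image Base endpoint works exactly this way is not confirmed here, and denoising_steps is a hypothetical helper.

```python
def denoising_steps(num_inference_steps: int, strength: float) -> int:
    """Number of denoising steps applied to the noised reference image.

    Assumption: strength scales the share of the schedule that is run,
    as is common in img2img pipelines. Fewer steps means less noise was
    added, so more of the reference survives.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    return min(round(num_inference_steps * strength), num_inference_steps)


# With a 50-step schedule:
for strength in (0.2, 0.6, 1.0):
    steps = denoising_steps(50, strength)
    print(f"strength={strength}: {steps} of 50 steps")
```

Under this assumption, a strength of 1.0 runs the full schedule from pure noise, which matches the description of the reference acting only as loose inspiration.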

Fine-Tuning Ready

Z-Image Base was specifically released to unlock community-driven fine-tuning and custom development. Train custom LoRA adapters to encode specific visual styles, characters, or brand aesthetics into reusable weights. This makes it the ideal foundation for building personalized image generation systems.
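For readers new to LoRA, the core idea fits in a few lines: a frozen weight matrix gets a small low-rank correction that can be stored and shared separately. The sketch below shows the general LoRA formulation on a toy 2x2 matrix in plain Python. It is not a Z-Image training recipe, and the helper names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product, enough for a toy example."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(weight: Matrix, lora_b: Matrix, lora_a: Matrix, alpha: float, rank: int) -> Matrix:
    """Return weight + (alpha / rank) * (B @ A); the base weight stays untouched."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(weight, delta)]


weight = [[1.0, 0.0], [0.0, 1.0]]  # frozen base weight (2x2)
lora_b = [[1.0], [2.0]]            # trainable factor B (2x1)
lora_a = [[0.5, 0.0]]              # trainable factor A (1x2)

adapted = apply_lora(weight, lora_b, lora_a, alpha=2.0, rank=1)
print(adapted)
```

Only the two small factors are trained and saved, which is what makes an adapter reusable across projects that share the same base model.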

Bilingual Text Rendering

One of Z-Image’s standout capabilities is its robust bilingual text rendering in both English and Chinese. Industry benchmarks show it outperforms many competitors in poster and text-in-image generation tasks.

Exceptional Value

At just $0.01 per image, Z-Image Base delivers premium quality at a fraction of typical costs—perfect for high-volume generation, rapid prototyping, and creative experimentation.
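As a quick back-of-the-envelope check, the sketch below multiplies the quoted $0.01 per image by a few batch sizes. It assumes flat per-image pricing with no volume tiers or extra fees, and estimate_cost is a hypothetical helper, not part of any SDK.

```python
from decimal import Decimal

# Per-image price in USD, as quoted in this article.
PRICE_PER_IMAGE = Decimal("0.01")


def estimate_cost(num_images: int) -> Decimal:
    """Estimated spend in USD, assuming flat per-image pricing."""
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    return PRICE_PER_IMAGE * num_images


for n in (1, 100, 10_000):
    print(f"{n:>6} images: ${estimate_cost(n):.2f}")
```

Decimal is used instead of float so that cent amounts stay exact when summed over large batches.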

Use Cases

Professional Content Creation

Marketing teams can generate consistent brand imagery with precise control over style and composition. The reference image guidance ensures visual consistency across campaigns, while negative prompting eliminates common quality issues.

Custom Model Development

Researchers and developers can use Z-Image Base as the foundation for specialized fine-tuned models. The non-distilled architecture preserves all the hooks needed for LoRA training and custom adaptation.

Rapid Prototyping

Product designers and creative directors can quickly iterate through visual concepts at minimal cost. Generate dozens of variations to explore different directions before committing to final designs.

Style-Guided Generation

Artists and illustrators can use reference images to maintain consistent aesthetics across a series. The strength control provides precise calibration between following references and allowing creative freedom.

Batch Content Production

Content creators, e-commerce teams, and social media managers can produce large volumes of images affordably. The combination of low per-image cost and high quality makes Z-Image Base ideal for scaling visual content production.

Getting Started on WaveSpeedAI

Using Z-Image Base through WaveSpeedAI is straightforward. Here’s how to generate your first image using the Python SDK:

import wavespeed

# Run Z-Image Base with a prompt and a negative prompt.
output = wavespeed.run(
    "wavespeed-ai/z-image/base",
    {
        "prompt": "A majestic snow leopard perched on a Himalayan cliff at golden hour, photorealistic, dramatic lighting",
        # Elements to steer the generation away from.
        "negative_prompt": "blurry, distorted, low quality, oversaturated"
    },
)

# Print the first generated result.
print(output["outputs"][0])

For reference image guidance, add an image parameter and, optionally, a strength value:

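import wavespeed

# Guide the generation with a reference image.
output = wavespeed.run(
    "wavespeed-ai/z-image/base",
    {
        "prompt": "Professional headshot in the same style",
        # Placeholder URL: replace with a link to your own reference image.
        "image": "https://your-reference-image.jpg",
        # 0-1; lower values stay closer to the reference.
        "strength": 0.6
    },
)

# Print the first generated result.
print(output["outputs"][0])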

WaveSpeedAI delivers Z-Image Base with the performance characteristics you expect: fast inference, no cold starts, and transparent pricing. Whether you’re generating a single test image or running thousands through an automated pipeline, you’ll get consistent, reliable results.
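For the automated-pipeline case, a simple loop is often enough to start with. The sketch below takes the run function as an argument so it can be dry-run offline; in real use you would pass wavespeed.run. The helper names generate_batch and fake_run are hypothetical, and the response shape is assumed to match the snippets above (output["outputs"][0]).

```python
from typing import Any, Callable, Dict, List

MODEL_ID = "wavespeed-ai/z-image/base"

RunFn = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def generate_batch(
    prompts: List[str],
    run: RunFn,
    negative_prompt: str = "blurry, distorted, low quality",
) -> List[str]:
    """Call the model once per prompt and collect the first output of each call."""
    results = []
    for prompt in prompts:
        payload = {"prompt": prompt, "negative_prompt": negative_prompt}
        response = run(MODEL_ID, payload)
        results.append(response["outputs"][0])
    return results


def fake_run(model_id: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Offline stand-in for wavespeed.run, used only for this dry run."""
    return {"outputs": [f"dry-run://{payload['prompt']}"]}


print(generate_batch(["red sneaker on white", "blue sneaker on white"], fake_run))
```

Injecting the run function keeps the loop easy to test; error handling, retries, and concurrency are left out here and would be worth adding before running thousands of requests.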

Pro Tips for Best Results

Be descriptive with your prompts: Z-Image processes text and image tokens in a single stream, so sentence structure matters. Use clear spatial relationships (“next to,” “behind,” “holding”) to guide composition.
Leverage negative prompts: Since Z-Image Base supports full CFG, use negative prompts strategically. Common additions like “blurry, distorted, extra limbs, watermark” can significantly improve output quality.
Start with strength 0.6 for references: When using reference images, 0.6 provides a good balance. Adjust down for closer reference matching, up for more prompt creativity.
Use the same seed for iterations: Keep the seed constant while tweaking prompts to iterate on a specific composition without starting from scratch each time.
Enable the Prompt Enhancer: The built-in prompt enhancement tool can automatically improve your descriptions for better results.
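To make the fixed-seed tip concrete, the sketch below builds a set of request payloads that share one seed while the prompt wording changes. The "seed" key is an assumption about the request schema, so check the model's API reference for the actual parameter name; prompt_variants is a hypothetical helper.

```python
from typing import Any, Dict, List


def prompt_variants(
    base_prompt: str,
    modifiers: List[str],
    seed: int,
    negative_prompt: str = "blurry, distorted, extra limbs, watermark",
) -> List[Dict[str, Any]]:
    """One payload per modifier, all sharing the same seed."""
    return [
        {
            "prompt": f"{base_prompt}, {modifier}",
            "negative_prompt": negative_prompt,
            "seed": seed,  # assumed parameter name, not confirmed by this article
        }
        for modifier in modifiers
    ]


payloads = prompt_variants(
    "A lighthouse on a rocky coast",
    ["at sunrise", "in heavy fog", "under a starry sky"],
    seed=1234,
)

for payload in payloads:
    print(payload["seed"], payload["prompt"])
```

Each payload could then be passed to wavespeed.run in turn, so that differences between results come from the prompt change rather than from a new random starting point.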

The Z-Image Advantage

In a landscape increasingly dominated by distilled models that sacrifice control for speed, Z-Image Base stands out by preserving what serious creators need: full CFG support, negative prompting, and fine-tuning capabilities. Combined with its competitive performance on major benchmarks and incredibly affordable pricing, it represents a compelling option for anyone who needs precise control over their AI-generated imagery.

Ready to experience the power and precision of Z-Image Base? Try it now on WaveSpeedAI and discover why this 6-billion-parameter model is making waves in the AI image generation community.