Seedream 4.0 to 5.0 Complete Tutorial: Text-to-Image, Editing, and Multi-Image Generation
ByteDance’s Seedream family has evolved rapidly from version 4.0 to 5.0, each release bringing new capabilities for image generation, editing, and intelligent reasoning. This tutorial covers the entire 4.0–5.0 range—what each version does best, which model variants to use, and how to get production-quality results through WaveSpeedAI’s API.
Model Family Overview
The Seedream 4.0–5.0 lineup supports three types of input—text, a single image, and multiple images—enabling text-to-image generation, image editing, multi-image fusion, and sequential batch generation with theme consistency.
Each major version has distinct strengths:
| Version | Positioning | Best For | Price (WaveSpeedAI) |
|---|---|---|---|
| 4.0 | High Efficiency | Fast iteration, layout-aware posters, grid designs, cost-sensitive production | $0.027/image |
| 4.5 | Deep Editing & Typography | Portraits, brand visuals, crisp text rendering, 4K poster composition | $0.04/image |
| 5.0-Lite | Lightweight 5.0 | Fast 5.0 generation and editing, accessible entry point | Available now |
| 5.0-Preview | Knowledge & Reasoning | Trending topics, web search, logical reasoning, domain-specific content | Coming soon |
Seedream 4.0: Layout-Aware Generation
Seedream 4.0 is optimized for multi-panel posters, concept designs with copy, series key visuals (KV), and social media assets. It excels at grid-based layouts, whitespace planning for titles and subtitles, and keeping text readable.
Key Specs
- Default output: 2048x2048 (2K)
- Maximum resolution: 4096x4096
- Inference speed: ~1.8s for a 2K image
- Aspect ratios: 1:1, 3:2, 4:3, 16:9, 21:9, and custom
Model Variants
Seedream 4.0 ships with four variants on WaveSpeedAI, each designed for a different workflow:
bytedance/seedream-v4 — Text-to-image. Generates images from text prompts. Ideal for posters, concept art, and social media graphics.
bytedance/seedream-v4/edit — Image-to-image. Modifies existing images: outfit swaps, background replacement, material changes, interior redesigns. Supports up to 10 reference images.
bytedance/seedream-v4/sequential — Batch text-to-image. Generates multiple images at once with cross-image consistency. Perfect for character sheets, advertising campaigns, and step diagrams.
bytedance/seedream-v4/edit-sequential — Batch image-to-image. Multi-image input with batch output. Enables multi-image fusion, style transfers across sets, and A/B variant comparisons.
Text-to-Image Prompting (V4)
When prompting Seedream 4.0, specify the subject, layout (grid, triptych, etc.), text placement (title, subtitle, CTA), and preferred style.
2x2 Grid Poster
2x2 grid poster layout, clean margins for typography, title at top center:
"SUMMER COLLECTION", subtitle: "New Arrivals 2026". Panel 1: beachside resort;
Panel 2: sunset cocktail; Panel 3: tropical flowers; Panel 4: ocean waves.
Consistent color grading, cinematic lighting, brand color #3CA2F6,
high legibility background, minimal clutter
Triptych
Horizontal triptych panels, left-to-right narrative: mountain sunrise ->
hiking trail -> summit celebration, unified palette warm earth tones,
soft vignette, clear gutters, strong typographic hierarchy,
space reserved for CTA "START YOUR ADVENTURE"
Minimalist Poster
Minimal poster, large title center: "INNOVATION SUMMIT", small subtitle
below: "March 2026 • San Francisco", single focal object: abstract
geometric sculpture, monochrome + accent #3CA2F6, high legibility
background, grid-based layout
Comic Strip
4-panel comic strip layout, speech bubble placeholders.
Panel 1: developer stares at screen; Panel 2: AI generates solution;
Panel 3: developer celebrates; Panel 4: "It was that easy?"
Bold line art, flat shading, clear gutters, high readability
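The four prompts above share one shape: a layout, optional title and subtitle text, numbered panels, and a style line. A minimal sketch of a helper that assembles that shape follows. The function name `build_layout_prompt` and its arguments are hypothetical and not part of any SDK; the result is a plain string to use as the prompt.

```python
def build_layout_prompt(layout, panels, title=None, subtitle=None, style=""):
    """Assemble a layout-aware prompt: layout, text placement, panels, style.

    Hypothetical helper for illustration; it only builds a string.
    """
    parts = [layout]
    if title:
        # Double quotes mark text that should appear in the image.
        parts.append(f'title at top center: "{title}"')
    if subtitle:
        parts.append(f'subtitle below: "{subtitle}"')
    prompt = ", ".join(parts) + "."
    # Number each panel so descriptions map onto grid cells in order.
    panel_text = "; ".join(
        f"Panel {i}: {desc}" for i, desc in enumerate(panels, start=1)
    )
    if panel_text:
        prompt += f" {panel_text}."
    if style:
        prompt += f" {style}"
    return prompt


prompt = build_layout_prompt(
    layout="2x2 grid poster layout, clean margins for typography",
    panels=["beachside resort", "sunset cocktail", "tropical flowers", "ocean waves"],
    title="SUMMER COLLECTION",
    subtitle="New Arrivals 2026",
    style="Consistent color grading, cinematic lighting, minimal clutter",
)
print(prompt)
```

Keeping the panel list as data makes it easy to swap scenes while the layout and typography instructions stay fixed.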
API Example: Text-to-Image
import wavespeed

output = wavespeed.run(
    "bytedance/seedream-v4",
    {"prompt": "2x2 grid poster, title: 'TECH EXPO 2026', four futuristic product concepts, clean margins, cinematic lighting, brand color blue"},
)
print(output["outputs"][0])
Image Editing (V4 Edit)
The edit variant modifies existing images while preserving subject identity, lighting, and composition. Use clear, structured prompts following the pattern: action + object + target feature + constraints.
Outfit Change
Outfit swap for portrait, replace clothing with elegant navy blazer;
keep pose and composition; accessories: gold watch;
makeup/hair unchanged; preserve skin tone and lighting;
clean edges, no artifacts
Background Replacement
Background replacement for subject, keep subject edges;
new environment: modern office with floor-to-ceiling windows;
match light direction and color temperature;
soft contact shadows; no haloing
Interior Redesign
Interior finish swap, update wall to exposed brick,
floor to dark hardwood, furniture upholstery to charcoal linen;
layout and lighting unchanged; realistic PBR textures
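The action + object + target feature + constraints pattern can be sketched as a small string builder. The helper name `build_edit_prompt` is hypothetical. The payload keys `prompt` and `image` simply mirror the editing example in this tutorial and are assumed rather than verified against the API reference.

```python
def build_edit_prompt(action, obj, target, constraints):
    """Join action + object + target feature + constraints into one instruction.

    Hypothetical helper for illustration; it only builds a string.
    """
    head = f"{action} {obj}, {target}"
    # Constraints state what must stay unchanged; drop empty entries.
    tail = "; ".join(c.strip() for c in constraints if c.strip())
    return f"{head}; {tail}" if tail else head


edit_prompt = build_edit_prompt(
    action="Background replacement for",
    obj="subject",
    target="new environment: modern office with floor-to-ceiling windows",
    constraints=[
        "keep subject edges",
        "match light direction and color temperature",
        "no haloing",
    ],
)

# Assumed payload shape, copied from the keys used in this tutorial's example.
payload = {"prompt": edit_prompt, "image": "https://example.com/portrait.jpg"}
print(payload["prompt"])
```

Listing constraints separately makes it harder to forget the "what stays the same" half of an edit instruction.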
API Example: Image Editing
import wavespeed

output = wavespeed.run(
    "bytedance/seedream-v4/edit",
    {
        "prompt": "Replace the background with a tropical beach at sunset, match light direction, soft shadows",
        "image": "https://example.com/portrait.jpg",
    },
)
print(output["outputs"][0])
Sequential Generation (V4 Sequential)
The sequential variant generates multiple images in one call with consistent style, identity, and palette across the set. You must specify the number of images in both the prompt and the max_images parameter.
Character Design Sheet
Generate 6 character sheets of a cyberpunk hacker.
Image 1: neutral pose; Image 2: action pose; Image 3: side profile;
Image 4: back view; Image 5: happy expression; Image 6: serious expression.
Same outfit and palette, clean turnaround style.
Advertising Campaign
Generate 4 poster concepts of the same coffee brand campaign.
Image 1: headline "WAKE UP", morning light;
Image 2: headline "FUEL UP", afternoon energy;
Image 3: headline "WIND DOWN", evening warmth;
Image 4: headline "DREAM ON", night ambiance.
Keep brand color brown/gold, consistent grid and margins, cinematic lighting.
API Example: Sequential Generation
import wavespeed

output = wavespeed.run(
    "bytedance/seedream-v4/sequential",
    {
        "prompt": "Generate 4 images of a sneaker in different colorways. Image 1: white/blue; Image 2: black/gold; Image 3: red/white; Image 4: green/cream. Studio lighting, identical angle and composition, clean background.",
        "max_images": 4,
    },
)
for url in output["outputs"]:
    print(url)
Cost note: The sequential model charges per max_images, not per actual output. If you set max_images=4 but only describe 2 images in your prompt, you’ll still be charged for 4. Always match the number in your prompt to max_images.
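One way to keep the prompt count and max_images from drifting apart is to derive both from the same list of descriptions. The sketch below is an assumption-laden illustration: `build_sequential_payload` and `estimated_cost` are hypothetical helpers, and the unit price is the $0.027 per image quoted for Seedream 4.0 in this tutorial.

```python
PRICE_PER_IMAGE_USD = 0.027  # Seedream 4.0 price quoted in this tutorial


def build_sequential_payload(subject, descriptions):
    """Build a payload whose prompt count and max_images always agree."""
    count = len(descriptions)
    if count == 0:
        raise ValueError("describe at least one image")
    items = "; ".join(
        f"Image {i}: {d}" for i, d in enumerate(descriptions, start=1)
    )
    prompt = f"Generate {count} images of {subject}. {items}."
    return {"prompt": prompt, "max_images": count}


def estimated_cost(payload, price_per_image=PRICE_PER_IMAGE_USD):
    """Billing follows max_images, so cost is max_images times the unit price."""
    return round(payload["max_images"] * price_per_image, 3)


payload = build_sequential_payload(
    "a sneaker in different colorways",
    ["white/blue", "black/gold", "red/white", "green/cream"],
)
print(payload["max_images"], estimated_cost(payload))
```

Because the count comes from `len(descriptions)`, adding or removing a description updates the prompt, max_images, and the cost estimate together.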
Seedream 4.5: Typography and Deep Editing
Seedream 4.5 builds on 4.0 with significant improvements in text rendering, prompt adherence, aesthetic quality, and reference image consistency. It’s the recommended choice for any work involving typography, branded visuals, or portrait editing.
Key Improvements Over 4.0
- Enhanced typography: Sharp, legible text for posters, logos, UI, and marketing layouts
- Designer-level composition: Handles complex poster-style layouts with clear hierarchy
- Stronger prompt adherence: Closely follows detailed descriptions for subjects, layout, and style
- Higher resolution: Supports 2560x1440 up to 4096x4096 (higher minimum than V4)
- Better reference consistency: Preserves facial features, lighting, and color tone from reference images
Model Variants
Like V4, Seedream 4.5 offers four variants on WaveSpeedAI:
| Variant | Model Path | Type | Use Case |
|---|---|---|---|
| Base | bytedance/seedream-v4.5 | Text-to-Image | Typography-heavy posters, brand visuals |
| Edit | bytedance/seedream-v4.5/edit | Image-to-Image | Portrait editing, product retouching |
| Sequential | bytedance/seedream-v4.5/sequential | Batch T2I | Consistent series, campaign sets |
| Edit-Sequential | bytedance/seedream-v4.5/edit-sequential | Batch I2I | Multi-image fusion, style transfers |
Recommended Resolutions (V4.5)
| Aspect Ratio | Suggested Resolution |
|---|---|
| 1:1 | 2048x2048 |
| 4:3 | 2688x2016 |
| 3:2 | 2688x1792 |
| 16:9 | 2560x1440 |
| Square 4K | 4096x4096 |
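The table above can be kept as a lookup so that requests always use a suggested resolution. This is a sketch under one assumption: that the request accepts a size string of the form WIDTHxHEIGHT, as the typography API example in this tutorial does. The names `RECOMMENDED_SIZES`, `size_for`, and `matches_ratio` are hypothetical.

```python
# Suggested resolutions for Seedream 4.5, copied from the table above.
RECOMMENDED_SIZES = {
    "1:1": "2048x2048",
    "4:3": "2688x2016",
    "3:2": "2688x1792",
    "16:9": "2560x1440",
}
SQUARE_4K = "4096x4096"


def size_for(aspect_ratio):
    """Return the suggested "WIDTHxHEIGHT" string for an aspect ratio."""
    if aspect_ratio not in RECOMMENDED_SIZES:
        raise ValueError(f"no suggested resolution for {aspect_ratio!r}")
    return RECOMMENDED_SIZES[aspect_ratio]


def matches_ratio(size, aspect_ratio):
    """Check that WIDTHxHEIGHT reduces exactly to the given W:H ratio."""
    width, height = (int(v) for v in size.split("x"))
    ratio_w, ratio_h = (int(v) for v in aspect_ratio.split(":"))
    return width * ratio_h == height * ratio_w


# Assumed payload shape: "size" as used in this tutorial's typography example.
payload = {"prompt": "Wide conference banner", "size": size_for("16:9")}
print(payload["size"])
```

The `matches_ratio` check is a quick guard against typos when adding custom entries to the lookup.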
Text Rendering Best Practices
Seedream 4.5’s standout feature is accurate text generation within images. Follow these guidelines for best results:
- Use double quotes around text that must appear in the image: Generate a poster with the title "Seedream 4.5"
- Specify font characteristics: “bold sans-serif”, “elegant script”, “handwritten”
- Describe text placement: “title top-center”, “subtitle below”, “CTA bottom-right”
- Keep text short: 1–10 words work best; long paragraphs may render inconsistently
- Use higher resolutions: 2048x2048 or above gives noticeably cleaner typography
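The quoting, font, placement, and length guidelines above can be enforced before a prompt is sent. The following is a minimal sketch; `quoted_text` and `typography_prompt` are hypothetical helpers, and the 10-word limit is taken from the guideline above rather than from any documented hard limit.

```python
def quoted_text(text, max_words=10):
    """Wrap on-image text in double quotes after basic sanity checks."""
    if '"' in text:
        raise ValueError("on-image text must not contain double quotes")
    word_count = len(text.split())
    if not 1 <= word_count <= max_words:
        raise ValueError(f"expected 1 to {max_words} words, got {word_count}")
    return f'"{text}"'


def typography_prompt(scene, text, font, placement):
    """Combine a scene description with font, placement, and quoted text."""
    return f"{scene} {font} title, {placement}: {quoted_text(text)}."


prompt = typography_prompt(
    scene="Minimalist tech conference poster, dark navy background.",
    text="AI SUMMIT 2026",
    font="Large white all-caps",
    placement="top-center",
)
print(prompt)
```

Rejecting over-long text early is cheaper than discovering garbled lettering after generation.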
Example: Brand Poster
Minimalist tech conference poster, dark navy background.
Large white all-caps title at the top: "AI SUMMIT 2026".
Small gray subtitle below: "San Francisco • June 15-17".
Abstract holographic geometric shape centered.
Brand color accent #3CA2F6. Clean grid layout, generous whitespace.
API Example: Typography-Heavy Generation
import wavespeed

output = wavespeed.run(
    "bytedance/seedream-v4.5",
    {
        "prompt": "Coffee shop menu board, chalkboard style, title 'DAILY SPECIALS' in bold chalk lettering, items: Espresso $3, Latte $4, Cappuccino $4.50, warm ambient lighting, cozy cafe atmosphere",
        "size": "2048x2048",
    },
)
print(output["outputs"][0])
Reference-Based Generation (V4.5 Edit)
Seedream 4.5 Edit excels at extracting and preserving visual characteristics from reference images:
Color Grading Transfer
Change Image 1's color tone to match Image 2's color tone
Makeup Transfer
Transfer the makeup from Image 2 onto the person in Image 1
Brand Style Application
Apply Image 1's brand design style to the product in Image 2,
create a similar brand series promotional image,
include all design modules from Image 1
Seedream 5.0-Preview: Intelligence and Reasoning
Seedream 5.0-Preview introduces capabilities that go beyond traditional image generation. It prioritizes knowledge and intelligence over pure aesthetics, adding real-time web search, precise editing control, and advanced logical reasoning.
Note: For pure visual beauty and photorealism, Seedream 4.5 remains the recommended choice. The full 5.0 release will combine both intelligence and aesthetics.
Real-Time Web Search
5.0-Preview is the first image generation model to support search-based generation. The model intelligently determines when to search based on your prompt:
- Time-sensitive terms: Recent product releases, current events
- Specific entities: Celebrities, brands, locations
- Long-tail queries: Niche topics requiring factual accuracy
Example prompts that trigger search:
Generate iPhone 17 Pro Max concept design
Reference the Duolingo app interface, design a vocabulary
flashcard page with word and streak counter, incorporate
the green owl mascot
Generate a Nordic Winter Olympics poster: Norwegian aurora
background, skier in national uniform, include Olympic
elements and mascot
Intelligent Logical Reasoning
5.0-Preview handles complex operations that require understanding context and multi-step decision-making:
Classification and Distribution
Classify the flowers in Image 1 by variety, arrange them
separately in the three vases shown in Image 2
Physical World Understanding
Two stationery rulers, top is a 20cm plastic ruler,
bottom is a 10cm steel ruler
3D Reasoning
Generate the 3D assembled form based on the packaging
flat layout diagram
Domain-Specific Knowledge
Reference this set of CAD drawings, generate a realistic
building visualization
Human respiratory system anterior view diagram showing:
nasal cavity, nostrils, oral cavity, pharynx, larynx,
trachea, left and right main bronchi, left and right
lungs, and diaphragm
Example-Based Editing
Instead of describing complex transformations, show the model what you want with before/after examples:
Reference the change from Image 1 to Image 2, apply the
same operation to Image 3
This works for hairstyle changes, scene swaps, material transformations, and perspective shifts.
Prompt Engineering Guide
These tips apply across all Seedream 4.0–5.0 versions.
Use Natural Language, Not Tag Lists
Write coherent narratives rather than fragmented keyword lists:
Avoid:
girl, lavish dress, parasol, tree-lined path, oil painting, Monet style
Prefer:
A girl in a lavish dress walking under a parasol along a tree-lined path,
in the style of a Monet oil painting
Prompt Structure Formula
[Subject] + [Action/Pose] + [Environment/Setting] + [Style] + [Technical Details] + [Text Content]
Example:
A professional barista (subject) crafting latte art (action) in a modern
specialty coffee shop (environment), photorealistic style (style),
warm morning light through large windows, shallow depth of field (technical),
a chalkboard behind them reading "ARTISAN ROASTERS" (text content)
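The structure formula maps naturally onto a small data class with one field per slot. This is only an illustrative sketch: `PromptParts` and its `render` method are hypothetical names, and the output is an ordinary prompt string.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    """One field per slot in the prompt structure formula."""

    subject: str
    action: str
    environment: str
    style: str = ""
    technical: str = ""
    text_content: str = ""

    def render(self):
        # Subject, action, and environment read as one natural-language clause.
        sentence = f"{self.subject} {self.action} {self.environment}"
        # Optional slots are appended in formula order, skipping empty ones.
        extras = [p for p in (self.style, self.technical, self.text_content) if p]
        return ", ".join([sentence] + extras)


parts = PromptParts(
    subject="A professional barista",
    action="crafting latte art",
    environment="in a modern specialty coffee shop",
    style="photorealistic style",
    technical="warm morning light through large windows, shallow depth of field",
    text_content='a chalkboard behind them reading "ARTISAN ROASTERS"',
)
print(parts.render())
```

Filling named slots makes it obvious when a prompt is missing a style or technical detail, while the rendered result still reads as a sentence rather than a tag list.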
Editing Prompts
For image editing, use specific, unambiguous instructions that explicitly state what changes and what stays the same:
Avoid: Make it look better
Prefer: Replace the overcast sky with a vivid sunset backdrop, warm orange tones; keep the building and foreground unchanged
Visual Markup for Complex Edits
When text descriptions alone aren’t enough for precise positioning, use arrows, bounding boxes, or doodles on the reference image to designate specific regions for modification.
Common Mistakes
- Conflicting instructions: “Photorealistic cartoon character” — choose one style direction
- Overcomplicating prompts: Start simple, add detail incrementally
- Ignoring aspect ratio: Match dimensions to your use case (square for social media, landscape for banners)
- Vague editing instructions: Avoid pronouns like “change it” — specify what “it” is
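Two of the mistakes above, vague references and conflicting style words, can be caught with a rough pre-flight check. The sketch below is a heuristic only: `lint_prompt`, the regular expression, and the style pairs are hypothetical choices made for illustration, and they will miss many cases.

```python
import re

# Matches vague edit phrases such as "change it" or "make it look better".
VAGUE_EDIT = re.compile(r"\b(change|fix|make|improve)\s+(it|this|that)\b", re.IGNORECASE)

# Style words that pull in opposite directions; extend as needed.
CONFLICTING_STYLES = [("photorealistic", "cartoon")]


def lint_prompt(prompt):
    """Return a list of warnings for common prompt mistakes (heuristic)."""
    warnings = []
    lowered = prompt.lower()
    if VAGUE_EDIT.search(prompt):
        warnings.append("vague reference: name the object to change")
    for first, second in CONFLICTING_STYLES:
        if first in lowered and second in lowered:
            warnings.append(f"conflicting styles: {first} vs {second}")
    return warnings


print(lint_prompt("Make it look better"))
print(lint_prompt("Photorealistic cartoon character"))
```

An empty list means only that none of these specific patterns matched, not that the prompt is well formed.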
Choosing the Right Version
Quick Decision Guide
- Need speed and low cost? → Seedream 4.0
- Need crisp text in images? → Seedream 4.5
- Need brand-quality posters? → Seedream 4.5
- Need consistent multi-image sets? → V4 or V4.5 Sequential
- Need to edit existing photos? → V4 or V4.5 Edit
- Need current-event imagery? → Seedream 5.0-Preview
- Need knowledge-driven content? → Seedream 5.0-Preview
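For automated pipelines, the decision guide can be written down as a lookup table. This is a sketch only: the requirement keys such as `crisp_text` are hypothetical labels, while the recommendations are taken directly from the list above.

```python
# Requirement label (hypothetical) -> recommendation from the decision guide.
DECISION_GUIDE = {
    "speed": "Seedream 4.0",
    "low_cost": "Seedream 4.0",
    "crisp_text": "Seedream 4.5",
    "brand_posters": "Seedream 4.5",
    "multi_image_sets": "V4 or V4.5 Sequential",
    "photo_editing": "V4 or V4.5 Edit",
    "current_events": "Seedream 5.0-Preview",
    "knowledge_driven": "Seedream 5.0-Preview",
}


def recommend(need):
    """Look up the recommended version for a single requirement."""
    try:
        return DECISION_GUIDE[need]
    except KeyError:
        known = ", ".join(sorted(DECISION_GUIDE))
        raise ValueError(f"unknown need {need!r}; expected one of: {known}") from None


print(recommend("crisp_text"))
```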
Detailed Comparison
| Capability | 4.0 | 4.5 | 5.0-Preview |
|---|---|---|---|
| Text-to-Image | Yes | Yes | Yes |
| Image Editing | Yes | Yes (better) | Yes |
| Multi-Image | Yes | Yes | Yes |
| Sequential Generation | Yes | Yes | Yes |
| Text Rendering | Good | Excellent | Good |
| Web Search | No | No | Yes |
| Logical Reasoning | Basic | Basic | Advanced |
| Max Resolution | 4096x4096 | 4096x4096 | 4K |
| Min Resolution | ~320x320 | 2560x1440 | — |
| Speed | Fastest | Moderate | Moderate |
| Cost | $0.027 | $0.04 | — |
Version Limitations
Seedream 4.0: Small text may repeat or degrade; editing accuracy lower than 4.5.
Seedream 4.5: Occasional blur or cropping issues; higher cost and generation time than 4.0.
Seedream 5.0-Preview: Outputs can look noticeably AI-generated; occasional proportion issues; unstable text structure; limited chart/data reasoning. Currently prioritizes intelligence over aesthetics.
All Available Models on WaveSpeedAI
| Model | Type | Price | Best For |
|---|---|---|---|
| bytedance/seedream-v4 | Text-to-Image | $0.027 | Posters, grid layouts, concept designs |
| bytedance/seedream-v4/edit | Image-to-Image | $0.027 | Outfit swaps, background changes, retouching |
| bytedance/seedream-v4/sequential | Batch T2I | $0.027/image | Character sheets, campaign sets |
| bytedance/seedream-v4/edit-sequential | Batch I2I | $0.027/image | Multi-image fusion, A/B variants |
| bytedance/seedream-v4.5 | Text-to-Image | $0.04 | Typography, brand visuals, 4K posters |
| bytedance/seedream-v4.5/edit | Image-to-Image | $0.04 | Portrait editing, style/feature transfer |
| bytedance/seedream-v4.5/sequential | Batch T2I | $0.04/image | Branded series, consistent campaigns |
| bytedance/seedream-v4.5/edit-sequential | Batch I2I | $0.04/image | Multi-image editing, design exploration |
| bytedance/seedream-v5.0-lite | Text-to-Image | — | 5.0 generation |
| bytedance/seedream-v5.0-lite/edit | Image-to-Image | — | 5.0 editing |
| bytedance/seedream-v5.0-lite/sequential | Batch T2I | — | 5.0 batch generation |
| bytedance/seedream-v5.0-lite/edit-sequential | Batch I2I | — | 5.0 batch editing |
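The model paths in the table follow a regular pattern: a version prefix plus an optional variant suffix. A minimal sketch of building them programmatically is shown here; `model_path` and the dictionary keys are hypothetical, and the paths themselves are the ones listed in the table.

```python
# Version prefixes and variant suffixes as they appear in the table above.
VERSION_PREFIX = {
    "4.0": "bytedance/seedream-v4",
    "4.5": "bytedance/seedream-v4.5",
    "5.0-lite": "bytedance/seedream-v5.0-lite",
}
VARIANT_SUFFIX = {
    "base": "",
    "edit": "/edit",
    "sequential": "/sequential",
    "edit-sequential": "/edit-sequential",
}


def model_path(version, variant="base"):
    """Combine a version and variant into a model path from the table."""
    if version not in VERSION_PREFIX:
        raise ValueError(f"unknown version {version!r}")
    if variant not in VARIANT_SUFFIX:
        raise ValueError(f"unknown variant {variant!r}")
    return VERSION_PREFIX[version] + VARIANT_SUFFIX[variant]


print(model_path("4.5", "edit"))
```

Validating both parts up front surfaces a typo as a clear error before any request is made.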
Getting Started
- Sign up at WaveSpeedAI and get your API key
- Install the SDK: pip install wavespeed
- Pick your model based on the decision guide above
- Write your prompt using the structure formula and best practices
- Generate and iterate: Refine prompts based on results
import wavespeed

# Text-to-Image with Seedream 4.5
output = wavespeed.run(
    "bytedance/seedream-v4.5",
    {"prompt": "A sleek product showcase poster, title 'NEXT GEN' in bold white sans-serif, dark gradient background, floating smartphone with holographic screen, cinematic lighting, brand color #3CA2F6"},
)
print(output["outputs"][0])

import wavespeed

# Image Editing with Seedream 4.0
output = wavespeed.run(
    "bytedance/seedream-v4/edit",
    {
        "prompt": "Change the outfit to a formal black suit, keep the same pose and background lighting",
        "image": "https://example.com/portrait.jpg",
    },
)
print(output["outputs"][0])

import wavespeed

# Sequential Generation with Seedream 4.0
output = wavespeed.run(
    "bytedance/seedream-v4/sequential",
    {
        "prompt": "Generate 3 step-by-step tutorial visuals for making pour-over coffee. Image 1: grinding beans; Image 2: pouring water in circular motion; Image 3: finished cup with steam. Uniform warm style, numbered labels.",
        "max_images": 3,
    },
)
for url in output["outputs"]:
    print(url)
Whether you’re building marketing automation, creating social media content at scale, or developing creative applications, the Seedream 4.0–5.0 family on WaveSpeedAI provides the full spectrum from fast iteration to intelligent, knowledge-driven generation.




