MiniMax Image-01 Image-to-Image
MiniMax Image-01 Image-to-Image transforms existing images using text prompts. Part of the MiniMax image-01 family, it lets you generate variations, apply style transfers, modify compositions, and create character-consistent images from reference photos. It is suited to creative workflows, product visualization, and content creation.
Key Features
- Image-Based Generation: Generate new images based on an existing image input combined with text prompts. The model intelligently understands the reference image and applies your text description to create variations.
- Character Reference Support: Use portrait photos as character references to maintain consistent character appearance across generated images. Ideal for creating character variations, different poses, or placing characters in new scenes.
- Flexible Image Dimensions: Specify custom dimensions from 512×512 to 2048×2048 pixels (must be divisible by 8) for precise control over output size. Common sizes include 1024×1024, 1280×720, 1152×864, and more.
- Prompt Optimization: Built-in prompt optimizer automatically enhances your text descriptions for better generation results, helping you achieve the desired output even with simple prompts.
- Batch Generation: Generate up to 9 images in a single request, perfect for exploring variations and selecting the best result.
- Reproducible Results: Use seed values to generate consistent results across multiple runs, essential for iterative refinement and production workflows.
Use Cases
- Product Visualization: Transform product photos into different contexts, backgrounds, or styles
- Character Art: Create consistent character variations with different poses, outfits, or environments
- Style Transfer: Apply artistic styles or visual treatments to existing images
- Image Editing: Modify specific aspects of an image through natural language descriptions
- Creative Exploration: Generate multiple variations of a concept for selection and refinement
- Content Creation: Quickly produce social media content, marketing materials, or creative assets
Supported Formats & Dimensions
Input Image Formats:
- JPG, JPEG, PNG
- Maximum file size: 10MB
- Accepts public URLs or Base64-encoded Data URLs
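As a minimal sketch of the Base64 option above, the following encodes raw image bytes as a Data URL for the image field. The helper name to_data_url is hypothetical, and the code assumes the 10MB limit applies to the raw file size; the page does not say whether the limit is measured before or after encoding.

```python
import base64

# Assumption: "10MB" is read as 10 MiB of raw (pre-encoding) file bytes.
MAX_BYTES = 10 * 1024 * 1024

# Documented input formats mapped to their standard MIME types.
MIME_TYPES = {"jpg": "image/jpeg", "jpeg": "image/jpeg", "png": "image/png"}


def to_data_url(data: bytes, extension: str) -> str:
    """Encode raw image bytes as a Base64 Data URL (hypothetical helper)."""
    ext = extension.lower().lstrip(".")
    if ext not in MIME_TYPES:
        raise ValueError(f"unsupported format: {extension!r}")
    if len(data) > MAX_BYTES:
        raise ValueError("image exceeds the 10MB limit")
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{MIME_TYPES[ext]};base64,{encoded}"
```

In practice the bytes would come from reading a local file in binary mode, and the returned string would be passed as the image value.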
Output Dimensions:
- Width/Height range: 512 to 2048 pixels
- Must be divisible by 8
- Common sizes: 1024×1024 (square), 1280×720 (widescreen), 1152×864 (standard), 1248×832 (photo), 832×1248 (portrait photo), 864×1152 (portrait), 720×1280 (mobile/vertical), 1344×576 (ultra-wide)
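The dimension rules above can be checked before sending a request. This is a small sketch, not part of any official client; the function name is_valid_dimensions is an assumption made for illustration.

```python
MIN_SIDE = 512
MAX_SIDE = 2048
STEP = 8  # width and height must each be divisible by 8


def is_valid_dimensions(width: int, height: int) -> bool:
    """Return True if both sides satisfy the documented range and step."""
    return all(
        MIN_SIDE <= side <= MAX_SIDE and side % STEP == 0
        for side in (width, height)
    )


# The common sizes listed above all satisfy the rules.
COMMON_SIZES = [
    (1024, 1024), (1280, 720), (1152, 864), (1248, 832),
    (832, 1248), (864, 1152), (720, 1280), (1344, 576),
]
all_common_valid = all(is_valid_dimensions(w, h) for w, h in COMMON_SIZES)
```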
How to Use
Basic Image-to-Image Generation
- Upload Reference Image
  - Provide a public URL or Base64-encoded image in the image field
  - Ensure the image is in JPG, JPEG, or PNG format and under 10MB
- Write Your Prompt
  - Describe the desired output in the prompt field (max 1500 characters)
  - Example: "Transform this photo into a watercolor painting style with soft pastel colors"
- Select Image Size
  - Specify dimensions using the size parameter, for example "1024 * 1024" or "1280 * 720"
  - Choose dimensions that match your use case (square for social media, widescreen for presentations, etc.)
- Configure Options
  - num_images: Set 1-9 to generate multiple variations
  - prompt_optimizer: Enable for automatic prompt enhancement
  - seed: Use for reproducible results
- Generate
  - Submit your request and receive generated images as URLs or Base64 strings
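The steps above can be sketched as a request body assembled in Python. This page does not give the endpoint URL or the authentication scheme, so the sketch stops at the JSON body rather than guessing them. The function name build_payload and the example.com image URL are placeholders, and the size string follows the "1024 * 1024" form shown under API Parameters; confirm the exact separator spacing against the API reference.

```python
import json
from typing import Optional


def build_payload(
    prompt: str,
    image: str,
    size: str = "1024 * 1024",
    num_images: int = 1,
    prompt_optimizer: Optional[bool] = None,
    seed: Optional[int] = None,
) -> dict:
    """Assemble the request body from the documented parameters."""
    payload = {
        "prompt": prompt,
        "image": image,
        "size": size,
        "num_images": num_images,
    }
    # Optional settings are sent only when set, so server defaults apply otherwise.
    if prompt_optimizer is not None:
        payload["prompt_optimizer"] = prompt_optimizer
    if seed is not None:
        payload["seed"] = seed
    return payload


payload = build_payload(
    prompt="Transform this photo into a watercolor painting style with soft pastel colors",
    image="https://example.com/reference.png",  # placeholder URL
    size="1280 * 720",
    num_images=3,
    seed=42,
)
body = json.dumps(payload)  # JSON text to send as the request body
```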
Character Reference Generation
For consistent character appearance:
- Prepare Portrait Photo
  - Use a clear, front-facing portrait with good lighting
  - Single person in frame works best
- Configure Character Reference
  - Use the subject_reference parameter with type "character"
  - Provide the portrait image URL or Base64 data
- Describe the Scene
  - Write a prompt describing the desired scene, pose, or context
  - Example: "The character standing on a mountain peak at sunset"
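A character reference might be attached to a request body as sketched below. Only the parameter name subject_reference, its array shape, and the type value "character" come from this page. The key image_file that carries the portrait is an assumption and should be checked against the API reference, as is the helper name with_character_reference.

```python
def with_character_reference(payload: dict, portrait: str) -> dict:
    """Return a copy of payload with one character reference attached.

    portrait is a public URL or Base64 data for the portrait photo.
    """
    updated = dict(payload)  # shallow copy, so the caller's dict is untouched
    updated["subject_reference"] = [
        {
            "type": "character",
            "image_file": portrait,  # assumed key name, not confirmed by this page
        }
    ]
    return updated


base = {
    "prompt": "The character standing on a mountain peak at sunset",
    "image": "https://example.com/reference.png",  # placeholder URL
}
request_body = with_character_reference(base, "https://example.com/portrait.png")
```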
API Parameters
- prompt (required): Text description of desired output (max 1500 chars)
- image (required): Reference image as URL or Base64 string
- size: Image dimensions (e.g., "1024 * 1024", "1280 * 720")
- num_images: Number of images to generate (1-9, default: 1)
- seed: Random seed for reproducible results
- prompt_optimizer: Enable automatic prompt enhancement (boolean)
- enable_base64_output: Return Base64 instead of URLs (boolean)
- enable_sync_mode: Wait for generation to complete before returning (boolean)
- subject_reference: Array of character reference images (optional)
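The constraints in the parameter list can be checked client-side before a request is sent. The following is a sketch under the limits stated above; validate_params is a hypothetical helper, and the server may enforce additional rules not covered here.

```python
REQUIRED = ("prompt", "image")
BOOLEAN_FLAGS = ("prompt_optimizer", "enable_base64_output", "enable_sync_mode")
MAX_PROMPT_CHARS = 1500


def validate_params(params: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    errors = []
    for key in REQUIRED:
        if not params.get(key):
            errors.append(f"{key} is required")
    prompt = params.get("prompt", "")
    if isinstance(prompt, str) and len(prompt) > MAX_PROMPT_CHARS:
        errors.append("prompt exceeds 1500 characters")
    count = params.get("num_images", 1)
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if not isinstance(count, int) or isinstance(count, bool) or not 1 <= count <= 9:
        errors.append("num_images must be an integer from 1 to 9")
    for flag in BOOLEAN_FLAGS:
        if flag in params and not isinstance(params[flag], bool):
            errors.append(f"{flag} must be a boolean")
    return errors
```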
Pricing
- $0.0035 per image
- Generate multiple images in one request for efficient batch processing
- Total cost = $0.0035 × number of images generated
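The cost formula above is simple enough to compute directly. This sketch uses Decimal so that fractions of a cent are not distorted by binary floating point; the function name total_cost is chosen for illustration.

```python
from decimal import Decimal

PRICE_PER_IMAGE = Decimal("0.0035")  # USD per generated image, as listed above


def total_cost(num_images: int) -> Decimal:
    """Total cost in USD for a given number of generated images."""
    if num_images < 0:
        raise ValueError("num_images cannot be negative")
    return PRICE_PER_IMAGE * num_images


full_batch = total_cost(9)  # one request at the maximum batch size
```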
Output Format
Generated images are returned as:
- URLs (default): Direct links to generated images hosted on WaveSpeedAI
- Base64 (optional): Encoded image data for direct embedding
Response includes:
- Unique request ID for tracking
- Generation status (created, processing, completed, failed)
- Output array with generated image URLs or Base64 data
- NSFW content detection flags
- Creation timestamp
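When polling for results, the four status values listed above can be modeled as an enumeration. The sketch assumes that completed and failed are final states and that created and processing mean the request is still in progress; that reading is an interpretation of the status names, not something this page states.

```python
from enum import Enum


class Status(str, Enum):
    CREATED = "created"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumption: these two states are final; the others mean "keep waiting".
TERMINAL = {Status.COMPLETED, Status.FAILED}


def is_finished(status: str) -> bool:
    """True once a request has reached a final state.

    Raises ValueError for a status string outside the documented set.
    """
    return Status(status) in TERMINAL
```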
Best Practices
- Use clear, well-lit reference images for best results
- For character references, front-facing portraits work best
- Enable prompt optimizer if you're new to prompt writing
- Use seed values when iterating on a specific result
- Generate multiple variations (num_images > 1) to select the best output
- Keep prompts descriptive but concise for optimal results
Related Models
Also available on WaveSpeedAI:
- minimax/image-01/text-to-image - Generate images from text prompts only