Minimax Video-02

Hailuo 02 - MiniMax's next-generation AI video model with 2.5x efficiency improvement, 85% complex instruction response rate, and industry-leading cost-effectiveness for generating high-quality 6-second videos.

Features

Hailuo 02 - Next Generation AI Video Model

Hailuo 02 is MiniMax's AI video generation model and a significant upgrade from Hailuo 01. It is currently ranked #2 globally in both image-to-video and text-to-video benchmarks, ahead of Kuaishou's Kling and Google's Veo3 and behind only ByteDance's recently released Seedance 1.0.

🚀 Model Highlights

Industry-Leading Performance

  • 2.5x Efficiency Boost: Training and inference are both 2.5x more efficient
  • 3x Model Parameters: Significantly enhanced model capacity
  • 4x Training Data: Massive dataset expansion for superior quality
  • 85% Complex Instruction Response Rate: Exceptional understanding of intricate prompts

Architectural Innovation

Hailuo 02 features a completely redesigned DiT (Diffusion Transformer) architecture, abandoning the previous framework for a more efficient and powerful system that delivers:

  • Enhanced temporal consistency
  • Superior motion dynamics
  • Exceptional physical realism

Cost-Effectiveness Champion

  • Lowest Price in Top Tier: Most affordable among leading video generation models
  • Unchanged Training Costs: Costs held steady alongside the 2.5x efficiency gains
  • Premium Quality at Budget Price: Professional-grade results without premium pricing

🎯 Key Features

Resolution Options

  • 768p: Standard quality for quick previews and drafts
  • 1080p: Full HD for professional applications

Advanced Capabilities

  • Extreme Physics Simulation: Generates complex physical scenarios like acrobatics, fluid dynamics, and intricate movements
  • Cinematic Camera Control: Professional camera movements including panning, tilting, tracking, and complex trajectories
  • Multi-Style Support: From photorealistic to artistic, anime to documentary styles
  • Consistent Character Generation: Maintains character appearance throughout the video

💡 Application Scenarios

Film & Television Production

Rapidly generate complex VFX shots, including acrobatics, fantasy scenes, and challenging physical performances, dramatically reducing production costs and time.

Advertising & Creative

Provide brands with cost-effective, high-quality video content that meets diverse creative requirements while maintaining professional standards.

Content Creation

Empower creators and influencers to produce engaging video content efficiently, enhancing productivity without compromising quality.

Educational Entertainment

Generate educational animations, virtual performances, and engaging content that combines learning with entertainment value.

Corporate Communications

Offer SMEs affordable promotional videos that elevate brand image and market competitiveness without breaking the budget.

📊 Technical Specifications

  • Video Duration: 6 seconds (with plans for extended duration)
  • Frame Rate: 25 fps
  • Supported Formats: MP4, MOV
  • Input Types: Text prompts, reference images
  • Processing Time: Optimized for rapid generation

🔧 Usage Guidelines

Best Practices

  1. Detailed Prompts: Leverage the 85% complex instruction response rate with comprehensive descriptions
  2. High-Quality References: Use clear, high-resolution images for image-to-video generation
  3. Style Consistency: Specify desired artistic style for coherent results
  4. Physics Descriptions: Take advantage of advanced physics capabilities with specific motion descriptions

Limitations

  • Current maximum duration: 6 seconds
  • Output quality depends on input prompt/image quality
  • Designed for creative synthesis, not documentary accuracy

🛡️ Responsible Use

This model must not be used for:

  • Generating harmful, illegal, or deceptive content
  • Creating non-consensual or inappropriate material
  • Violating privacy or intellectual property rights
  • Spreading misinformation or propaganda
  • Any activity violating local or international laws

🌟 Why Choose Hailuo 02?

  1. Performance Leader: #2 globally, surpassing established competitors
  2. Cost Efficiency: Best price-performance ratio in the industry
  3. Technical Excellence: 2.5x efficiency with 3x parameters
  4. Versatility: Handles extreme complexity with ease
  5. Future-Ready: Continuous improvements and feature expansions

Experience the next generation of AI video generation with Hailuo 02 - where cutting-edge technology meets practical affordability.

Authentication

For authentication details, please refer to the Authentication Guide.

API Endpoints

Submit Task & Query Result


# Submit the task
curl --location --request POST "https://api.wavespeed.ai/api/v3/minimax/video-02" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
    "prompt": "Circus scene. The camera follows a clown riding a unicycle while juggling balls. The camera pulls back, tracks left, and tilts left",
    "resolution": "768p",
    "duration": 6,
    "enable_prompt_expansion": true
}'

# Get the result (set requestId to the data.id value returned by the submit call)
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
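The same two calls can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the function names (`http_send`, `submit_task`, `wait_for_result`), the injectable `send` callable, and the polling interval and timeout are assumptions made for illustration. The URLs, headers, and field names are taken from the curl commands and the parameter tables on this page.

```python
import json
import time
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"


def http_send(method, url, api_key, body=None):
    """Send one request with urllib and return the decoded JSON reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Authorization": f"Bearer {api_key}"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    request = urllib.request.Request(url, data=data, headers=headers, method=method)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def submit_task(api_key, payload, send=http_send):
    """POST the task and return its ID (the data.id field of the reply)."""
    reply = send("POST", f"{API_BASE}/minimax/video-02", api_key, payload)
    return reply["data"]["id"]


def wait_for_result(api_key, task_id, send=http_send, interval=2.0, timeout=600.0):
    """Poll the result endpoint until the status is completed or failed.

    interval and timeout are illustrative values, not documented limits.
    """
    url = f"{API_BASE}/predictions/{task_id}/result"
    deadline = time.monotonic() + timeout
    while True:
        data = send("GET", url, api_key)["data"]
        if data["status"] == "completed":
            return data["outputs"]
        if data["status"] == "failed":
            raise RuntimeError(f"task {task_id} failed: {data.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still {data['status']} after {timeout}s")
        time.sleep(interval)


# Example use (requires a real key in the WAVESPEED_API_KEY environment variable):
#   import os
#   key = os.environ["WAVESPEED_API_KEY"]
#   task_id = submit_task(key, {"prompt": "A clown juggling on a unicycle",
#                               "resolution": "768p", "duration": 6})
#   print(wait_for_result(key, task_id))
```

Passing the transport in as `send` keeps the polling logic separate from the network layer, so the loop can be exercised with a stub before any real request is made.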

Parameters

Task Submission Parameters

Request Parameters

| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
| prompt | string | Yes | Circus scene. The camera follows a clown riding a unicycle while juggling balls. The camera pulls back, tracks left, and tilts left | - | Text description of the video to generate (maximum 2000 characters). 1. Camera movement instructions are supported: insert them in square brackets [ ] at the point in the prompt where the movement should apply. The standard format is [C1,C2,C3], where each C is a type of camera movement. For best results, combine no more than 3 instructions. 2. Camera movement can also be described in natural language; using the built-in instruction names improves the accuracy of the response. 3. Bracketed instructions and natural language descriptions can be used together. |
| image | string | No | - | - | Image that the model uses as the first frame of the video. Pass either a Base64-encoded string in the format data:image/jpeg;base64,{data} or a publicly accessible URL. The image must meet these conditions: format is JPG/JPEG/PNG; aspect ratio is greater than 2:5 and less than 5:2; short side is longer than 300px; file size does not exceed 20MB. |
| resolution | string | No | 768p | - | Video resolution |
| duration | integer | No | 6 | 6 | Video duration in seconds |
| enable_prompt_expansion | boolean | No | true | - | The model automatically optimizes incoming prompts to improve generation quality. |
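The constraints in the table can be checked on the client before a task is submitted. The following is a rough sketch under stated assumptions: the helper names (`check_prompt`, `check_image`, `to_data_uri`) are hypothetical, "20MB" is read as 20 × 1024 × 1024 bytes, and the caller is assumed to already know the image's format, pixel dimensions, and byte size. The API itself may apply these rules differently.

```python
import base64
import re

MAX_PROMPT_CHARS = 2000              # limit stated for the prompt parameter
MAX_IMAGE_BYTES = 20 * 1024 * 1024   # "20MB" interpreted as 20 MiB (assumption)
ALLOWED_FORMATS = {"jpg", "jpeg", "png"}


def check_prompt(prompt):
    """Return a list of problems found in the prompt (an empty list means OK)."""
    problems = []
    if not prompt:
        problems.append("prompt is required")
    if len(prompt) > MAX_PROMPT_CHARS:
        problems.append(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    # Each [...] group holds comma-separated camera movement instructions.
    for group in re.findall(r"\[([^\[\]]*)\]", prompt):
        count = len([part for part in group.split(",") if part.strip()])
        if count > 3:
            problems.append(f"[{group}] combines {count} instructions (at most 3 recommended)")
    return problems


def check_image(fmt, width, height, size_bytes):
    """Return a list of problems for an image described by its basic properties."""
    problems = []
    if fmt.lower() not in ALLOWED_FORMATS:
        problems.append(f"unsupported format: {fmt}")
    # 2:5 < width:height < 5:2, compared with integers to avoid rounding issues.
    # The range is symmetric, so swapping width and height gives the same answer.
    if not (5 * width > 2 * height and 2 * width < 5 * height):
        problems.append("aspect ratio must be greater than 2:5 and less than 5:2")
    if min(width, height) <= 300:
        problems.append("short side must be longer than 300px")
    if size_bytes > MAX_IMAGE_BYTES:
        problems.append("file size exceeds 20MB")
    return problems


def to_data_uri(image_bytes, mime="image/jpeg"):
    """Encode raw image bytes in the data:{mime};base64,{data} form."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass.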

Response Parameters

| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data.id | string | Unique identifier for the prediction (Task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
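To show how these fields fit together, the sketch below reads a decoded submit response into a small record. The `Prediction` class and `parse_prediction` function are hypothetical names, and treating any `code` other than 200 as an error is an assumption; only the field names come from the table above.

```python
from dataclasses import dataclass, field
from typing import List

# Statuses after which the task will not change again.
TERMINAL_STATUSES = {"completed", "failed"}


@dataclass
class Prediction:
    id: str                      # data.id, used later to query the result
    model: str                   # data.model
    status: str                  # created, processing, completed, or failed
    result_url: str              # data.urls.get
    outputs: List[str] = field(default_factory=list)
    error: str = ""

    @property
    def is_finished(self):
        """True once the task has reached completed or failed."""
        return self.status in TERMINAL_STATUSES


def parse_prediction(reply):
    """Turn a decoded JSON reply into a Prediction, raising on a non-200 code."""
    if reply.get("code") != 200:
        raise ValueError(f"request rejected: {reply.get('code')} {reply.get('message')}")
    data = reply["data"]
    return Prediction(
        id=data["id"],
        model=data.get("model", ""),
        status=data["status"],
        result_url=(data.get("urls") or {}).get("get", ""),
        outputs=list(data.get("outputs") or []),
        error=data.get("error") or "",
    )
```

A freshly submitted task would normally parse to a `Prediction` whose `is_finished` is false and whose `outputs` list is empty, since outputs are only filled in once the status is completed.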

Result Query Parameters

Result Request Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| id | string | Yes | - | Task ID |

Result Response Parameters

| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data | object | The prediction data object containing all details |
| data.id | string | Unique identifier for the prediction |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
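One way to consume a result response is sketched below. The helper names (`safe_outputs`, `inference_seconds`) are hypothetical. The code assumes that `has_nsfw_contents` lines up by position with `outputs`, and it treats an output with no matching flag as unflagged; neither detail is confirmed beyond the "for each output" wording in the table.

```python
def safe_outputs(data):
    """Return output URLs from a result's data object, skipping NSFW-flagged ones.

    Raises RuntimeError when the task failed; returns an empty list while the
    task is still created or processing.
    """
    status = data.get("status")
    if status == "failed":
        raise RuntimeError(data.get("error") or "task failed")
    if status != "completed":
        return []
    outputs = data.get("outputs") or []
    flags = list(data.get("has_nsfw_contents") or [])
    # Pad the flag list so every URL has a flag (missing flags count as False).
    flags += [False] * (len(outputs) - len(flags))
    return [url for url, flagged in zip(outputs, flags) if not flagged]


def inference_seconds(data):
    """Convert data.timings.inference from milliseconds to seconds, if present."""
    millis = (data.get("timings") or {}).get("inference")
    return None if millis is None else millis / 1000
```

For example, a completed result with two outputs and flags `[False, True]` would yield only the first URL.
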
© 2025 WaveSpeedAI. All rights reserved.