Minimax Hailuo Models

Minimax Hailuo 2.3 for professional video generation, plus speech synthesis models.

All Models

33 models
image-to-video

minimax/video-01

Minimax Video-01 is a text-to-video model offering high compression, strong text responsiveness, cinematic styles, and native HD output. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video

minimax/video-02

Hailuo 02 is an AI video generation model fine-tuned for ultra-clear 1080P output and handling complex physics-driven scenes. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video

minimax/hailuo-02/standard

Hailuo 02 is an AI video-generation model delivering 768P output with fast responsiveness and strong handling of complex physics scenes. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video

minimax/hailuo-02/pro

Minimax Hailuo 02 Pro produces ultra-clear 1080P AI videos with responsive, physics-aware rendering for complex physics-driven scenes. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

text-to-video

minimax/hailuo-02/t2v-standard

Hailuo 02 is a text-to-video model on MiniMax, fine-tuned to output responsive 768P videos even for complex physics-driven scenes. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video

minimax/hailuo-02/i2v-standard

Hailuo 02 by Hailuo AI is an image-to-video model delivering ultra-clear 768P video with responsive handling of physics-driven scenes. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

text-to-video

minimax/hailuo-02/t2v-pro

Hailuo 02 T2V-Pro is a text-to-video model fine-tuned for ultra-clear 1080P video and responsive handling of physics-driven scenes. Ready-to-use REST API, no coldstarts, best performance, affordable pricing.

image-to-video

minimax/hailuo-02/i2v-pro

MiniMax Hailuo 02 Pro is an image-to-video model tuned for clear 1080P output and responsive handling of complex physics-driven scenes. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

text-to-audio

minimax/speech-02-hd

Minimax Speech 02 HD is Minimax's high-definition text-to-speech model delivering clear HD voices. Pricing: $0.05 per 1,000 characters. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

text-to-audio

minimax/speech-02-turbo

Minimax Speech-02 Turbo is a high-definition text-to-speech model delivering natural voice output. Cost: $0.03 per 1000 characters. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

text-to-audio

minimax/voice-clone

Minimax Voice Clone creates high-quality voice clones from short reference clips, closely matching tone, accent, and speaking style. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

text-to-audio

minimax/voice-design

MiniMax Voice Design generates natural voices from textual descriptions, with no cloning required, and lets you set tone, accent, and personality. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

image-to-video

minimax/hailuo-02/fast

Hailuo 02 Fast is a MiniMax image-to-video model that creates high-quality 6s and 10s clips at 512p for creators and marketers. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

text-to-audio

minimax/speech-2.5-hd-preview

MiniMax Speech 2.5 HD Preview offers HD TTS with enhanced multilingual expressiveness, accurate voice cloning, and 40-language support. Ready-to-use REST API, best performance, no coldstarts, affordable pricing.

text-to-audio

minimax/speech-2.5-turbo-preview

Minimax Speech 2.5 Turbo Preview is an HD TTS model with multilingual support and accurate voice replication across 40 languages. Pricing: $0.04 per 1,000 characters. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

text-to-audio

minimax/music-v1.5

MiniMax Music v1.5 turns text prompts into high-quality, diverse music (Text-to-Audio) using advanced AI for versatile tracks. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

text-to-audio

minimax/music-01

Minimax Music-01 synthesizes accompaniment and vocals simultaneously to produce complete songs across diverse styles. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

text-to-video

minimax/hailuo-2.3/t2v-pro

MiniMax Hailuo 2.3 Pro is a text-to-video model delivering 1080p videos with 2.5x efficiency and 85% complex-instruction accuracy. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

text-to-video

minimax/hailuo-2.3/t2v-standard

Hailuo 2.3 is a text-to-video model creating physics-aware 768p videos with 2.5× efficiency and 85% complex instruction response rate. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video

minimax/hailuo-2.3/i2v-standard

MiniMax Hailuo 2.3 Standard is an image-to-video model producing physics-aware 768p output with a 2.5x efficiency improvement. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video

minimax/hailuo-2.3/i2v-pro

MiniMax Hailuo 2.3 Pro is an image-to-video model for ultra-clear 1080P output and physics-aware scenes with responsive rendering. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video

minimax/hailuo-2.3/fast

Hailuo 2.3 Fast by MiniMax generates high-quality 6s and 10s image-to-video clips at 768p, optimized for creators and marketers. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

text-to-audio

minimax/speech-2.6-hd

Minimax Speech 2.6 HD is an ultra-human, low-latency (under 250 ms) TTS model with voice cloning, text normalization, and support for 40+ languages. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

text-to-audio

minimax/speech-2.6-turbo

Minimax Speech 2.6 Turbo is a Text-to-Speech model offering ultra-human voice cloning, industry-leading text normalization, sub-250ms latency and 40+ language support. Pricing: $0.06 per 1000 characters. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video

minimax/hailuo-2.3/fast-pro

Hailuo 2.3 Fast Pro converts images into high-quality 6s 1080p videos, delivering fast, affordable results for creators and marketers. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

text-to-audio

minimax/music-02

Minimax Music-02 is a compact, fast, cost-effective MoE music generator (230B params, 10B active) for high-quality music production. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-image

minimax/image-01/image-to-image

MiniMax Image-01 image-to-image model transforms existing images using text prompts. Generate variations, apply style transfers, or modify images with character references. Supports multiple aspect ratios and custom dimensions. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

text-to-image

minimax/image-01/text-to-image

MiniMax Image-01 text-to-image model generates high-quality images from text descriptions. Create diverse visuals across multiple styles and scenarios with natural language prompts. Supports multiple aspect ratios and custom dimensions. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

text-to-audio

minimax/speech-2.8-turbo

MiniMax Speech 2.8 Turbo is a high-definition text-to-speech model with natural and expressive voice synthesis. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

text-to-audio

minimax/speech-2.8-hd

MiniMax Speech 2.8 HD is a high-definition text-to-speech model with natural and expressive voice synthesis for premium audio quality. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

text-to-audio

minimax/music-2.5

MiniMax Music 2.5 is a full-dimensional breakthrough in AI music generation with high-fidelity audio, humanized vocals, and precise creative control. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

text-to-audio

minimax/music-2.6

MiniMax Music 2.6 generates complete songs with vocals and instrumentals from text prompts and lyrics. Supports instrumental-only mode, auto lyrics generation, structure tags for song arrangement, and configurable audio quality. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

text-to-audio

minimax/music-cover

MiniMax Music Cover transforms existing songs into completely different styles — new arrangement, new vocal character, same melody. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.
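
Several speech entries above quote a price per 1,000 characters. As a rough budgeting aid, the arithmetic can be sketched in Python. This is a minimal sketch under stated assumptions: it takes the listed rates at face value and assumes cost scales linearly with the character count returned by `len()`. The catalog does not state rounding rules, minimum charges, or how characters are counted, and `estimate_cost` is a hypothetical helper, not part of any API.

```python
from decimal import Decimal

# Rates per 1,000 characters, copied from the catalog entries above.
# Models without a listed price are deliberately left out.
PRICE_PER_1K_CHARS = {
    "minimax/speech-02-hd": Decimal("0.05"),
    "minimax/speech-02-turbo": Decimal("0.03"),
    "minimax/speech-2.5-turbo-preview": Decimal("0.04"),
    "minimax/speech-2.6-turbo": Decimal("0.06"),
}


def estimate_cost(model: str, text: str) -> Decimal:
    """Estimate the USD cost of synthesizing `text` with `model`.

    Assumption: billing is linear in len(text); actual billing may differ.
    Raises KeyError for models with no listed price.
    """
    rate = PRICE_PER_1K_CHARS[model]
    return rate * len(text) / 1000


if __name__ == "__main__":
    script = "a" * 2000  # stand-in for a 2,000-character script
    for model in PRICE_PER_1K_CHARS:
        print(model, estimate_cost(model, script))
```

`Decimal` is used instead of `float` so that small per-character amounts add up without binary rounding noise.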

Minimax Hailuo Models

Series Advantages

  1. 1080p native clarity — not upscaled; cleaner detail and steadier temporal coherence.
  2. Strong instruction following — reliable execution of camera moves, lighting, and motion cues.
  3. Physics realism — debris, cloth, water, collisions, and handheld shake feel believable.
  4. Clip-first workflow — 6s / 10s lengths enable fast iteration and easy sequencing.
  5. Two creation modes — pure Text-to-Video (T2V) and Image-to-Video (I2V) across the lineup.

Model Lineup

  1. hailuo-2.3/t2v-standard

Text-to-Video (standard); upgraded from Hailuo 02 with smoother motion, cleaner faces, and more stable scene dynamics.

  2. hailuo-2.3/i2v-standard

Image-to-Video (standard); refined transition flow and stronger visual consistency compared to Hailuo 02.

  3. hailuo-2.3/t2v-pro

Text-to-Video (pro); higher fidelity and motion realism than standard, with richer detail and better expression control.

  4. hailuo-2.3/i2v-pro

Image-to-Video (pro); enhanced texture depth and temporal coherence for premium production needs.

  5. hailuo-2.3/fast

Fast mode (I2V); optimized for quick generation and batch testing, with the same model core and faster output.

  6. hailuo-02/standard

Unified endpoint for T2V + I2V; clean visuals and stable timing for everyday production.

  7. hailuo-02/t2v-standard

Text-to-Video (standard); dependable camera motion and physics for scripts, shorts, and explainers.

  8. hailuo-02/i2v-standard

Image-to-Video (standard); lock composition/style with a start image (optional end image) for smooth guided transitions.

  9. hailuo-02/t2v-pro

Text-to-Video (pro); stronger physics, cleaner temporal flow, and higher fidelity for hero shots.

  10. hailuo-02/i2v-pro

Image-to-Video (pro); richer micro-detail and color depth, ideal for animating key visuals and poster-grade stills.

  11. hailuo-02/fast

Fast iteration (T2V/I2V); built for rapid drafts, batch A/B testing, and high-throughput pipelines.

  12. minimax/speech-2.8-turbo

Real-time synthesis (T2A); optimized for ultra-low latency and cost-efficiency, built for conversational AI, live streaming, and instant interaction loops.

  13. minimax/speech-2.8-hd

Studio-grade fidelity (T2A); superior dynamic range and emotional nuance, ideal for audiobooks, cinematic narration, and professional content creation.



Quick guidance: Standard covers most day-to-day needs; choose Pro for hero-quality shots; use Fast for speed and volume.
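
The quick guidance can be expressed as a small lookup over the Hailuo 2.3 lineup. The sketch below is illustrative only: the model IDs are copied from the catalog above, while `pick_model` and its `mode`/`tier` arguments are hypothetical names introduced here. It reflects the listing as shown, where the 2.3 Fast variant is described as image-to-video, so a fast text-to-video request is rejected rather than guessed.

```python
# Model IDs copied from the catalog above; the lookup itself is an
# illustrative helper, not part of any official SDK.
HAILUO_23 = {
    ("t2v", "standard"): "minimax/hailuo-2.3/t2v-standard",
    ("t2v", "pro"): "minimax/hailuo-2.3/t2v-pro",
    ("i2v", "standard"): "minimax/hailuo-2.3/i2v-standard",
    ("i2v", "pro"): "minimax/hailuo-2.3/i2v-pro",
    ("i2v", "fast"): "minimax/hailuo-2.3/fast",
}


def pick_model(mode: str, tier: str = "standard") -> str:
    """Return a Hailuo 2.3 model ID for a creation mode and quality tier.

    mode: "t2v" (text-to-video) or "i2v" (image-to-video)
    tier: "standard" for day-to-day work, "pro" for hero-quality shots,
          "fast" for speed and volume
    """
    key = (mode.lower(), tier.lower())
    if key not in HAILUO_23:
        raise ValueError(f"no Hailuo 2.3 model listed for mode={mode!r}, tier={tier!r}")
    return HAILUO_23[key]


if __name__ == "__main__":
    print(pick_model("t2v"))          # default tier is standard
    print(pick_model("i2v", "pro"))   # hero-quality image-to-video
    print(pick_model("i2v", "fast"))  # quick drafts and batch tests
```

Defaulting `tier` to `"standard"` mirrors the guidance that Standard covers most day-to-day needs.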