MiniMax Hailuo 2.3 Pro is a text-to-video model delivering 1080p videos with 2.5x efficiency and 85% complex-instruction accuracy. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

$0.49 per run · ~20 runs per $10
Camera: A slow, steady wide shot (as if gently floating) that moves through a dense, lush, sun-dappled forest. The camera pauses slightly as it reveals a small, friendly forest spirit. Effect: Tiny, glowing dust motes (tree spirits / Kodama) slowly drift and sparkle through the shafts of sunlight. Leaves on the trees gently sway in a soft, visible breeze. A small, forest spirit (like a Kodama or Totoro-esque creature) blinks slowly and turns its head, then nods gently to the camera. Sounds/Voices: Soft, ambient forest sounds: the gentle chirping of unseen birds, the distant trickle of water, and the rustling of leaves in the breeze. A delicate, whimsical flute melody plays softly, accompanied by a faint, magical "tinkle" when the spirit nods. Mood: Whimsical, peaceful, magical, enchanting, and serene. A sense of wonder and gentle calm. Lighting: Warm, golden, dappled sunlight filters through the dense tree canopy, creating soft, glowing patches on the forest floor and highlighting the lush greenery. Subtle lens flares appear in the brightest areas.
Camera: A high-angle helicopter/drone shot overlooking a coastal city, shaking violently. The camera pans from the panicking crowds in the streets to the horizon, revealing the approaching wave. Effect: A colossal tsunami wave, as wide as the city itself and hundreds of feet tall, fills the entire horizon. It moves with terrifying speed, violently impacting the outermost buildings, sending water, cars, and debris exploding hundreds of feet into the air. Sounds/Voices: A deafening, low-frequency "ROAR" of the ocean. The piercing sound of city-wide emergency sirens. The massive, crunching, and crashing sounds of thousands of buildings breaking and collapsing. Mood: Utterly terrifying, apocalyptic, unstoppable, and catastrophic. Lighting: Sickly, grey, overcast daylight. The water is a dark, murky blue-green. Visibility is low due to the mist and spray kicked up by the wave.
Camera: A playful 360-degree orbit shot (medium shot) around three dancers in a bright, candy-themed, pastel-colored set. They are smiling and laughing. Effect: As they perform their signature "heart-hands" point dance (a key move), cartoon-style sparkles and small, colorful hearts pop and animate around their hands. Sounds/Voices: Upbeat, bubbly, fast-paced K-pop or J-pop music. A cute "chime" or "boing" sound effect when the sparkles appear. Audible, light giggles from the members. Mood: Joyful, energetic, sweet, playful, and infectious. Lighting: Extremely bright, high-key, shadowless studio lighting. Soft pink, lavender, and mint-green colors flood the set. Warm, glowing lens flares.
Camera: First-person perspective (POV), the beam of a flashlight is the only viewpoint. The camera moves tensely and slowly down a pitch-black, decaying hospital corridor. The camera suddenly jerks to the right. Effect: The flashlight beam only illuminates a few feet ahead, catching dust motes in the air. As the camera jerks right, the beam briefly illuminates a pale face that vanishes in less than half a second. Voices/Sounds: Only the character's shaky, shallow breathing and the distant echo of a single water drop. A short, sharp violin screech (stinger) hits the moment the face appears. Mood: Extreme tension, claustrophobic, jump-scare, deep unease. Lighting: Total darkness, punctuated only by the narrow, cold-white beam of the unstable handheld flashlight.
A detective stands on a rainy street corner, looking down at a mysterious brass compass in his palm. The needle is spinning wildly. Camera pulls back from a close-up of the compass to reveal the detective's puzzled face. Film noir, neon reflections on wet streets, heavy shadows.
Hailuo 2.3 Pro is the premium text-to-video model from MiniMax, engineered for creators who demand cinematic realism, dynamic motion, and superior visual coherence. It transforms text prompts into richly detailed 5-second 1080p videos, merging professional-grade quality with cutting-edge physical simulation.
| Duration | Resolution | Cost per Job |
|---|---|---|
| 5 seconds | 1080p | $0.49 |
Grab a WaveSpeedAI API key, then call POST https://api.wavespeed.ai/api/v3/minimax/hailuo-2.3/t2v-pro with your input as JSON. The endpoint returns a prediction id; poll the prediction endpoint until status flips to completed, then read the output URL from data.outputs[0]. Examples for Hailuo 2.3 T2v Pro below.
# Submit the prediction
curl -X POST "https://api.wavespeed.ai/api/v3/minimax/hailuo-2.3/t2v-pro" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY" \
  -d '{
    "prompt": "A cinematic shot of a city at sunset, soft golden light",
    "enable_prompt_expansion": true
  }'

# Response includes a prediction id. Poll for the result,
# replacing {request_id} with that id:
curl -X GET "https://api.wavespeed.ai/api/v3/predictions/{request_id}/result" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY"
# When status is "completed", read the output from data.outputs[0].

// npm install wavespeed
const WaveSpeed = require('wavespeed');

// await is only valid inside an async function in a CommonJS script
async function main() {
  const client = new WaveSpeed(); // reads WAVESPEED_API_KEY from env
  const result = await client.run("minimax/hailuo-2.3/t2v-pro", {
    "prompt": "A cinematic shot of a city at sunset, soft golden light",
    "enable_prompt_expansion": true
  });
  console.log(result.outputs[0]); // → URL of the generated output
}

main().catch(console.error);

# pip install wavespeed
import wavespeed

output = wavespeed.run(
    "minimax/hailuo-2.3/t2v-pro",
    {
        "prompt": "A cinematic shot of a city at sunset, soft golden light",
        "enable_prompt_expansion": True
    }
)
print(output["outputs"][0])  # → URL of the generated output

Hailuo 2.3 T2v Pro is a MiniMax model for video generation, exposed as a REST API on WaveSpeedAI. You can call it programmatically or try it from the playground above.
POST your input parameters to the model's REST endpoint (shown in the API tab of this playground) with your WaveSpeedAI API key in the Authorization header. Submission returns a prediction ID; poll the prediction endpoint until status flips to "completed", then read the output URL from the result. The playground generates a ready-to-paste code sample in Python, JavaScript, or cURL for whatever inputs you've set. Full request/response shape is documented at https://wavespeed.ai/docs/docs-api/minimax/minimax-hailuo-2.3-t2v-pro.
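The submit-then-poll flow described above can be sketched in plain Python using only the standard library. This is a minimal illustration, not the official client: the endpoint paths, the "completed" status, and `data.outputs[0]` come from this page, while the location of the prediction id (`data.id`), the location of the status (`data.status`), the "failed" status name, and the helper names `submit`, `wait_for_output`, and `generate` are assumptions made for the example. Check the API reference for the actual response shape.

```python
import json
import os
import time
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"
MODEL_PATH = "minimax/hailuo-2.3/t2v-pro"


def _call(method, url, api_key, body=None):
    """Send one authenticated JSON request and return the parsed response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(url, data=data, method=method)
    req.add_header("Authorization", f"Bearer {api_key}")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def submit(prompt, api_key):
    """Submit a prediction. ASSUMPTION: the id is returned at data.id."""
    body = {"prompt": prompt, "enable_prompt_expansion": True}
    resp = _call("POST", f"{API_BASE}/{MODEL_PATH}", api_key, body)
    return resp["data"]["id"]


def wait_for_output(fetch_result, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll fetch_result() until status is "completed"; return data.outputs[0]."""
    deadline = time.monotonic() + timeout
    while True:
        data = fetch_result()["data"]
        status = data.get("status")  # ASSUMPTION: status lives at data.status
        if status == "completed":
            return data["outputs"][0]
        if status == "failed":  # ASSUMPTION: name of the failure status
            raise RuntimeError(f"prediction failed: {data.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError("prediction did not complete in time")
        sleep(interval)


def generate(prompt):
    """Submit a prompt and block until the output URL is available."""
    api_key = os.environ["WAVESPEED_API_KEY"]
    request_id = submit(prompt, api_key)
    url = f"{API_BASE}/predictions/{request_id}/result"
    return wait_for_output(lambda: _call("GET", url, api_key))
```

With `WAVESPEED_API_KEY` set, `generate("A cinematic shot of a city at sunset")` would return the output URL under these assumptions. Passing the fetch step in as a callable keeps the polling loop testable without network access.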
Hailuo 2.3 T2v Pro starts at $0.49 per run. That figure is the base price: the final charge scales with the parameters you set in the form (such as output size, length, count, or references, where the model exposes them), so a higher-quality or larger output costs more than a minimal one. The exact cost for your current input is shown live next to the Generate button before you submit, and the actual per-call charge is recorded on the prediction afterwards.
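As a quick check on the arithmetic behind the "about 20 runs per $10" figure, the sketch below works out how many whole runs a budget covers at the $0.49 base price. It assumes a flat per-run price with no parameter-based scaling; the function names are illustrative only.

```python
from decimal import Decimal

PRICE_PER_RUN = Decimal("0.49")  # base price quoted on this page


def runs_for_budget(budget, price=PRICE_PER_RUN):
    """Whole runs a budget covers, assuming a flat per-run price."""
    return int(Decimal(str(budget)) // price)


def cost_of_runs(runs, price=PRICE_PER_RUN):
    """Total cost of a number of runs at a flat per-run price."""
    return price * runs


print(runs_for_budget(10))  # whole runs covered by $10
print(cost_of_runs(20))     # total cost of 20 runs
```

`Decimal` is used instead of floats so that currency amounts stay exact.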
Key inputs: `prompt`, `enable_prompt_expansion`. The full JSON schema (types, defaults, allowed values) is rendered above the Generate button and mirrored in the API reference at https://wavespeed.ai/docs/docs-api/minimax/minimax-hailuo-2.3-t2v-pro.
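A small helper can catch malformed input before a request is sent. The sketch below assumes `prompt` is a non-empty string and `enable_prompt_expansion` is a boolean, which matches the examples on this page; the default of `True` simply mirrors those examples and is not a documented default. `build_payload` is a hypothetical name, and the authoritative types and allowed values are in the JSON schema.

```python
import json


def build_payload(prompt, enable_prompt_expansion=True):
    """Build the request body from the two key inputs, with basic type checks."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    if not isinstance(enable_prompt_expansion, bool):
        raise TypeError("enable_prompt_expansion must be a boolean")
    return {
        "prompt": prompt.strip(),
        "enable_prompt_expansion": enable_prompt_expansion,
    }


# Serialize the body exactly as it would be sent in the POST request.
body = json.dumps(build_payload("A cinematic shot of a city at sunset"))
print(body)
```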
Average end-to-end generation time on WaveSpeedAI is around 166 seconds per request, measured across recent runs. Queue time scales with global demand; live status is visible in the prediction record.
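The average latency is useful for sizing client-side timeouts. The sketch below derives a timeout and poll counts from the 166-second figure; the 5-second poll interval and the 3x safety factor are arbitrary assumptions for illustration, not recommendations from WaveSpeedAI.

```python
import math

AVG_GENERATION_SECONDS = 166  # average quoted on this page


def polling_budget(avg_seconds=AVG_GENERATION_SECONDS, interval=5.0, safety_factor=3.0):
    """Return (timeout_seconds, typical_polls, max_polls) for a polling client."""
    timeout = avg_seconds * safety_factor            # give up after this long
    typical_polls = math.ceil(avg_seconds / interval)  # polls for an average run
    max_polls = math.ceil(timeout / interval)          # polls before the timeout
    return timeout, typical_polls, max_polls


timeout, typical, maximum = polling_budget()
print(timeout, typical, maximum)
```

Because queue time varies with demand, a generous safety factor is usually safer than a timeout set close to the average.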
Commercial usage rights depend on the model's license, set by its provider (MiniMax). The license summary appears on the model card above; see WaveSpeedAI's Terms of Service for platform-level conditions.