minimax/voice-clone

MiniMax Voice Clone is a state-of-the-art voice synthesis model developed by MiniMax. It enables high-quality voice cloning from a short reference clip, producing speech that closely mimics the tone, accent, and personality of the original speaker.

Custom voice ID: Must be at least 8 characters long, start with a letter, and include both letters and numbers (e.g., WaveSpeed-20250717-1050). Duplicate voice IDs result in an error. The ID can be used with the following models: minimax/speech-02-hd, minimax/speech-02-turbo.
Noise reduction: Specifies whether to enable noise reduction. Defaults to false (no noise reduction).
Volume normalization: Specifies whether to enable volume normalization. Defaults to false.
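The voice ID rules above can be checked locally before submitting a request. The following is a minimal sketch, not the platform's own validation: it assumes "letters" means ASCII letters, and the request field names (`custom_voice_id`, `need_noise_reduction`, `need_volume_normalization`) are hypothetical placeholders, so check the API reference for the real ones. Duplicate IDs cannot be detected locally; the service reports those as an error.

```python
import re

MIN_LENGTH = 8  # stated rule: at least 8 characters


def validate_voice_id(voice_id: str) -> list[str]:
    """Return the stated rules that voice_id violates (empty list = passes)."""
    problems = []
    if len(voice_id) < MIN_LENGTH:
        problems.append("must be at least 8 characters long")
    # Assumption: "letter" means an ASCII letter.
    if not re.match(r"[A-Za-z]", voice_id):
        problems.append("must start with a letter")
    if not re.search(r"[A-Za-z]", voice_id):
        problems.append("must include at least one letter")
    if not re.search(r"[0-9]", voice_id):
        problems.append("must include at least one number")
    return problems


def build_clone_params(
    voice_id: str,
    noise_reduction: bool = False,       # documented default: false
    volume_normalization: bool = False,  # documented default: false
) -> dict:
    """Assemble request parameters; the key names are hypothetical."""
    problems = validate_voice_id(voice_id)
    if problems:
        raise ValueError(f"invalid voice ID {voice_id!r}: " + "; ".join(problems))
    return {
        "custom_voice_id": voice_id,
        "need_noise_reduction": noise_reduction,
        "need_volume_normalization": volume_normalization,
    }


if __name__ == "__main__":
    print(validate_voice_id("WaveSpeed-20250717-1050"))
    print(validate_voice_id("abc1"))
```

The example ID from the hint contains hyphens, so the sketch deliberately does not restrict which other characters may appear.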

Each run costs $0.50.

At that rate, $10 covers approximately 20 runs.
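The pricing arithmetic can be sketched as follows, assuming the flat $0.50 per-run price quoted here and no other fees or discounts. The function names are illustrative only. `Decimal` is used so that currency amounts are not subject to floating-point rounding.

```python
from decimal import Decimal

PRICE_PER_RUN = Decimal("0.50")  # per-run price quoted on this page


def runs_for_budget(budget: str) -> int:
    """Number of whole runs a budget (in USD) covers at the flat price."""
    return int(Decimal(budget) // PRICE_PER_RUN)


def cost_of_runs(runs: int) -> Decimal:
    """Total cost in USD of a given number of runs."""
    return PRICE_PER_RUN * runs


if __name__ == "__main__":
    print(runs_for_budget("10"))
    print(cost_of_runs(20))
```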

Key Features

  • High-Fidelity Voice Cloning
    Generates speech that is perceptually close to the source speaker with natural prosody and pronunciation.

  • Few-Second Voice Adaptation
    Requires only a few seconds of reference audio to accurately replicate a voice.

  • Emotion and Tone Control
    Allows fine-tuned control over speaking style and emotion, useful for storytelling, games, and character dialogue.

  • Multilingual Output
    Supports voice cloning across different languages and smooth code-switching.

  • Low-Latency Inference
    Optimized for real-time use cases, including live interactions and dialogue generation.

Use Cases

  • AI voiceovers for content creators and influencers
  • Personalized digital assistants and chatbots
  • Audiobook narration in a specific voice
  • Interactive gaming and character voices
  • Assistive speech for individuals with voice loss

Model Overview

MiniMax Voice Clone uses a neural TTS pipeline with robust speaker embedding and prosody modeling. It combines clarity, control, and speed, offering production-ready results in diverse environments.

Note

Your cloned voice ID must be used at least once with one of the voice models on our platform, such as minimax/speech-02-hd or minimax/speech-02-turbo, to be saved permanently.

Otherwise, it is stored for only 7 days, after which it is deleted and the voice ID can no longer be called.

To keep the voice ID available for later reuse, use it once with one of these models after creating it.
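The retention rule can be tracked on the client side with a small helper. This is a sketch under stated assumptions: it assumes the 7-day window starts when the voice ID is created and that the ID stops working exactly at the deadline; the platform's actual cutoff may differ. The function names are illustrative and not part of any API.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

RETENTION = timedelta(days=7)  # stated storage period for unused voice IDs


def expires_at(created_at: datetime, used_with_voice_model: bool) -> Optional[datetime]:
    """Return the assumed deletion time, or None if the ID is kept permanently."""
    if used_with_voice_model:
        return None  # used at least once with a voice model: saved permanently
    return created_at + RETENTION


def is_still_callable(created_at: datetime, used_with_voice_model: bool, now: datetime) -> bool:
    """True if the voice ID should still be callable at time `now`."""
    deadline = expires_at(created_at, used_with_voice_model)
    return deadline is None or now < deadline


if __name__ == "__main__":
    created = datetime(2025, 7, 17, 10, 50, tzinfo=timezone.utc)
    print(expires_at(created, used_with_voice_model=False))
    print(is_still_callable(created, False, created + timedelta(days=8)))
```

A practical consequence is that a single synthesis call with the new ID, made shortly after cloning, removes the need to track the deadline at all.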