text-to-audio
Your request will cost $0.50 per run.
For $10 you can run this model approximately 20 times.
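As a quick check on the pricing arithmetic, here is a minimal Python sketch that computes how many whole runs a budget covers. It assumes a flat per-run price with no volume discounts or extra fees, which the page does not confirm; the function name `runs_for_budget` is illustrative only.

```python
from decimal import Decimal

# Per-run price taken from the pricing line above.
COST_PER_RUN = Decimal("0.50")


def runs_for_budget(budget: str) -> int:
    """Return the number of whole runs a budget covers.

    Assumes a flat per-run price (an assumption, not a documented billing rule).
    Decimal avoids floating-point rounding surprises with currency amounts.
    """
    return int(Decimal(budget) // COST_PER_RUN)


print(runs_for_budget("10"))  # 20
```

Passing the budget as a string keeps the amount exact; a float such as `0.1` would carry binary rounding error into the division.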
MiniMax Voice Clone is a state-of-the-art voice synthesis model developed by MiniMax. It enables high-quality voice cloning from a short reference clip, producing speech that closely mimics the tone, accent, and personality of the original speaker.
High-Fidelity Voice Cloning
Generates speech that is perceptually close to the source speaker with natural prosody and pronunciation.
Few-Second Voice Adaptation
Requires only a few seconds of reference audio to accurately replicate a voice.
Emotion and Tone Control
Allows fine-grained control over speaking style and emotion, useful for storytelling, games, and character dialogue.
Multilingual Output
Supports voice cloning across different languages and smooth code-switching.
Low-Latency Inference
Optimized for real-time use cases, including live interactions and dialogue generation.
MiniMax Voice Clone uses a neural TTS pipeline with robust speaker embedding and prosody modeling. It combines clarity, control, and speed, offering production-ready results in diverse environments.
To be saved permanently, your cloned voice ID must be used at least once with one of the voice models on our platform.
Otherwise, it is stored for only 7 days. After that, it is deleted and the voice ID can no longer be called.
For easier reuse later, make sure to use your voice ID once with one of those voice models after creating it.
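The retention rule above can be sketched in a few lines of Python. This is an illustration, not platform code: it assumes the 7-day window starts when the voice ID is created, and the names `voice_id_expiry` and `is_voice_id_callable` are hypothetical helpers, not part of any MiniMax API.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Retention window stated on the page for voice IDs that were never used.
RETENTION = timedelta(days=7)


def voice_id_expiry(created_at: datetime, used_with_voice_model: bool) -> Optional[datetime]:
    """Return when an unused voice ID would be deleted, or None if it is permanent.

    Assumption: the 7-day clock starts at creation time.
    """
    if used_with_voice_model:
        return None  # used at least once, so it is kept permanently
    return created_at + RETENTION


def is_voice_id_callable(created_at: datetime, used_with_voice_model: bool, now: datetime) -> bool:
    """Return True if the voice ID should still be usable at time `now`."""
    expiry = voice_id_expiry(created_at, used_with_voice_model)
    return expiry is None or now < expiry


created = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(voice_id_expiry(created, used_with_voice_model=False))
```

A client could store the creation time alongside each voice ID and run a check like this before reusing it, rather than discovering a deleted ID at request time.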