text-to-audio
Each run costs $0.50.
For $10, you can run this model approximately 20 times.
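The pricing arithmetic above can be sketched as a small budget helper. This is only an illustration: the function names are hypothetical, and the flat $0.50 per-run price is taken from the note above and assumed not to vary with input length.

```python
from decimal import Decimal

# Assumed flat price per run, taken from the pricing note above.
COST_PER_RUN = Decimal("0.50")


def runs_for_budget(budget_usd: str) -> int:
    """Return how many whole runs fit in the given budget (hypothetical helper)."""
    return int(Decimal(budget_usd) // COST_PER_RUN)


def cost_for_runs(runs: int) -> Decimal:
    """Return the total cost in USD for a number of runs (hypothetical helper)."""
    return COST_PER_RUN * runs


if __name__ == "__main__":
    print(runs_for_budget("10"))  # 20
    print(cost_for_runs(20))      # 10.00
```

Decimal is used instead of float so that currency amounts such as 0.50 are represented exactly.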
MiniMax Voice Clone is a state-of-the-art voice synthesis model developed by MiniMax. It enables high-quality voice cloning from a short reference clip, producing speech that closely mimics the tone, accent, and personality of the original speaker.
High-Fidelity Voice Cloning
Generates speech that is perceptually close to the source speaker with natural prosody and pronunciation.
Few-Second Voice Adaptation
Requires only a few seconds of reference audio to accurately replicate a voice.
Emotion and Tone Control
Allows fine-grained control over speaking style and emotion, useful for storytelling, games, and character dialogue.
Multilingual Output
Supports voice cloning across languages and smooth code-switching between them.
Low-Latency Inference
Optimized for real-time use cases, including live interactions and dialogue generation.
MiniMax Voice Clone uses a neural TTS pipeline with robust speaker embedding and prosody modeling. It combines clarity, control, and speed, offering production-ready results in diverse environments.