text-to-audio

minimax/voice-clone

MiniMax Voice Clone is a state-of-the-art voice synthesis model developed by MiniMax. It enables high-quality voice cloning from a short reference clip, producing speech that closely mimics the tone, accent, and personality of the original speaker.

Reference audio: Drag and drop a file, or click to upload.

Custom voice ID: A user-defined ID. It must be at least 8 characters long, start with a letter, and include both letters and numbers (e.g., WaveSpeed-20250717-1050). Duplicate voice IDs result in an error. The ID can be used with the following models: minimax/speech-02-hd, minimax/speech-02-turbo.

Noise reduction: Enables noise reduction. Default is false (no noise reduction).

Volume normalization: Enables volume normalization. Default is false.
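The voice ID rules above can be checked locally before a request is submitted. The sketch below is illustrative only: the function names and the payload keys (`custom_voice_id`, `audio`, `noise_reduction`, `volume_normalization`) are hypothetical and are not taken from the service's API reference. It assumes "letters" and "numbers" mean ASCII letters and digits, and it does not restrict other characters, since the example ID contains hyphens. Duplicate IDs cannot be detected locally; that error can only come from the service.

```python
import re

MIN_LENGTH = 8  # "at least 8 characters long"


def voice_id_problems(voice_id: str) -> list[str]:
    """Return the stated rules that voice_id violates (empty list if none)."""
    problems = []
    if len(voice_id) < MIN_LENGTH:
        problems.append("must be at least 8 characters long")
    # Assumption: "letter" means an ASCII letter.
    if not re.match(r"[A-Za-z]", voice_id):
        problems.append("must start with a letter")
    # A leading letter already satisfies "includes letters",
    # so only the digit requirement needs a separate check.
    if not re.search(r"[0-9]", voice_id):
        problems.append("must include at least one number")
    return problems


def build_clone_request(
    voice_id: str,
    audio_url: str,
    noise_reduction: bool = False,       # default stated above: false
    volume_normalization: bool = False,  # default stated above: false
) -> dict:
    """Build a request body. All key names here are hypothetical."""
    problems = voice_id_problems(voice_id)
    if problems:
        raise ValueError(f"invalid voice ID {voice_id!r}: " + "; ".join(problems))
    return {
        "custom_voice_id": voice_id,  # hypothetical key name
        "audio": audio_url,           # hypothetical key name
        "noise_reduction": noise_reduction,
        "volume_normalization": volume_normalization,
    }
```

Calling `voice_id_problems("WaveSpeed-20250717-1050")` returns an empty list, while an ID such as `"abcdefgh"` is reported as missing a number.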

Your request will cost $0.50 per run.

For $10 you can run this model approximately 20 times.
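The estimate above is the budget divided by the per-run price, rounded down. A minimal sketch, assuming the quoted price applies uniformly to every run (the function name is illustrative):

```python
from decimal import Decimal

COST_PER_RUN = Decimal("0.50")  # per-run price quoted above, in USD


def runs_for_budget(budget_usd: str) -> int:
    """Whole number of runs a budget covers at the quoted per-run price."""
    # Decimal avoids binary floating-point rounding on currency amounts.
    return int(Decimal(budget_usd) // COST_PER_RUN)


print(runs_for_budget("10"))  # 20
```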

README

Key Features

  • High-Fidelity Voice Cloning
    Generates speech that is perceptually close to the source speaker with natural prosody and pronunciation.

  • Few-Second Voice Adaptation
    Requires only a few seconds of reference audio to accurately replicate a voice.

  • Emotion and Tone Control
    Allows fine-tuned control over speaking style and emotion, useful for storytelling, games, and character dialogue.

  • Multilingual Output
    Supports voice cloning across different languages, including smooth code-switching (changing language mid-utterance).

  • Low-Latency Inference
    Optimized for real-time use cases, including live interactions and dialogue generation.

Use Cases

  • AI voiceovers for content creators and influencers
  • Personalized digital assistants and chatbots
  • Audiobook narration in a specific voice
  • Interactive gaming and character voices
  • Assistive speech for individuals with voice loss

Model Overview

MiniMax Voice Clone uses a neural TTS pipeline with robust speaker embedding and prosody modeling. It combines clarity, control, and speed, offering production-ready results in diverse environments.