MiniMax Voice Design
MiniMax Voice Design is a state-of-the-art voice synthesis model developed by MiniMax. Instead of cloning a voice from a reference audio, it generates high-quality voices directly from your textual description, allowing you to create speech with the desired tone, accent, and personality.
Features
Key Features
- High-Fidelity Voice Generation: Produces speech that matches your description with natural prosody and pronunciation.
- Flexible Voice Design: Create a wide range of voices by simply describing the desired characteristics; no reference audio is required.
- Emotion and Tone Control: Fine-tune speaking style and emotion for storytelling, games, and character dialogue.
- Multilingual Output: Supports voice design across different languages and smooth code-switching.
- Low-Latency Inference: Optimized for real-time use cases, including live interactions and dialogue generation.
Use Cases
- AI voiceovers for content creators and influencers
- Personalized digital assistants and chatbots
- Audiobook narration in a specific style
- Interactive gaming and character voices
- Assistive speech for individuals with voice loss
Model Overview
MiniMax Voice Design uses a neural TTS pipeline with robust speaker and prosody modeling. By leveraging your textual description, it offers clarity, control, and speed, delivering production-ready results in diverse environments.
Note
Your custom voice ID must be used at least once with one of the voice models on our platform to be saved permanently. Otherwise, it is stored for only 7 days; after that, it is deleted and the voice ID can no longer be called.
For easier reuse later, use your voice ID once with one of those voice models after creating it.
Authentication
For authentication details, please refer to the Authentication Guide.
API Endpoints
Submit Task & Query Result
# Submit the task (custom_voice_id is listed as required in the parameter table below)
curl --location --request POST "https://api.wavespeed.ai/api/v3/minimax/voice-design" \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
  --data-raw '{
    "prompt": "Excited and enthusiastic male product reviewer (e.g., tech vlogger), fast-paced, high energy, and persuasive.",
    "custom_voice_id": "WaveSpeed001",
    "text": "Hello! Welcome to Wavespeed! This is a preview of your cloned voice. I hope you enjoy it"
  }'

# Get the result (set requestId to the data.id value returned by the submit call)
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
  --header "Authorization: Bearer ${WAVESPEED_API_KEY}"
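
The same submit-then-query flow can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names (`build_submit_request`, `build_result_request`, `send`, `wait_for_result`), the polling interval, and the attempt limit are all assumptions made for illustration. The endpoints and headers are the ones shown in the curl commands, and `custom_voice_id` is included because the parameter table lists it as required.

```python
import json
import time
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"


def build_submit_request(api_key, prompt, text, custom_voice_id):
    """Build (but do not send) the POST request that submits a task."""
    body = json.dumps(
        {"prompt": prompt, "text": text, "custom_voice_id": custom_voice_id}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/minimax/voice-design",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


def build_result_request(api_key, request_id):
    """Build (but do not send) the GET request that queries a task result."""
    return urllib.request.Request(
        f"{API_BASE}/predictions/{request_id}/result",
        method="GET",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def send(request):
    """Send a request and decode the JSON body (performs network I/O)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_result(fetch, interval_s=1.0, max_attempts=60):
    """Call fetch() until data.status is "completed" or "failed".

    fetch is any zero-argument callable returning the decoded JSON response.
    The interval and attempt limit are illustrative defaults, not documented values.
    """
    for _ in range(max_attempts):
        data = fetch()["data"]
        if data["status"] == "completed":
            return data["outputs"]
        if data["status"] == "failed":
            raise RuntimeError(data.get("error") or "task failed")
        time.sleep(interval_s)
    raise TimeoutError("task did not finish within the polling budget")
```

A caller would typically send the submit request, read `data.id` from the reply, and then pass `lambda: send(build_result_request(api_key, request_id))` to `wait_for_result`. Taking `fetch` as a parameter keeps the polling logic independent of the HTTP layer, so it can be exercised with canned responses.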
Parameters
Task Submission Parameters
Request Parameters
Parameter | Type | Required | Default | Range | Description |
---|---|---|---|---|---|
prompt | string | Yes | - | - | Voice description. |
custom_voice_id | string | Yes | - | - | Custom user-defined ID. Minimum 8 characters; must include letters and numbers and start with a letter (e.g., WaveSpeed001). A duplicate voice ID returns an error. |
text | string | Yes | Hello! Welcome to Wavespeed! This is a preview of your cloned voice. I hope you enjoy it | - | Text for audio preview. Limited to 500 characters. |
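
The constraints in the table can be checked on the client before a request is sent. The sketch below is illustrative only: `validate_request` is a hypothetical helper, and treating "letters" as ASCII letters is an assumption, since the table does not say which character set is accepted. Duplicate IDs cannot be detected locally; the API reports those.

```python
def validate_request(prompt, custom_voice_id, text):
    """Return a list of problems; an empty list means the inputs look valid."""
    problems = []
    if not prompt:
        problems.append("prompt is required")
    if len(custom_voice_id) < 8:
        problems.append("custom_voice_id must be at least 8 characters")
    first = custom_voice_id[:1]
    # Assumption: "letter" means an ASCII letter.
    if not (first.isascii() and first.isalpha()):
        problems.append("custom_voice_id must start with a letter")
    has_letter = any(c.isascii() and c.isalpha() for c in custom_voice_id)
    has_digit = any(c.isascii() and c.isdigit() for c in custom_voice_id)
    if not (has_letter and has_digit):
        problems.append("custom_voice_id must include letters and numbers")
    if len(text) > 500:
        problems.append("text must be at most 500 characters")
    return problems
```

Returning every problem at once, rather than raising on the first, lets a form or CLI show all issues in a single pass.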
Response Parameters
Parameter | Type | Description |
---|---|---|
code | integer | HTTP status code (e.g., 200 for success) |
message | string | Status message (e.g., “success”) |
data.id | string | Unique identifier for the prediction (the task ID) |
data.model | string | Model ID used for the prediction |
data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
data.urls | object | Object containing related API endpoints |
data.urls.get | string | URL to retrieve the prediction result |
data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
data.status | string | Status of the task: created, processing, completed, or failed |
data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
data.error | string | Error message (empty if no error occurred) |
data.timings | object | Object containing timing details |
data.timings.inference | integer | Inference time in milliseconds |
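
To show how these fields fit together, here is a hedged sketch that maps a decoded JSON response onto a small typed record. The `Prediction` class and `parse_prediction` function are hypothetical names, not part of any SDK, and the code assumes the fields have the types listed in the table. Optional fields are read defensively because the table does not say which ones may be absent.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Prediction:
    id: str
    model: str
    status: str
    outputs: List[str]
    nsfw_flags: List[bool]
    result_url: Optional[str]
    created_at: Optional[datetime]
    error: str
    inference_ms: Optional[int]

    @property
    def is_finished(self) -> bool:
        # "created" and "processing" mean the task is still pending.
        return self.status in ("completed", "failed")


def parse_prediction(response: dict) -> Prediction:
    """Convert a decoded response body into a Prediction record."""
    data = response["data"]
    created = data.get("created_at")
    return Prediction(
        id=data["id"],
        model=data.get("model", ""),
        status=data["status"],
        outputs=list(data.get("outputs") or []),
        nsfw_flags=list(data.get("has_nsfw_contents") or []),
        result_url=(data.get("urls") or {}).get("get"),
        # Rewrite a trailing "Z" as an explicit offset so that
        # datetime.fromisoformat also accepts it on Python versions before 3.11.
        created_at=datetime.fromisoformat(created.replace("Z", "+00:00"))
        if created
        else None,
        error=data.get("error") or "",
        inference_ms=(data.get("timings") or {}).get("inference"),
    )
```

A caller can check `is_finished` to decide whether to keep polling, and pair each entry in `outputs` with the flag at the same index in `nsfw_flags`.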