MiniMax Voice Design

MiniMax Voice Design is a state-of-the-art voice synthesis model developed by MiniMax. Instead of cloning a voice from reference audio, it generates a high-quality voice directly from your text description, so you can create speech with the tone, accent, and personality you want.

Features

Key Features

  • High-Fidelity Voice Generation
    Produces speech that matches your description with natural prosody and pronunciation.

  • Flexible Voice Design
    Create a wide range of voices by simply describing the desired characteristics—no reference audio required.

  • Emotion and Tone Control
    Fine-tune speaking style and emotion for storytelling, games, and character dialogue.

  • Multilingual Output
    Supports voice design across different languages and smooth code-switching.

  • Low-Latency Inference
    Optimized for real-time use cases, including live interactions and dialogue generation.

Use Cases

  • AI voiceovers for content creators and influencers
  • Personalized digital assistants and chatbots
  • Audiobook narration in a specific style
  • Interactive gaming and character voices
  • Assistive speech for individuals with voice loss

Model Overview

MiniMax Voice Design uses a neural TTS pipeline with robust speaker and prosody modeling. By leveraging your textual description, it offers clarity, control, and speed, delivering production-ready results in diverse environments.

Authentication

For authentication details, please refer to the Authentication Guide.

API Endpoints

Submit Task & Query Result


# Submit the task
curl --location --request POST "https://api.wavespeed.ai/api/v3/minimax/voice-design" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
    "prompt": "Excited and enthusiastic male product reviewer (e.g., tech vlogger), fast-paced, high energy, and persuasive.",
    "text": "Hello! Welcome to Wavespeed! This is a preview of your cloned voice. I hope you enjoy it"
}'

# Get the result (requestId is the data.id value returned by the submit call)
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
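
The same two calls can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names (`build_submit_request`, `build_result_request`, `send`) are hypothetical, and `custom_voice_id` is passed because the parameter table marks it as required. The requests are built separately from sending so they can be inspected before anything goes over the network.

```python
import json
import urllib.parse
import urllib.request

# Base URL taken from the curl example above.
API_BASE = "https://api.wavespeed.ai/api/v3"


def build_submit_request(api_key, prompt, text, custom_voice_id):
    """Build (but do not send) the POST request that submits a task."""
    payload = {
        "prompt": prompt,
        "text": text,
        "custom_voice_id": custom_voice_id,
    }
    return urllib.request.Request(
        f"{API_BASE}/minimax/voice-design",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def build_result_request(api_key, request_id):
    """Build (but do not send) the GET request that queries a task result."""
    safe_id = urllib.parse.quote(request_id, safe="")
    return urllib.request.Request(
        f"{API_BASE}/predictions/{safe_id}/result",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def send(request, timeout=30):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A typical flow would read the key from the `WAVESPEED_API_KEY` environment variable, call `send(build_submit_request(...))`, take `data.id` from the response, and pass it to `build_result_request`.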

Parameters

Task Submission Parameters

Request Parameters

| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
| prompt | string | Yes | - | - | Voice description. |
| custom_voice_id | string | Yes | - | - | Custom user-defined ID. Minimum 8 characters; must include letters and numbers and start with a letter (e.g., WaveSpeed001). Duplicate voice-ids will throw an error. |
| text | string | Yes | Hello! Welcome to Wavespeed! This is a preview of your cloned voice. I hope you enjoy it | - | Text for audio preview. Limited to 500 characters. |
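
The constraints in the table can be checked locally before a request is sent. The sketch below is a hypothetical pre-flight check, not part of the API. It assumes that "letters" and "numbers" mean ASCII letters and digits, that the 500-character limit is inclusive and counted in Unicode code points, and it does not reject other characters because the documentation does not say whether they are allowed. Duplicate IDs can only be detected by the server.

```python
MIN_VOICE_ID_CHARS = 8   # "Minimum 8 characters"
MAX_TEXT_CHARS = 500     # "Limited to 500 characters" (assumed inclusive)


def _is_ascii_letter(ch):
    return ch.isascii() and ch.isalpha()


def _is_ascii_digit(ch):
    return ch.isascii() and ch.isdigit()


def voice_id_problems(voice_id):
    """Return a list of rule violations for a custom_voice_id (empty if none)."""
    problems = []
    if len(voice_id) < MIN_VOICE_ID_CHARS:
        problems.append("shorter than 8 characters")
    if not _is_ascii_letter(voice_id[:1]):
        problems.append("does not start with a letter")
    if not any(_is_ascii_letter(ch) for ch in voice_id):
        problems.append("contains no letters")
    if not any(_is_ascii_digit(ch) for ch in voice_id):
        problems.append("contains no numbers")
    return problems


def request_problems(prompt, text, custom_voice_id):
    """Check all three documented request parameters; return found problems."""
    problems = []
    if not prompt:
        problems.append("prompt is required")
    if not text:
        problems.append("text is required")
    elif len(text) > MAX_TEXT_CHARS:
        problems.append("text is longer than 500 characters")
    problems.extend(f"custom_voice_id {p}" for p in voice_id_problems(custom_voice_id))
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue at once.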

Response Parameters

| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data.id | string | Unique identifier for the prediction (the task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
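
After submission, the two fields a client usually needs are `data.id` and `data.urls.get`. The sketch below shows one way to pull them out of a decoded response. The function name `task_handle` is hypothetical, treating any `code` other than 200 as a rejection is an assumption based on the "200 for success" example in the table, and the values in `SAMPLE_SUBMIT_RESPONSE` are made-up placeholders rather than real API output.

```python
def task_handle(submit_response):
    """Return (task_id, result_url) from a decoded submission response."""
    code = submit_response.get("code")
    if code != 200:  # assumption: 200 is the only success code
        message = submit_response.get("message")
        raise RuntimeError(f"submission rejected: code={code} message={message!r}")
    data = submit_response["data"]
    if data.get("status") == "failed":
        raise RuntimeError(f"task failed: {data.get('error')}")
    return data["id"], data["urls"]["get"]


# Illustrative placeholder shaped like the response table; not real output.
SAMPLE_SUBMIT_RESPONSE = {
    "code": 200,
    "message": "success",
    "data": {
        "id": "example-task-id",
        "model": "example-model-id",
        "outputs": [],
        "urls": {
            "get": "https://api.wavespeed.ai/api/v3/predictions/example-task-id/result"
        },
        "has_nsfw_contents": [],
        "status": "created",
        "created_at": "2023-04-01T12:34:56.789Z",
        "error": "",
        "timings": {"inference": 0},
    },
}

task_id, result_url = task_handle(SAMPLE_SUBMIT_RESPONSE)
```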

Result Query Parameters

Result Request Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| id | string | Yes | - | Task ID |

Result Response Parameters

| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data | object | The prediction data object containing all details |
| data.id | string | Unique identifier of the queried prediction |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
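
Because `data.outputs` stays empty until `data.status` is `completed`, a client typically polls the result endpoint until it sees a terminal status. The loop below is a minimal sketch of that pattern. `wait_for_outputs`, the polling interval, and the attempt limit are all assumptions, since the documentation does not recommend specific values. `fetch_result` is any zero-argument callable that returns the decoded result response, which keeps the loop independent of the HTTP library.

```python
import time


def wait_for_outputs(fetch_result, interval_s=1.0, max_attempts=60, sleep=time.sleep):
    """Poll until the task completes; return the list of output URLs.

    Raises RuntimeError if the task reports status "failed" and
    TimeoutError if it is still pending after max_attempts polls.
    """
    for _ in range(max_attempts):
        data = fetch_result()["data"]
        status = data["status"]
        if status == "completed":
            return data["outputs"]
        if status == "failed":
            raise RuntimeError(data.get("error") or "task failed")
        # "created" and "processing" are non-terminal: wait, then poll again.
        sleep(interval_s)
    raise TimeoutError(f"task still pending after {max_attempts} attempts")
```

Passing `sleep` as a parameter makes the loop easy to exercise with canned responses, without waiting in real time.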
© 2025 WaveSpeedAI. All rights reserved.