
Your request costs $0.06 per run.
For $1, you can run this model approximately 16 times.
MiniMax Speech 2.8 Turbo is a high-quality text-to-speech model that transforms written text into natural, expressive audio. With support for multiple voice presets, emotional tones, and fine-grained audio controls, it delivers broadcast-ready speech synthesis for any application.
Rich voice library: Choose from 17+ preset voices spanning different genders, ages, and speaking styles, or use your own custom-trained voice.
Expressive interjections: Add natural human sounds like (laughs), (sighs), (coughs), (gasps), and more directly in your text for lifelike delivery.
Emotion control: Set the emotional tone of the speech (happy, calm, or other moods) to match your content.
Pronunciation customization: Define custom pronunciations for brand names, acronyms, or specialized terms using the pronunciation dictionary.
Full audio control: Fine-tune speed, volume, pitch, sample rate, bitrate, channel, and output format for production-ready results.
| Parameter | Required | Description |
|---|---|---|
| text | Yes | The text to convert to speech. Supports interjections like (laughs), (sighs), (coughs) |
| voice_id | Yes | Voice preset or custom voice ID (see Available Voices below) |
| speed | No | Speech speed multiplier (default: 1) |
| volume | No | Volume level (default: 1) |
| pitch | No | Pitch adjustment (default: 0) |
| emotion | No | Emotional tone: happy, calm, etc. |
| pronunciation_dict | No | Custom pronunciation mappings (e.g., Omg/Oh my god) |
| english_normalization | No | Improves number-reading performance in English text |
| sample_rate | No | Audio sample rate |
| bitrate | No | Audio bitrate |
| channel | No | Audio channel (mono/stereo) |
| format | No | Output format |
| language_boost | No | Boost specific language recognition |
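The parameters in the table can be assembled into a request body roughly as follows. This is a minimal sketch, not the provider's client: `build_tts_input` is a hypothetical helper, and this page does not specify the endpoint, authentication, or the exact value types (for example, whether `pronunciation_dict` is a string or a list). The function therefore only builds and checks a dictionary, and leaves sending it to whichever client you use.

```python
from typing import Any

# Optional parameter names, copied from the parameter table on this page.
OPTIONAL_PARAMS = {
    "speed", "volume", "pitch", "emotion", "pronunciation_dict",
    "english_normalization", "sample_rate", "bitrate", "channel",
    "format", "language_boost",
}


def build_tts_input(text: str, voice_id: str, **options: Any) -> dict[str, Any]:
    """Build an input dictionary (hypothetical helper, no network call)."""
    # The two required parameters must be present and non-empty.
    if not text or not text.strip():
        raise ValueError("text is required and must not be empty")
    if not voice_id:
        raise ValueError("voice_id is required")

    # Reject names that are not in the table, to catch typos early.
    unknown = set(options) - OPTIONAL_PARAMS
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")

    payload: dict[str, Any] = {"text": text, "voice_id": voice_id}
    # Leave out unset options so the documented defaults apply.
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload


payload = build_tts_input(
    "Welcome back (laughs), let's get started.",
    "Calm_Woman",
    emotion="happy",
    speed=1.0,
)
```

Omitting unset options, rather than sending explicit defaults, keeps the request small and avoids hard-coding default values that may change.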
Available Voices: Wise_Woman, Friendly_Person, Inspirational_girl, Deep_Voice_Man, Calm_Woman, Casual_Guy, Lively_Girl, Patient_Man, Young_Knight, Determined_Man, Lovely_Girl, Decent_Boy, Imposing_Manner, Elegant_Man, Abbess, Sweet_Girl_2, Exuberant_Girl
You can also use a custom voice ID trained via MiniMax Voice Clone.
Supported interjections: (laughs), (chuckle), (coughs), (clear-throat), (groans), (breath), (pant), (inhale), (exhale), (gasps), (sniffs), (sighs), (snorts), (burps), (lip-smacking), (humming), (hissing), (emm), (whistles), (sneezes), (crying), (applause)
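Because interjections are written inline as parenthesized tags, a misspelled tag is easy to miss. The sketch below checks a script against the list above before it is submitted. `find_unsupported_interjections` is a hypothetical helper, and the matching rule (a single lowercase word, optionally hyphenated, in parentheses) is an assumption. This page does not say how the model treats unrecognized tags, and an ordinary parenthetical such as "(sic)" would also be flagged by this heuristic.

```python
import re

# Interjection tags as listed on this page.
SUPPORTED_INTERJECTIONS = {
    "laughs", "chuckle", "coughs", "clear-throat", "groans", "breath",
    "pant", "inhale", "exhale", "gasps", "sniffs", "sighs", "snorts",
    "burps", "lip-smacking", "humming", "hissing", "emm", "whistles",
    "sneezes", "crying", "applause",
}

# Heuristic: one lowercase word (hyphens allowed) wrapped in parentheses.
_TAG_PATTERN = re.compile(r"\(([a-z-]+)\)")


def find_unsupported_interjections(text: str) -> list[str]:
    """Return parenthesized tags in `text` that are not in the supported list."""
    return [
        tag for tag in _TAG_PATTERN.findall(text)
        if tag not in SUPPORTED_INTERJECTIONS
    ]


script = "Well (sighs) that went better than expected (laughs) (giggles)."
unsupported = find_unsupported_interjections(script)
```

Here `unsupported` contains only `"giggles"`, since `(sighs)` and `(laughs)` are both in the list.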
| Metric | Cost |
|---|---|
| Per 1,000 characters | $0.06 |
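The per-character rate makes it possible to estimate cost before submitting a script. The sketch below assumes billing is strictly proportional to character count and that characters are counted the way Python's `len` counts them. Neither detail is stated on this page, and actual billing may round up or apply a minimum charge per run, so treat the result as an estimate. `estimate_cost` and `PRICE_PER_1000_CHARS` are illustrative names.

```python
from decimal import Decimal

# Rate from the pricing table on this page: $0.06 per 1,000 characters.
PRICE_PER_1000_CHARS = Decimal("0.06")


def estimate_cost(text: str) -> Decimal:
    """Estimate the cost in USD, assuming linear per-character billing."""
    return Decimal(len(text)) / 1000 * PRICE_PER_1000_CHARS


# A 2,500-character script under the linear-billing assumption.
cost = estimate_cost("a" * 2500)

# Whole runs of up to 1,000 characters that fit into one dollar.
runs_per_dollar = int(Decimal("1") // PRICE_PER_1000_CHARS)
```

Under these assumptions, a 2,500-character script comes to $0.15, and `runs_per_dollar` is 16, which matches the approximate figure quoted at the top of this page.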