text-to-audio

minimax/speech-2.5-hd-preview

MiniMax's high-definition text-to-speech model. Compared to Speech 02, released in May, Speech 2.5 has three major breakthroughs: stronger multilingual expressiveness, more accurate voice replication, and broader coverage with 40 languages. Your request will cost $0.06 per 1000 characters.
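
As a rough illustration of the per-character pricing, the sketch below estimates the cost of a piece of text. It assumes the charge is prorated linearly by character count with no minimum or rounding, which the page does not confirm; `estimate_cost` is a hypothetical helper, not part of any SDK.

```python
from decimal import Decimal

# Listed price: $0.06 per 1000 characters (from the model description).
PRICE_PER_1000_CHARS = Decimal("0.06")


def estimate_cost(text: str) -> Decimal:
    """Estimate the USD cost of synthesizing `text`.

    Assumption: billing is linear in character count, with no minimum
    charge and no rounding. Actual billing may differ.
    """
    return PRICE_PER_1000_CHARS * len(text) / 1000


if __name__ == "__main__":
    sample = "Hello, world! " * 50  # 700 characters
    print(f"{len(sample)} characters -> ${estimate_cost(sample)}")
```

Decimal arithmetic is used here to avoid floating-point drift when summing many small charges.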

Hint: Desired voice ID. Use a voice ID you have trained (https://wavespeed.ai/models/minimax/voice-clone), or one of the following system voice IDs: Wise_Woman, Friendly_Person, Inspirational_girl, Deep_Voice_Man, Calm_Woman, Casual_Guy, Lively_Girl, Patient_Man, Young_Knight, Determined_Man, Lovely_Girl, Decent_Boy, Imposing_Manner, Elegant_Man, Abbess, Sweet_Girl_2, Exuberant_Girl
This parameter supports English text normalization, which improves performance in number-reading scenarios.
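
The voice ID rule above can be checked on the client side before a request is sent. The following is a minimal sketch, not the service's actual API: the payload field names `text` and `voice_id` and the helper `build_request` are assumptions for illustration, and only the system voice IDs are taken from the hint.

```python
# System voice IDs as listed in the hint above.
SYSTEM_VOICE_IDS = frozenset({
    "Wise_Woman", "Friendly_Person", "Inspirational_girl", "Deep_Voice_Man",
    "Calm_Woman", "Casual_Guy", "Lively_Girl", "Patient_Man", "Young_Knight",
    "Determined_Man", "Lovely_Girl", "Decent_Boy", "Imposing_Manner",
    "Elegant_Man", "Abbess", "Sweet_Girl_2", "Exuberant_Girl",
})


def build_request(text, voice_id, custom_voice_ids=frozenset()):
    """Build a request payload after validating the voice ID.

    `custom_voice_ids` holds IDs of voices you have trained yourself.
    The field names "text" and "voice_id" are hypothetical.
    """
    if not text:
        raise ValueError("text must not be empty")
    if voice_id not in SYSTEM_VOICE_IDS and voice_id not in custom_voice_ids:
        raise ValueError(f"unknown voice ID: {voice_id!r}")
    return {"text": text, "voice_id": voice_id}


if __name__ == "__main__":
    print(build_request("Your order number is 1024.", "Calm_Woman"))
```

Rejecting an unknown ID locally avoids spending a request on input the service would likely refuse.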

Your request will cost $0.006 per run.

For $1 you can run this model approximately 166 times.
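
The figure of approximately 166 runs follows from dividing the budget by the per-run price and rounding down. A small sketch of that arithmetic, where `runs_for_budget` is a hypothetical helper and the per-run price is the $0.006 quoted above:

```python
from decimal import Decimal

PRICE_PER_RUN = Decimal("0.006")  # USD per run, as quoted above


def runs_for_budget(budget_usd: str) -> int:
    """Return how many whole runs fit into a budget at the quoted price."""
    return int(Decimal(budget_usd) // PRICE_PER_RUN)


if __name__ == "__main__":
    print(runs_for_budget("1"))  # whole runs that fit in $1
```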

README

MiniMax Speech 2.5 HD Preview

MiniMax's high-definition text-to-speech model with natural pronunciation and clear articulation. Features multiple voice options, adjustable speed, volume, and pitch controls for professional-grade audio generation.

Features

Three major breakthroughs: stronger multilingual performance, more lifelike tone, and 40 languages covered.

Leapfrogging multilingual performance

  • Chinese performance is claimed to be the strongest in the world. English and other languages have been comprehensively improved in accuracy, similarity, and natural rhythm, surpassing the previous-generation Speech 02.
  • English voice similarity has been significantly improved, and the model can switch freely among 40 languages, so speech no longer sounds "mechanical" in business meetings, daily conversations, and English podcasts.

Lifelike tone replication

  • Replicates tone across languages, accents, styles, and emotions, with industry-leading precision in detail.
  • The model can still achieve lifelike vocal realism in extreme scenarios, such as cross-language accent preservation, regional accent preservation, and special-age voice replication.

40 languages are now supported

  • A diverse, high-quality audio library enables barrier-free global use.
  • New additions include Bulgarian, Danish, Hebrew, Malay, Persian, Slovak, Swedish, Croatian, Filipino, Hungarian, Norwegian, Slovenian, Catalan, Nynorsk, Tamil, Afrikaans, and more.
  • Cross-border e-commerce, overseas customer service, and localized marketing are now easier, with globalized content creation at your fingertips.