WaveSpeed.ai
text-to-audio

Qwen3 TTS

wavespeed-ai/qwen3-tts/text-to-speech

Qwen3 TTS: Multi-language, multi-voice text-to-speech synthesis with style control. Supports 11 languages and 9 voice characters. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

Your request will cost $0.02 per run.

With $1 you can run this model approximately 50 times.

README

Qwen3-TTS Text-to-Speech

Qwen3-TTS Text-to-Speech is a high-quality text-to-speech model with a curated selection of preset voices. Choose from 9 distinct voices spanning different genders and speaking styles, with optional style instructions to fine-tune the delivery.

Why Choose This?

  • Curated voice library: 9 preset voices with distinct personalities, from professional narrators to friendly conversational tones.

  • Style instruction support: guide the speaking style with natural-language instructions for customized delivery.

  • Auto language detection: set language to "auto" and the model detects the language from your text.

  • Simple and fast: a straightforward interface. Select a voice, enter text, and generate.

Parameters

Parameter | Required | Description
text | Yes | The text to convert to speech
language | Yes | Language code or "auto" for automatic detection
voice | Yes | Preset voice to use (see Available Voices below)
style_instruction | No | Natural language guidance for speaking style
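
As a rough illustration of how these parameters fit together, the sketch below assembles them into a dictionary that could serve as a JSON request body. The field names come from the table above; the helper name `build_payload`, the choice of "auto" as a default for language, and the idea that the fields are sent as a flat JSON object are assumptions, not documented behavior.

```python
from typing import Optional


def build_payload(text: str, voice: str, language: str = "auto",
                  style_instruction: Optional[str] = None) -> dict:
    """Assemble the request fields listed in the Parameters table.

    Hypothetical helper: the flat JSON shape is an assumption.
    """
    # text, language and voice are marked as required in the table.
    for name, value in (("text", text), ("language", language), ("voice", voice)):
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"'{name}' is required and must be a non-empty string")

    payload = {"text": text, "language": language, "voice": voice}
    # style_instruction is optional, so it is only included when provided.
    if style_instruction:
        payload["style_instruction"] = style_instruction
    return payload


example = build_payload(
    "Welcome to the quarterly update.",
    voice="Serena",
    style_instruction="Professional and clear, suitable for corporate presentations",
)
```

Leaving `style_instruction` out of the payload when it is empty keeps the request limited to the three required fields.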

Available Voices

Voice | Description
Vivian | Female voice
Serena | Female voice
Ono_Anna | Female voice
Sohee | Female voice
Uncle_Fu | Male voice
Dylan | Male voice
Eric | Male voice
Ryan | Male voice
Aiden | Male voice
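
A client may want to check a voice name before submitting a request. The following is a minimal sketch that mirrors the table above; the names `VOICES`, `check_voice`, and `voices_by_gender` are hypothetical, and exact-match (case-sensitive) comparison is an assumption about how the service treats voice names.

```python
from typing import Dict, List

# Voice names and descriptions copied from the Available Voices table.
VOICES: Dict[str, str] = {
    "Vivian": "female",
    "Serena": "female",
    "Ono_Anna": "female",
    "Sohee": "female",
    "Uncle_Fu": "male",
    "Dylan": "male",
    "Eric": "male",
    "Ryan": "male",
    "Aiden": "male",
}


def check_voice(name: str) -> str:
    """Return the name unchanged if it is a listed preset, else raise."""
    # Assumption: names must match the table exactly, including case.
    if name not in VOICES:
        raise ValueError(f"Unknown voice '{name}'; expected one of {sorted(VOICES)}")
    return name


def voices_by_gender(gender: str) -> List[str]:
    """List preset voices whose description matches 'female' or 'male'."""
    return [name for name, g in VOICES.items() if g == gender.lower()]
```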

Style Instruction Examples

  • "Speak slowly and calmly, like a meditation guide"
  • "Energetic and enthusiastic, like a sports announcer"
  • "Professional and clear, suitable for corporate presentations"
  • "Warm and friendly, like talking to a close friend"

How to Use

  1. Enter your text — write or paste the content you want to convert to speech.
  2. Select language — choose the target language or use "auto" for automatic detection.
  3. Choose a voice — select from the 9 available preset voices.
  4. Add style instruction (optional) — describe how you want the voice to sound.
  5. Run — submit and download your audio file.
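
For programmatic use, the same steps might be sketched as an HTTP POST. This page does not state the endpoint URL, the authentication scheme, or the response format, so the `ENDPOINT` placeholder, the Bearer-token header, and the helper name `build_request` below are all assumptions to be checked against the official API documentation. The sketch only constructs the request and does not send it.

```python
import json
import urllib.request

# Hypothetical placeholder: replace with the endpoint from the official API docs.
ENDPOINT = "https://api.example.com/wavespeed-ai/qwen3-tts/text-to-speech"


def build_request(payload: dict, api_key: str,
                  endpoint: str = ENDPOINT) -> urllib.request.Request:
    """Wrap the parameters in a JSON POST request (not sent here)."""
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme; the page does not document it.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_request(
    {
        "text": "Thanks for watching, and see you next week.",  # step 1
        "language": "auto",                                      # step 2
        "voice": "Ryan",                                         # step 3
        "style_instruction": "Warm and friendly, like talking to a close friend",  # step 4
    },
    api_key="YOUR_API_KEY",
)
# Step 5 would pass `request` to urllib.request.urlopen(); how to retrieve the
# audio depends on the response format, which is not described on this page.
```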

Pricing

Text Length | Cost
Under 1,000 chars | $0.02
1,000+ chars | $0.02 per 1,000 characters

Billing Rules

  • Minimum charge: $0.02 (for texts under 1,000 characters)
  • For longer texts: $0.02 × (character count / 1,000)
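
The billing rules above can be turned into a small estimator. This is a sketch under stated assumptions: the charge for longer texts is taken to scale proportionally with no rounding to whole thousands, and how the service counts characters (for example, whether whitespace counts) is not specified here. The name `estimate_cost` is hypothetical.

```python
MIN_CHARGE_USD = 0.02            # minimum charge from the billing rules
RATE_PER_1000_CHARS_USD = 0.02   # rate for texts of 1,000+ characters


def estimate_cost(char_count: int) -> float:
    """Estimate the charge in USD for a text of the given length."""
    if char_count < 0:
        raise ValueError("char_count must be non-negative")
    # Texts under 1,000 characters pay the flat minimum.
    if char_count < 1000:
        return MIN_CHARGE_USD
    # Assumption: proportional billing, no rounding up to the next 1,000.
    return RATE_PER_1000_CHARS_USD * char_count / 1000


for n in (200, 1000, 2500):
    print(f"{n} chars: about ${estimate_cost(n):.3f}")
```

Under these assumptions a 2,500-character text works out to $0.02 × 2.5 = $0.05, while any text shorter than 1,000 characters costs the flat $0.02.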

Best Use Cases

  • Video Voiceovers — Generate professional narration for YouTube, ads, or explainer videos.
  • Audiobook Production — Convert manuscripts into natural-sounding narration.
  • Podcasts & Broadcasting — Create consistent voice content without recording equipment.
  • E-learning & Training — Produce clear, engaging audio for educational materials.
  • Accessibility — Convert written content to audio for visually impaired users.

Pro Tips

  • Try different voices to find the best match for your content type.
  • Use style_instruction to adjust tone without changing the voice itself.
  • Consider the female voices (Vivian, Serena, Ono_Anna, Sohee) for softer content and the male voices (Uncle_Fu, Dylan, Eric, Ryan, Aiden) for authoritative content.
  • Test with short text first to preview how the voice sounds before generating longer content.

Notes

  • All 9 voices are optimized for natural, clear speech output.
  • Style instructions work best when they describe emotion, pace, or tone rather than technical audio settings.
  • For best quality, match the language parameter to your text content.