Qwen3 TTS Voice Design
Qwen3 TTS Voice Design generates speech with custom voice characteristics described in natural language. It is available on WavespeedAI as a ready-to-use REST inference API with no cold starts, priced at $0.02 per 1,000 characters.
Features
Qwen3-TTS Voice Design
Qwen3-TTS Voice Design is a next-generation text-to-speech model that lets you design custom voices using natural language descriptions. Instead of selecting from preset voices, simply describe the voice you want — age, gender, tone, speaking style — and the model generates speech that matches your description.
Why Choose This?
- Natural language voice control: Describe your ideal voice in plain text (e.g., “a warm, friendly female voice with a slight British accent”) and the model creates it.
- Unlimited voice variety: No preset limits. Create any voice character you can describe, from professional narrators to unique personas.
- Multilingual support: Generate speech in 10 languages: Chinese, English, German, Italian, Portuguese, Spanish, Japanese, Korean, French, and Russian.
- Auto language detection: Set language to “auto” and the model detects the language from your text.
Parameters
| Parameter | Required | Description |
|---|---|---|
| text | Yes | The text to convert to speech |
| voice_description | Yes | Natural language description of the desired voice |
| language | No | auto, Chinese, English, German, Italian, Portuguese, Spanish, Japanese, Korean, French, Russian (default: auto) |
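As a rough illustration of the table above, the request body could be assembled and checked on the client before it is sent. The helper name `build_payload` is hypothetical, and the exact-match check on `language` assumes the values are case-sensitive, which the table does not state. Only the three field names and the list of language values come from the table.

```python
# Allowed values for "language", copied from the parameter table.
SUPPORTED_LANGUAGES = {
    "auto", "Chinese", "English", "German", "Italian", "Portuguese",
    "Spanish", "Japanese", "Korean", "French", "Russian",
}


def build_payload(text: str, voice_description: str, language: str = "auto") -> dict:
    """Return a JSON-serializable request body (hypothetical helper)."""
    if not text.strip():
        raise ValueError("text is required")
    if not voice_description.strip():
        raise ValueError("voice_description is required")
    # Assumption: the API expects one of the listed values verbatim.
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    return {
        "text": text,
        "voice_description": voice_description,
        "language": language,
    }
```

Checking locally like this catches a missing required field before a request is spent on it. The server remains the authority on what it accepts.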
Voice Description Examples
- “A young female voice, energetic and cheerful, speaking quickly with enthusiasm”
- “An elderly male narrator with a deep, calm, authoritative tone”
- “A professional newsreader voice, neutral and clear, with perfect pronunciation”
- “A warm, friendly customer service representative, patient and helpful”
- “A dramatic storyteller voice with expressive intonation and theatrical pauses”
How to Use
- Enter your text — write or paste the content you want to convert to speech.
- Describe your voice — use natural language to describe the voice characteristics you want (age, gender, tone, style, accent, etc.).
- Select language — choose the target language or use “auto” for automatic detection.
- Run — submit and download your audio file.
Pricing
| Text Length | Cost |
|---|---|
| Under 1,000 chars | $0.02 |
| 1,000+ chars | $0.02 per 1,000 characters |
Billing Rules
- Minimum charge: $0.02 (for texts under 1,000 characters)
- For longer texts: $0.02 × (character count / 1,000)
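The billing rules can be sketched as a small estimator. This is an approximation under stated assumptions: the function name `estimate_cost` is hypothetical, characters are counted with Python's `len` (Unicode code points), and the longer-text charge is taken as strictly pro-rata with no rounding. The page does not say how the service counts characters or rounds the total.

```python
from decimal import Decimal

RATE_PER_1000_CHARS = Decimal("0.02")  # from the pricing table
MINIMUM_CHARGE = Decimal("0.02")       # from the billing rules


def estimate_cost(text: str) -> Decimal:
    """Estimate the charge in USD for one request (hypothetical helper)."""
    # Assumption: one character == one Unicode code point.
    chars = len(text)
    prorated = RATE_PER_1000_CHARS * Decimal(chars) / Decimal(1000)
    # The minimum charge applies to anything under 1,000 characters.
    return max(MINIMUM_CHARGE, prorated)
```

Under these assumptions, a 500-character text costs the $0.02 minimum and a 2,500-character text costs $0.02 × 2.5 = $0.05.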
Best Use Cases
- Character Voices — Create unique voices for games, animations, or audiobooks without voice actors.
- Prototyping — Quickly test different voice styles before committing to production.
- Localization — Generate consistent voice styles across multiple languages.
- Accessibility — Convert text to speech with customized, natural-sounding voices.
- Content Creation — Produce voiceovers for videos, podcasts, and presentations.
Pro Tips
- Be specific in your voice description — include age, gender, emotional tone, speaking pace, and any accent preferences.
- Use descriptive adjectives: “warm”, “crisp”, “authoritative”, “playful”, “soothing”, etc.
- Mention the context if relevant (e.g., “suitable for a children’s audiobook” or “professional corporate presentation”).
- Test with short text first to fine-tune your voice description before generating longer content.
- Combine multiple attributes for more nuanced voices (e.g., “middle-aged, confident but approachable”).
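One way to apply these tips consistently is to build the description from named attributes, so that none of them is forgotten. The helper below is a hypothetical convenience, not part of the API. The service only receives the final string, and the phrasing it produces is just one reasonable choice.

```python
def describe_voice(age: str = "", gender: str = "", tone: str = "",
                   pace: str = "", accent: str = "", context: str = "") -> str:
    """Join the supplied attributes into one voice description string."""
    # Lead with age and gender, e.g. "middle-aged male voice".
    subject = " ".join(part for part in (age, gender, "voice") if part)
    clauses = [subject]
    if tone:
        clauses.append(tone)
    if pace:
        clauses.append(f"speaking {pace}")
    if accent:
        clauses.append(f"with a {accent} accent")
    if context:
        clauses.append(f"suitable for {context}")
    return ", ".join(clauses)
```

Attributes left empty are skipped, so the same helper works for short and detailed descriptions alike.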
Notes
- Voice descriptions work best when they are clear and detailed.
- The same voice description will produce consistent results across multiple generations.
- For best quality, match the language parameter to your text content.
Authentication
For authentication details, please refer to the Authentication Guide.
API Endpoints
Submit Task & Query Result
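# Submit the task (text and voice_description are required)
curl --location --request POST "https://api.wavespeed.ai/api/v3/wavespeed-ai/qwen3-tts/voice-design" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
"text": "Hello, and welcome to the demo.",
"voice_description": "A warm, friendly female voice with a slight British accent",
"language": "auto"
}'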
# Get the result (requestId is the data.id returned by the submit call)
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
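The two curl calls above can be mirrored in Python with only the standard library. This is a minimal sketch: the function names are hypothetical, the 30-second timeout is an arbitrary choice, and error handling is left out (`urlopen` raises `urllib.error.HTTPError` on non-2xx responses). The URLs and headers are the ones shown in the curl commands.

```python
import json
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"
SUBMIT_URL = f"{API_BASE}/wavespeed-ai/qwen3-tts/voice-design"


def submit_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build the POST request that submits a task."""
    return urllib.request.Request(
        SUBMIT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def result_request(api_key: str, request_id: str) -> urllib.request.Request:
    """Build the GET request that fetches a task's result."""
    return urllib.request.Request(
        f"{API_BASE}/predictions/{request_id}/result",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def send(request: urllib.request.Request) -> dict:
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

A typical flow would call `send(submit_request(key, payload))`, read `data.id` from the response, and then call `send(result_request(key, task_id))` until the task finishes. Reading the key from the `WAVESPEED_API_KEY` environment variable keeps it out of source code.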
Parameters
Task Submission Parameters
Request Parameters
| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
| text | string | Yes | - | - | The text content to convert into speech |
| voice_description | string | Yes | - | - | Natural language description of the desired voice characteristics (e.g., 'a warm female voice with a gentle tone') |
| language | string | No | auto | auto, Chinese, English, German, Italian, Portuguese, Spanish, Japanese, Korean, French, Russian | Language of the speech output (use 'auto' for automatic detection) |
Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data.id | string | Unique identifier for the prediction (the task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
Result Request Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| id | string | Yes | - | Task ID |
Result Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data | object | The prediction data object containing all details |
| data.id | string | Unique identifier for the prediction (matches the task ID in the request) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
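Because `data.status` moves through created, processing, and then completed or failed, a client usually polls the result endpoint until it reaches a final state. The sketch below shows one way to do that. The name `wait_for_outputs`, the one-second interval, and the 60-poll limit are assumptions, not documented recommendations. The `fetch` argument is any callable that returns the decoded result response, which keeps the loop independent of the HTTP client.

```python
import time
from typing import Callable


def wait_for_outputs(fetch: Callable[[], dict],
                     poll_seconds: float = 1.0,
                     max_polls: int = 60) -> list:
    """Poll until the task completes, then return the output URLs."""
    for _ in range(max_polls):
        data = fetch().get("data") or {}
        status = data.get("status")
        if status == "completed":
            return list(data.get("outputs") or [])
        if status == "failed":
            # data.error carries the message; fall back to a generic one.
            raise RuntimeError(data.get("error") or "task failed")
        # "created" or "processing": outputs are still empty, so wait.
        time.sleep(poll_seconds)
    raise TimeoutError(f"task not finished after {max_polls} polls")
```

Passing the fetch step in as a callable also makes the loop easy to exercise with canned responses before pointing it at the live endpoint.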