
Inworld 1.5 Max Text to Speech


Inworld 1.5 Max delivers premium text-to-speech synthesis with 65+ multilingual voices, adjustable speaking rate, and high-fidelity, natural-sounding audio output. It is available through a ready-to-use REST inference API with no cold starts and affordable pricing.


Inworld 1.5 Max Text-to-Speech

Inworld 1.5 Max is a high-quality text-to-speech model that converts written text into natural, expressive speech. Choose from a variety of voice presets, fine-tune speaking rate and expressiveness with simple controls, and generate professional-grade audio in seconds — ideal for IVR systems, voiceovers, content creation, and accessibility.

Why Choose This?

  • Natural-sounding voices: Multiple voice presets with realistic intonation, pacing, and emotion for lifelike speech output.

  • Multilingual voice library: 65+ voices across the 15 languages listed under Available Voices, including English, Chinese, Japanese, Korean, and more.

  • Speaking rate control: Adjust the speed of speech to suit narration, dialogue, announcements, or any delivery style.

  • Temperature control: Fine-tune expressiveness. Lower values give consistent, predictable delivery; higher values give more dynamic, varied speech.

  • Affordable at scale: $0.01 per 1,000 characters, professional quality at accessible pricing.

Parameters

| Parameter | Required | Description |
| --- | --- | --- |
| text | Yes | The text content to convert to speech |
| voice_id | No | Voice preset to use (see Available Voices below) |
| speaking_rate | No | Speed of speech (default: 1) |
| temperature | No | Expressiveness level (default: 1) |
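
As a minimal sketch of how the four parameters above fit into a request body, the helper below assembles a payload dictionary. The function name `build_payload` and the non-empty check on `text` are illustrative assumptions, not part of the WaveSpeedAI API; only the parameter names and defaults come from the table.

```python
def build_payload(text, voice_id=None, speaking_rate=1, temperature=1):
    """Assemble a request body from the four documented parameters.

    build_payload is a hypothetical helper; only the field names and
    defaults are taken from the parameter table.
    """
    # text is the only required field; rejecting blank text is an assumption.
    if not isinstance(text, str) or not text.strip():
        raise ValueError("text is required and must be a non-empty string")

    payload = {
        "text": text,
        "speaking_rate": speaking_rate,
        "temperature": temperature,
    }
    # voice_id is optional, so leave it out unless the caller picked a preset.
    if voice_id is not None:
        payload["voice_id"] = voice_id
    return payload


print(build_payload("Welcome to the support line.", voice_id="Ashley"))
```

Leaving `voice_id` out of the payload when it is not set lets the service apply whatever default voice it uses, rather than guessing one on the client side.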

Available Voices

English

| Voice ID | Description |
| --- | --- |
| Alex | Energetic and expressive mid-range male voice, with a mildly nasal quality |
| Ashley | A warm, natural female voice |
| Craig | Older British male with a refined and articulate voice |
| Deborah | Gentle and elegant female voice |
| Dennis | Middle-aged man with a smooth, calm and friendly voice |
| Edward | Male with a fast-talking, emphatic and streetwise tone |
| Elizabeth | Professional middle-aged woman, perfect for narrations and voiceovers |
| Hades | Commanding and gruff male voice, think an omniscient narrator or castle guard |
| Julia | Quirky, high-pitched female voice that delivers lines with playful energy |
| Pixie | High-pitched, childlike female voice with a squeaky quality |
| Mark | Energetic, expressive man with a rapid-fire delivery |
| Olivia | Young, British female with an upbeat, friendly tone |
| Priya | Even-toned female voice with an Indian accent |
| Ronald | Confident, British man with a deep, gravelly voice |
| Sarah | Fast-talking young adult woman, with a questioning and curious tone |
| Shaun | Friendly, dynamic male voice great for conversations |
| Theodore | Gravelly male voice, with a time-worn quality |
| Timothy | Lively, upbeat American male voice |
| Wendy | Posh, middle-aged British female voice |
| Dominus | Robotic, deep male voice with a menacing quality. Perfect for villains |
| Hana | Bright, expressive young female voice, perfect for storytelling and gaming |
| Clive | British-accented male voice with a calm, cordial quality |
| Carter | Energetic, mature radio announcer-style male voice |
| Blake | Rich, intimate male voice, perfect for audiobooks and romantic content |
| Luna | Calm, relaxing female voice, perfect for meditations and sleep stories |

Chinese

| Voice ID | Description |
| --- | --- |
| Yichen | A calm, flat young adult male Chinese voice |
| Xiaoyin | A youthful Chinese female voice with a gentle, sweet voice |
| Xinyi | A Chinese woman with a neutral tone, perfect for narrations |
| Jing | An energetic, fast-paced young Chinese female |

Japanese

| Voice ID | Description |
| --- | --- |
| Asuka | Friendly, young adult Japanese female voice |
| Satoshi | Dramatic, expressive male Japanese voice filled with energy |

Korean

| Voice ID | Description |
| --- | --- |
| Hyunwoo | Young adult Korean male voice |
| Minji | Energetic, friendly young Korean female voice |
| Seojun | Clear, deep mature Korean male voice |
| Yoona | Korean woman with a gentle, soothing voice |

French

| Voice ID | Description |
| --- | --- |
| Alain | Deep, smooth middle-aged male French voice. Composed and calm |
| Hélène | Middle-aged French woman, with a smooth, musical, and graceful voice |
| Mathieu | A French male voice carrying a nasal quality |
| Étienne | Calm young adult French male |

German

| Voice ID | Description |
| --- | --- |
| Johanna | A calm older German female with a low, smoky voice |
| Josef | An articulate German male voice with an announcer-like quality |

Spanish

| Voice ID | Description |
| --- | --- |
| Diego | Spanish-speaking male voice with a soothing, gentle quality |
| Lupita | Vibrant, energetic young Spanish-speaking female voice |
| Miguel | A calm adult Spanish-speaking male voice, perfect for storytelling |
| Rafael | Middle-aged Spanish-speaking male with a deep, composed voice |

Portuguese

| Voice ID | Description |
| --- | --- |
| Heitor | Composed Portuguese-speaking male voice with a neutral tone |
| Maitê | Middle-aged Portuguese-speaking female voice |

Italian

| Voice ID | Description |
| --- | --- |
| Gianni | Deep, smooth Italian male voice that speaks rapidly |
| Orietta | Calm adult female Italian voice, with a soothing cadence |

Dutch

| Voice ID | Description |
| --- | --- |
| Erik | Older Dutch male voice with a weathered edge |
| Katrien | Dutch woman with an expressive voice |
| Lennart | A confident Dutch male voice. Calm and relaxed |
| Lore | Clear, calm Dutch female voice, great for narrations |

Polish

| Voice ID | Description |
| --- | --- |
| Szymon | Polish adult male voice with a warm, friendly quality |
| Wojciech | A middle-aged Polish male voice |

Russian

| Voice ID | Description |
| --- | --- |
| Svetlana | Soft, high-pitched female voice, with a slightly breathy quality |
| Elena | Clear, mid-range female voice, with a neutral, informational tone |
| Dmitry | Deep, gravelly male voice, with a commanding and narrative tone |
| Nikolai | Deep, resonant male voice, with a clear, theatrical quality |

Hindi

| Voice ID | Description |
| --- | --- |
| Riya | Professional and clean female voice, polished and approachable |
| Manoj | Clear, professional Hindi male voice. Great for narrations |

Hebrew

| Voice ID | Description |
| --- | --- |
| Yael | Mid-range female Hebrew voice, suitable for narrations |
| Oren | Steady male Hebrew voice, great for podcasts and voiceovers |

Arabic

| Voice ID | Description |
| --- | --- |
| Nour | Polished female Arabic voice with a friendly tone |
| Omar | Bright, confident Arabic male voice, great for announcements |
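
Because each voice is tied to a language, a client may want a small lookup that keeps `voice_id` and text language aligned. The sketch below is one possible shape: the dictionary holds only an excerpt of the voices listed above, and `VOICES_BY_LANGUAGE` and `pick_voice` are hypothetical names, not part of any SDK.

```python
# Excerpt of the voice list above, grouped by language.
# The structure and names here are illustrative assumptions.
VOICES_BY_LANGUAGE = {
    "English": ["Alex", "Ashley", "Craig"],
    "Japanese": ["Asuka", "Satoshi"],
    "German": ["Johanna", "Josef"],
    "Arabic": ["Nour", "Omar"],
}


def pick_voice(language, preferred=None):
    """Return a voice_id for the language, honoring a preference if it matches."""
    voices = VOICES_BY_LANGUAGE.get(language)
    if not voices:
        raise KeyError(f"no voices recorded for {language!r}")
    # Use the preferred voice only when it belongs to the requested language.
    if preferred in voices:
        return preferred
    return voices[0]


print(pick_voice("Japanese"))
print(pick_voice("English", preferred="Craig"))
```

Falling back to the first listed voice is an arbitrary choice made for the sketch; a real integration would likely pick by tone or audience.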

How to Use

  1. Enter your text — type or paste the content you want converted to speech.
  2. Select a voice — choose a voice preset from the voice_id dropdown.
  3. Adjust speaking rate — slide to control how fast or slow the speech is delivered.
  4. Adjust temperature — slide to control the expressiveness and variation in delivery.
  5. Run — submit and download the generated audio.

Pricing

| Characters | Cost |
| --- | --- |
| Up to 1,000 | $0.01 |
| Up to 2,000 | $0.02 |
| Up to 5,000 | $0.05 |
| Up to 10,000 | $0.10 |

Billing Rules

  • Rate: $0.01 per 1,000 characters
  • Rounding: character count is rounded up to the next 1,000
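
The two billing rules can be sketched as a small estimator. This assumes characters are counted the way Python's `len` counts them (Unicode code points); the service may count differently, so treat the result as an estimate. The name `estimate_cost` is illustrative.

```python
import math

RATE_USD_PER_BLOCK = 0.01  # $0.01 per 1,000 characters, from the billing rules
BLOCK_SIZE = 1000          # characters per billing block


def estimate_cost(text):
    """Estimate the charge in USD, rounding the character count up to the next 1,000."""
    # Assumption: len() approximates how the service counts characters.
    blocks = math.ceil(len(text) / BLOCK_SIZE)
    return round(blocks * RATE_USD_PER_BLOCK, 2)


print(estimate_cost("a" * 1000))  # 0.01
print(estimate_cost("a" * 1001))  # 0.02
print(estimate_cost("a" * 4500))  # 0.05
```

A single character past a 1,000-character boundary is billed as a full extra block, which is why 1,001 characters cost the same as 2,000.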

Best Use Cases

  • IVR & Phone Systems — Generate professional call menus, hold messages, and automated responses.
  • Video Voiceovers — Add narration to marketing videos, tutorials, and presentations.
  • Content Creation — Convert blog posts, articles, or scripts into audio for podcasts and social media.
  • Accessibility — Provide audio versions of written content for visually impaired users.
  • Game & App Dialogue — Create character voices for interactive experiences and virtual assistants.
  • Multilingual Content: Create audio content in any of the 15 listed languages from a single API.

Pro Tips

  • Keep speaking_rate around 1 for natural-sounding narration; lower for dramatic reads, higher for fast announcements.
  • Use lower temperature for consistent, predictable voiceovers (e.g., IVR); higher temperature for more expressive character dialogue.
  • Break long texts into logical paragraphs for better pacing and natural pauses.
  • Test different voice_id options to find the best match for your brand or character.
  • Match voice language to your text language for best pronunciation and intonation.
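
One way to apply the tip about breaking long texts into paragraphs is to split on blank lines and pack whole paragraphs into chunks that are then synthesized separately. The sketch below makes that assumption; `group_paragraphs` and the 1,000-character default are illustrative choices. Since each request is rounded up to the next 1,000 characters, sending many small chunks can cost more than one long request.

```python
import re


def group_paragraphs(text, max_chars=1000):
    """Split on blank lines, then pack whole paragraphs into chunks of at most max_chars.

    A single paragraph longer than max_chars is kept whole as its own chunk
    rather than being cut mid-sentence.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            # Adding this paragraph would overflow, so close the current chunk.
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


sample = "First paragraph.\n\nSecond paragraph.\n\n\nThird paragraph."
for chunk in group_paragraphs(sample, max_chars=40):
    print(repr(chunk))
```

Keeping paragraphs intact preserves the natural pauses the tip is after; splitting inside a sentence would likely hurt pacing more than it helps.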

Notes

  • Text is the only required field.
  • Billing is based on character count, rounded up to the next 1,000 characters.
  • Very long texts may take slightly longer to process.


Inworld 1.5 Max Text To Speech API — Quick start

Grab a WaveSpeedAI API key, then call POST https://api.wavespeed.ai/api/v3/inworld/inworld-1.5-max/text-to-speech with your input as JSON. The endpoint returns a prediction id; poll the prediction endpoint until status flips to completed, then read the output URL from data.outputs[0]. Examples for Inworld 1.5 Max Text To Speech below.

HTTP example

```bash
# Submit the prediction ("text" is the only required field)
curl -X POST "https://api.wavespeed.ai/api/v3/inworld/inworld-1.5-max/text-to-speech" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY" \
  -d '{
    "text": "Hello from Inworld 1.5 Max.",
    "voice_id": "Alex",
    "speaking_rate": 1,
    "temperature": 1
}'

# Response includes a prediction id. Replace {request_id} with it and poll for the result:
curl -X GET "https://api.wavespeed.ai/api/v3/predictions/{request_id}/result" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY"

# When status is "completed", read the output from data.outputs[0].
```

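Node.js example

```javascript
// npm install wavespeed
const WaveSpeed = require('wavespeed');

const client = new WaveSpeed(); // reads WAVESPEED_API_KEY from env

// CommonJS modules do not support top-level await, so wrap the call in an async function.
async function main() {
  const result = await client.run("inworld/inworld-1.5-max/text-to-speech", {
    "text": "Hello from Inworld 1.5 Max.", // the only required field
    "voice_id": "Alex",
    "speaking_rate": 1,
    "temperature": 1
  });

  console.log(result.outputs[0]); // → URL of the generated output
}

main().catch(console.error);
```
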
Python example

```python
# pip install wavespeed
import wavespeed

output = wavespeed.run(
    "inworld/inworld-1.5-max/text-to-speech",
    {
        "text": "Hello from Inworld 1.5 Max.",  # the only required field
        "voice_id": "Alex",
        "speaking_rate": 1,
        "temperature": 1,
    },
)

print(output["outputs"][0])  # → URL of the generated output
```
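
For callers who use the raw REST endpoint instead of the SDK, the poll-until-completed step from the quick start can be sketched with the standard library alone. The result URL and the `data.status` / `data.outputs[0]` fields come from the quick start; the `"failed"` and `"processing"` status values, the `error` field, and the helper names `make_fetcher` and `wait_for_output` are assumptions made for illustration.

```python
import json
import os
import time
import urllib.request


def make_fetcher(request_id):
    """Return a callable that GETs the prediction result as parsed JSON."""
    url = f"https://api.wavespeed.ai/api/v3/predictions/{request_id}/result"
    headers = {"Authorization": f"Bearer {os.environ['WAVESPEED_API_KEY']}"}

    def fetch():
        req = urllib.request.Request(url, headers=headers)
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)

    return fetch


def wait_for_output(fetch_result, interval=1.0, timeout=60.0, sleep=time.sleep):
    """Poll until data.status is "completed", then return data.outputs[0]."""
    deadline = time.monotonic() + timeout
    while True:
        data = fetch_result()["data"]
        if data["status"] == "completed":
            return data["outputs"][0]
        if data["status"] == "failed":  # assumed name of the failure status
            raise RuntimeError(f"prediction failed: {data.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError("prediction did not complete in time")
        sleep(interval)


# Offline demonstration with canned responses instead of real HTTP calls.
fake = iter([
    {"data": {"status": "processing", "outputs": []}},  # assumed intermediate status
    {"data": {"status": "completed", "outputs": ["https://example.com/audio.mp3"]}},
])
print(wait_for_output(lambda: next(fake), sleep=lambda seconds: None))
```

With a real prediction id, the call would look like `wait_for_output(make_fetcher(request_id))`. Passing the fetch function in as an argument keeps the polling loop testable without network access.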

Inworld 1.5 Max Text To Speech API — Frequently asked questions

What is the Inworld 1.5 Max Text To Speech API?

Inworld 1.5 Max Text To Speech is an Inworld model for audio generation, exposed as a REST API on WaveSpeedAI. It provides text-to-speech synthesis with 65+ multilingual voices, adjustable speaking rate, and high-fidelity, natural-sounding audio output. You can call it programmatically or try it from the playground above.

How do I call the Inworld 1.5 Max Text To Speech API?

POST your input parameters to the model's REST endpoint (shown in the API tab of this playground) with your WaveSpeedAI API key in the Authorization header. Submission returns a prediction ID; poll the prediction endpoint until status flips to "completed", then read the output URL from the result. The playground generates a ready-to-paste code sample in Python, JavaScript, or cURL for whatever inputs you've set. Full request/response shape is documented at https://wavespeed.ai/docs/docs-api/inworld/inworld-inworld-1.5-max-text-to-speech.

How much does Inworld 1.5 Max Text To Speech cost per run?

Inworld 1.5 Max Text To Speech starts at $0.01 per run. That figure is the base price and covers up to 1,000 characters; the final charge scales with the length of your text at $0.01 per 1,000 characters, rounded up to the next 1,000. The exact cost for your current input is shown live next to the Generate button before you submit, and the actual per-call charge is recorded on the prediction afterwards.

What inputs does Inworld 1.5 Max Text To Speech accept?

Key inputs: `speaking_rate`, `temperature`, `text`, `voice_id`. The full JSON schema (types, defaults, allowed values) is rendered above the Generate button and mirrored in the API reference at https://wavespeed.ai/docs/docs-api/inworld/inworld-inworld-1.5-max-text-to-speech.

How do I get started with the Inworld 1.5 Max Text To Speech API?

Sign up for a free WaveSpeedAI account to claim starter credits, copy your API key from /accesskey, then call the endpoint shown in the API tab of the playground. The playground also auto-generates a code sample in Python, JavaScript, or cURL for the parameters you've set.

Can I use Inworld 1.5 Max Text To Speech outputs commercially?

Commercial usage rights depend on the model's license, set by its provider (Inworld). The license summary appears on the model card above; see WaveSpeedAI's Terms of Service for platform-level conditions.