Inworld 1.5 Max

inworld/1.5-max/text-to-speech

Inworld 1.5 Max delivers premium text-to-speech synthesis with 56+ multilingual voices, adjustable speaking rate, and high-fidelity, natural-sounding audio output. The model is available through a ready-to-use REST inference API, with no cold starts and affordable pricing.


This request costs $0.01 per run.

With $1, you can run this model about 100 times.


README

Inworld 1.5 Max Text-to-Speech

Inworld 1.5 Max is a high-quality text-to-speech model that converts written text into natural, expressive speech. Choose from a variety of voice presets, fine-tune speaking rate and expressiveness with simple controls, and generate professional-grade audio in seconds — ideal for IVR systems, voiceovers, content creation, and accessibility.

Why Choose This?

  • Natural-sounding voices: Multiple voice presets with realistic intonation, pacing, and emotion for lifelike speech output.

  • Voice selection: Choose from a library of distinct voice identities to match your brand, character, or use case.

  • Speaking rate control: Adjust the speed of speech to suit narration, dialogue, announcements, or any delivery style.

  • Temperature control: Fine-tune expressiveness. Lower values give consistent, predictable delivery; higher values give more dynamic, varied speech.

  • Ultra-low cost: Just $0.01 per 1,000 characters, affordable even at scale.

Parameters

Parameter      Required  Description
text           Yes       The text content to convert to speech
voice_id       No        Voice preset to use (e.g., Elizabeth)
speaking_rate  No        Speed of speech (default: 1)
temperature    No        Expressiveness level (default: 1)
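
The four parameters above can be collected into a request body along these lines. This is only a sketch: `build_tts_payload` is a hypothetical helper, and the choice to leave out `voice_id` when none is given is an assumption. The endpoint, authentication, and response format are not described here, so the HTTP call itself is left out.

```python
from typing import Optional


def build_tts_payload(
    text: str,
    voice_id: Optional[str] = None,
    speaking_rate: float = 1.0,
    temperature: float = 1.0,
) -> dict:
    """Assemble a request body from the four documented parameters.

    Hypothetical helper: the function name and the decision to omit an
    unset voice_id are assumptions, not part of the service's API.
    """
    # text is the only required field, so reject empty or blank input early.
    if not text or not text.strip():
        raise ValueError("text is required and must not be empty")

    payload = {
        "text": text,
        "speaking_rate": speaking_rate,  # documented default: 1
        "temperature": temperature,      # documented default: 1
    }
    # Only send voice_id when the caller picked one.
    if voice_id is not None:
        payload["voice_id"] = voice_id
    return payload


print(build_tts_payload("Thank you for calling.", voice_id="Elizabeth"))
```

Keeping the defaults in one place makes it easy to see which values a request overrides.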

How to Use

  1. Enter your text — type or paste the content you want converted to speech.
  2. Select a voice — choose a voice preset from the voice_id dropdown.
  3. Adjust speaking rate — slide to control how fast or slow the speech is delivered.
  4. Adjust temperature — slide to control the expressiveness and variation in delivery.
  5. Run — submit and download the generated audio.

Pricing

Characters     Cost
Up to 1,000    $0.01
Up to 2,000    $0.02
Up to 5,000    $0.05
Up to 10,000   $0.10

Billing Rules

  • Rate: $0.01 per 1,000 characters
  • Rounding: character count is rounded up to the next 1,000
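
The two billing rules can be sketched as a small estimator. `estimate_cost_cents` is a hypothetical helper, and it assumes that a "character" corresponds to one Python string element; how the service actually counts characters (for example, for emoji or combining marks) is not stated here. Working in whole cents avoids floating-point rounding.

```python
CENTS_PER_BLOCK = 1   # $0.01 per block
BLOCK_SIZE = 1000     # characters per billing block


def estimate_cost_cents(text: str) -> int:
    """Estimate the cost of one request in cents.

    The character count is rounded up to the next 1,000 and each
    1,000-character block is charged at $0.01.
    """
    # Ceiling division without floats: -(-n // d) rounds up.
    blocks = -(-len(text) // BLOCK_SIZE)
    return blocks * CENTS_PER_BLOCK


# 1,001 characters spill into a second block.
print(estimate_cost_cents("x" * 1001))  # prints 2
```

For example, a 4,500-character script rounds up to 5,000 characters and is estimated at 5 cents, matching the $0.05 row in the pricing table.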

Best Use Cases

  • IVR & Phone Systems — Generate professional call menus, hold messages, and automated responses.
  • Video Voiceovers — Add narration to marketing videos, tutorials, and presentations.
  • Content Creation — Convert blog posts, articles, or scripts into audio for podcasts and social media.
  • Accessibility — Provide audio versions of written content for visually impaired users.
  • Game & App Dialogue — Create character voices for interactive experiences and virtual assistants.

Pro Tips

  • Keep speaking_rate around 1 for natural-sounding narration; lower for dramatic reads, higher for fast announcements.
  • Use lower temperature for consistent, predictable voiceovers (e.g., IVR); higher temperature for more expressive character dialogue.
  • Break long texts into logical paragraphs for better pacing and natural pauses.
  • Test different voice_id options to find the best match for your brand or character.
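
One way to apply the paragraph tip before submitting a long script is to normalize the text so that each logical paragraph is separated by exactly one blank line. The sketch below uses a hypothetical helper, `split_paragraphs`; it assumes paragraphs in the source text are already separated by blank lines. Whether the model turns paragraph breaks into pauses is not confirmed here.

```python
import re
from typing import List


def split_paragraphs(text: str) -> List[str]:
    """Split text on blank lines and tidy whitespace inside each paragraph."""
    # A blank line is a newline, optional whitespace, then another newline.
    parts = re.split(r"\n\s*\n", text.strip())
    # Collapse hard line wraps and repeated spaces within a paragraph.
    return [" ".join(p.split()) for p in parts if p.strip()]


script = "Welcome to our store.\nWe open at nine.\n\n\nPress one for sales."
paragraphs = split_paragraphs(script)
# Rejoin with a single blank line between paragraphs.
clean_text = "\n\n".join(paragraphs)
print(paragraphs)
```

The cleaned string can then be used as the `text` value of a single request.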

Notes

  • Text is the only required field.
  • Billing is based on character count, rounded up to the next 1,000.
  • Very long texts may take slightly longer to process.