Inworld 1.5 Mini | Text-To-Speech With 56+ Multilingual Voices

Inworld 1.5 Mini Text-to-Speech

Inworld 1.5 Mini is a lightweight, ultra-affordable text-to-speech model that converts written text into natural speech. It offers the same voice selection, speaking rate, and expressiveness controls as the Max model at half the cost. It is well suited to high-volume workflows, prototyping, and budget-conscious production.

Need higher quality? Try Inworld 1.5 Max Text-to-Speech

Why Choose This?

Ultra-low cost: Just $0.005 per 1,000 characters, the most affordable option for text-to-speech at scale.
Voice selection: Choose from a library of distinct voice identities to match your brand, character, or use case.
Speaking rate control: Adjust the speed of speech to suit narration, dialogue, announcements, or any delivery style.
Temperature control: Fine-tune expressiveness, with lower values for consistent delivery and higher values for more dynamic, varied speech.
Fast processing: Lightweight architecture delivers quick turnaround, ideal for real-time or high-volume pipelines.

Parameters

Parameter	Required	Description
text	Yes	The text content to convert to speech
voice_id	No	Voice preset to use (e.g., Hades)
speaking_rate	No	Speed of speech (default: 1)
temperature	No	Expressiveness level (default: 1)
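
As a rough illustration of how these parameters fit together, the sketch below assembles a request body in Python. The parameter names come from the table above; the helper name `build_tts_payload` and the JSON encoding are assumptions made for illustration, not the documented client, and the endpoint itself is deliberately left out.

```python
import json


def build_tts_payload(text, voice_id=None, speaking_rate=None, temperature=None):
    """Hypothetical helper: build a request body from the documented parameters."""
    # `text` is the only required field.
    if not isinstance(text, str) or not text.strip():
        raise ValueError("text is required and must be a non-empty string")

    payload = {"text": text}

    # Optional fields are left out when unset, on the assumption that the
    # service then applies its own defaults (speaking_rate 1, temperature 1).
    optional = {
        "voice_id": voice_id,
        "speaking_rate": speaking_rate,
        "temperature": temperature,
    }
    for name, value in optional.items():
        if value is not None:
            payload[name] = value
    return payload


# Assumed JSON encoding of the body; how it is submitted is not shown here.
body = json.dumps(build_tts_payload("Hello, world.", voice_id="Hades", speaking_rate=1))
```

Omitting unset fields keeps the body minimal and avoids hard-coding defaults on the client side.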

How to Use

Enter your text — type or paste the content you want converted to speech.
Select a voice — choose a voice preset from the voice_id dropdown.
Adjust speaking rate — slide to control how fast or slow the speech is delivered.
Adjust temperature — slide to control the expressiveness and variation in delivery.
Run — submit and download the generated audio.

Pricing

Characters	Cost
Up to 1,000	$0.005
Up to 2,000	$0.010
Up to 5,000	$0.025
Up to 10,000	$0.050

Billing Rules

Rate: $0.005 per 1,000 characters
Rounding: character count is rounded up to the next 1,000
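
The two rules above can be sketched as a small cost estimator. The function name `estimate_cost` is hypothetical, and the sketch assumes that rounding applies per request and that there is no separate minimum charge, neither of which is stated on this page.

```python
from decimal import Decimal

RATE_PER_BLOCK = Decimal("0.005")  # USD per 1,000 characters
BLOCK_SIZE = 1000


def estimate_cost(char_count):
    """Estimate the USD cost of one request from its character count."""
    if char_count < 0:
        raise ValueError("char_count must be non-negative")
    # Round the character count up to the next 1,000 via ceiling division.
    blocks = -(-char_count // BLOCK_SIZE)
    return RATE_PER_BLOCK * blocks


print(estimate_cost(1000))  # 0.005
print(estimate_cost(1001))  # 0.010
print(estimate_cost(5000))  # 0.025
```

`Decimal` is used instead of floats so the results match the pricing table exactly.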

Best Use Cases

High-Volume Production — Generate large batches of audio at minimal cost.
Prototyping & Testing — Quickly preview voiceovers before committing to final production.
Chatbots & Virtual Assistants — Add voice output to conversational AI at scale.
Content Accessibility — Convert written content to audio affordably for wider audiences.
Game & App Dialogue — Generate character voice lines for interactive experiences on a budget.

Pro Tips

Use Mini for drafting and iteration, then switch to Max for final production if higher quality is needed.
Keep speaking_rate around 1 for natural pacing; adjust lower for dramatic reads, higher for quick announcements.
Lower temperature gives more predictable, consistent output — great for automated systems.
Break long texts into logical paragraphs for better pacing and natural pauses.
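
The last tip can be sketched as a small pre-processing step. The helper name `split_paragraphs` is hypothetical, and treating a blank line as the paragraph boundary is an assumption; the page does not say whether paragraphs should go into one request or several.

```python
import re


def split_paragraphs(text):
    """Split text into paragraphs on blank lines, dropping empty chunks."""
    parts = re.split(r"\n\s*\n", text)
    return [part.strip() for part in parts if part.strip()]


script = "Welcome aboard.\n\n\nToday we cover two topics.\nLet's begin.\n"
for paragraph in split_paragraphs(script):
    print(repr(paragraph))
```

If each paragraph were submitted as its own request, the rounding rule would presumably apply to each one, so many short requests could cost more than one combined request.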

Notes

Text is the only required field.
Billing is based on character count, rounded up to the next 1,000.
For maximum voice quality, consider Inworld 1.5 Max.

Inworld 1.5 Mini delivers high-quality text-to-speech synthesis with 56+ multilingual voices, adjustable speaking rate, and natural-sounding audio output. It is available through a ready-to-use REST inference API with fast performance, no cold starts, and affordable pricing.
