
Multilingual V2


ElevenLabs Multilingual V2 is a multilingual text-to-speech model that costs $0.1 per 1,000 characters. It is served through a ready-to-use REST inference API with no cold starts.

ElevenLabs — Multilingual V2 Text-to-Speech

Multilingual V2 converts written text into natural, expressive speech across multiple languages. It delivers clear pronunciation, smooth pacing, and lifelike tone—ideal for voiceovers, narration, learning content, product videos, and global customer support. See the list here.

Key Features

  • High naturalness with humanlike intonation and timing
  • Strong multilingual support and improved accent handling
  • Tunable delivery via similarity and stability
  • Speaker Boost for clearer English numerals, dates, and units

Pricing

  • $0.1 per 1,000 characters
  • Inputs shorter than 1,000 characters are billed as 1,000 characters.
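
The pricing rules above can be turned into a rough cost estimate. This is a minimal sketch, not the platform's billing logic: the function name `estimate_cost` is hypothetical, and it assumes that text beyond 1,000 characters is billed proportionally rather than rounded up to the next thousand, which the page does not state.

```python
from decimal import Decimal

PRICE_PER_1000_CHARS = Decimal("0.1")  # USD, from the Pricing section
MIN_BILLABLE_CHARS = 1000              # shorter inputs are billed as 1,000 characters


def estimate_cost(text: str) -> Decimal:
    """Estimate the charge in USD for one synthesis request.

    Assumption: characters beyond the 1,000-character minimum are billed
    proportionally. The source does not say how partial thousands are rounded.
    """
    billable_chars = max(len(text), MIN_BILLABLE_CHARS)
    return PRICE_PER_1000_CHARS * Decimal(billable_chars) / Decimal(1000)


if __name__ == "__main__":
    print(estimate_cost("Hello, world."))  # short input, billed at the minimum
    print(estimate_cost("x" * 2500))       # longer input, billed by length
```

The live price shown in the playground remains the authoritative figure; treat this estimate as a planning aid only.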

How to Use

  1. Enter your script in the text field.
  2. Choose a voice_id from the built-in catalog or your custom voices. See the voice list for options.
  3. Optional controls:
     • similarity: 0–1 (higher = closer to the base voice timbre)
     • stability: 0–1 (higher = more consistent delivery)
     • use_speaker_boost: improves English number and unit reading
  4. Click Run to synthesize and preview your audio.
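
For API use, the same steps amount to assembling a JSON payload. The sketch below checks the documented 0–1 ranges before anything is sent. The helper name `build_payload` is hypothetical, and the default values simply mirror the sample request on this page; they are not confirmed API defaults.

```python
def build_payload(
    text: str,
    voice_id: str,
    similarity: float = 1.0,         # assumption: mirrors the page's sample request
    stability: float = 0.5,          # assumption: mirrors the page's sample request
    use_speaker_boost: bool = True,  # assumption: mirrors the page's sample request
) -> dict:
    """Build the request body for elevenlabs/multilingual-v2.

    Raises ValueError when the script is empty or a control is outside 0-1.
    """
    if not text.strip():
        raise ValueError("text must not be empty")
    for name, value in (("similarity", similarity), ("stability", stability)):
        if not 0 <= value <= 1:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return {
        "text": text,
        "voice_id": voice_id,
        "similarity": similarity,
        "stability": stability,
        "use_speaker_boost": use_speaker_boost,
    }


if __name__ == "__main__":
    print(build_payload("The invoice total is $1,250.75.", "Alice"))
```

Validating locally catches out-of-range values before a request is submitted. It cannot confirm that a voice_id exists; only the service can do that.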

Notes

  • Use clear punctuation and split very long text into shorter segments for the most stable prosody.
  • voice_id must be valid; if you see a voice-ID error, pick one from the official list linked above.
  • Speaker Boost is especially helpful for financial, time, and measurement reads in English.
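
One way to follow the first note is to split a long script at sentence boundaries before synthesis. The helper below is an illustrative sketch: the name `split_script` and the 1,000-character default are assumptions, not limits documented for the model.

```python
import re


def split_script(text: str, max_chars: int = 1000) -> list[str]:
    """Split text into segments of at most max_chars, breaking after ., ! or ?.

    A single sentence longer than max_chars is kept whole rather than cut
    mid-sentence, so a segment can exceed the limit in that case.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    segments: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars or not current:
            current = candidate          # still fits, or first sentence of a segment
        else:
            segments.append(current)     # close the full segment
            current = sentence           # start a new one
    if current:
        segments.append(current)
    return segments


if __name__ == "__main__":
    for part in split_script("First sentence. Second sentence! Third one?", max_chars=32):
        print(repr(part))
```

Because inputs under 1,000 characters are billed as 1,000 characters, very small segments raise the total cost. Segments close to that size are likely the better trade-off.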

Multilingual v2 API — Quick start

Grab a WaveSpeedAI API key, then call POST https://api.wavespeed.ai/api/v3/elevenlabs/multilingual-v2 with your input as JSON. The endpoint returns a prediction id; poll the prediction endpoint until status flips to completed, then read the output URL from data.outputs[0]. Examples for Multilingual v2 below.

HTTP example
# Submit the prediction
curl -X POST "https://api.wavespeed.ai/api/v3/elevenlabs/multilingual-v2" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY" \
  -d '{
    "text": "Hello from WaveSpeedAI.",
    "voice_id": "Alice",
    "similarity": 1,
    "stability": 0.5,
    "use_speaker_boost": true
}'

# The response includes a prediction id. Substitute it for {request_id} and poll for the result:
curl -X GET "https://api.wavespeed.ai/api/v3/predictions/{request_id}/result" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY"

# When status is "completed", read the output from data.outputs[0].
Node.js example
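// npm install wavespeed
const WaveSpeed = require('wavespeed');

const client = new WaveSpeed(); // reads WAVESPEED_API_KEY from env

// CommonJS modules do not allow top-level await, so wrap the call in an async function.
async function main() {
  const result = await client.run("elevenlabs/multilingual-v2", {
    "text": "Hello from WaveSpeedAI.", // the script to synthesize
    "voice_id": "Alice",
    "similarity": 1,
    "stability": 0.5,
    "use_speaker_boost": true
  });

  console.log(result.outputs[0]); // → URL of the generated output
}

main().catch(console.error);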
Python example
# pip install wavespeed
import wavespeed

output = wavespeed.run(
    "elevenlabs/multilingual-v2",
    {
        "text": "Hello from WaveSpeedAI.",  # the script to synthesize
        "voice_id": "Alice",
        "similarity": 1,
        "stability": 0.5,
        "use_speaker_boost": True,
    },
)

print(output["outputs"][0])  # → URL of the generated output

Multilingual v2 API — Frequently asked questions

What is the Multilingual v2 API?

Multilingual v2 is an ElevenLabs text-to-speech model, exposed as a REST API on WaveSpeedAI. It costs $0.1 per 1,000 characters and runs with no cold starts. You can call it programmatically or try it from the playground above.

How do I call the Multilingual v2 API?

POST your input parameters to the model's REST endpoint (shown in the API tab of this playground) with your WaveSpeedAI API key in the Authorization header. Submission returns a prediction ID; poll the prediction endpoint until status flips to "completed", then read the output URL from the result. The playground generates a ready-to-paste code sample in Python, JavaScript, or cURL for whatever inputs you've set. Full request/response shape is documented at https://wavespeed.ai/docs/docs-api/elevenlabs/elevenlabs-multilingual-v2.

How much does Multilingual v2 cost per run?

Multilingual v2 starts at $0.10 per run. That base price covers inputs of up to 1,000 characters; longer text is charged at $0.1 per 1,000 characters, so the final charge scales with input length. The exact cost for your current input is shown live next to the Generate button before you submit, and the actual per-call charge is recorded on the prediction afterwards.

What inputs does Multilingual v2 accept?

Key inputs: `similarity`, `stability`, `text`, `use_speaker_boost`, `voice_id`. The full JSON schema (types, defaults, allowed values) is rendered above the Generate button and mirrored in the API reference at https://wavespeed.ai/docs/docs-api/elevenlabs/elevenlabs-multilingual-v2.

How long does Multilingual v2 take to generate?

Average end-to-end generation time on WaveSpeedAI is around 6 seconds per request, measured across recent runs. Queue time scales with global demand; live status is visible in the prediction record.

Can I use Multilingual v2 outputs commercially?

Commercial usage rights depend on the model's license, set by its provider (ElevenLabs). The license summary appears on the model card above; see WaveSpeedAI's Terms of Service for platform-level conditions.