Omnivoice Voice Clone

OmniVoice Voice Clone clones any voice from a short audio sample and generates natural speech in that voice. Upload a voice reference clip, provide the text you want spoken, and the model delivers high-quality cloned speech that matches the tone, style, and character of the original speaker.

Why Choose This?

High-fidelity voice cloning: Captures the unique tone, cadence, and character of any voice from a short reference clip.
Natural speech output: Generates fluid, human-sounding speech that closely matches the reference speaker's style.
Reference text support: Optionally provide the transcript of the reference audio to improve cloning accuracy.
Speed control: Adjust the playback speed of the generated speech to match your pacing needs.

Parameters

Parameter	Required	Description
text	Yes	The text you want the cloned voice to speak.
audio	Yes	Reference audio clip of the voice to clone (URL, file upload, or microphone recording).
reference_text	No	Transcript of the reference audio. Improves cloning accuracy when provided.
speed	No	Playback speed of the generated speech. Default: 1.
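
As a rough illustration of the table above, the following Python sketch assembles a request body and checks the two required fields before anything is sent. The helper name `build_payload` is hypothetical, and it assumes `audio` is passed as a URL string, as in the API examples further down; the accepted range for `speed` is not documented here, so it is not validated.

```python
def build_payload(text, audio, reference_text=None, speed=1):
    """Assemble a request body from the documented parameters.

    Hypothetical helper: only the field names and the default speed of 1
    come from the parameter table; the checks are illustrative.
    """
    if not text or not text.strip():
        raise ValueError("text is required")
    if not audio:
        raise ValueError("audio is required")

    payload = {"text": text, "audio": audio, "speed": speed}
    # reference_text is optional, so it is only sent when provided.
    if reference_text:
        payload["reference_text"] = reference_text
    return payload


if __name__ == "__main__":
    print(build_payload(
        "A clear example input",
        "https://example.com/reference.wav",  # placeholder URL
        reference_text="Transcript of the reference clip",
    ))
```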

How to Use

Enter your text — type what you want the cloned voice to say.
Upload the reference audio — provide a clear voice sample via URL, file upload, or microphone recording.
Add reference text (optional) — provide the transcript of the reference clip for better accuracy.
Set speed (optional) — adjust the speaking rate if needed.
Submit — generate and download your cloned voice audio.

Pricing

Under 100 characters: flat $0.005 per generation
100+ characters: $0.00005 per character (i.e. $0.005 per 100 characters)

Examples

Text Length	Cost
50 chars	$0.005
100 chars	$0.005
500 chars	$0.025
1000 chars	$0.050
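
The pricing rule behind these examples can be sketched in a few lines of Python. This is an estimate only: the function name `estimate_cost` is hypothetical, and how the platform counts characters (whitespace, Unicode) is not specified here, so the sketch takes a plain character count as input. `Decimal` is used to avoid floating-point rounding on small dollar amounts.

```python
from decimal import Decimal

FLAT_PRICE = Decimal("0.005")        # flat charge for texts under 100 characters
PER_CHAR_PRICE = Decimal("0.00005")  # per-character charge at 100 characters and up
THRESHOLD = 100


def estimate_cost(text_length):
    """Estimate the charge in USD for a text of the given character count."""
    if text_length < 0:
        raise ValueError("text_length must be non-negative")
    if text_length < THRESHOLD:
        return FLAT_PRICE
    return PER_CHAR_PRICE * text_length


if __name__ == "__main__":
    for length in (50, 100, 500, 1000):
        print(length, estimate_cost(length))
```

The exact charge for a given input is the one shown before submission; treat this only as a way to budget ahead of time.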

Best Use Cases

Content creation — Generate voiceovers in a consistent cloned voice for videos, podcasts, and social media.
Dubbing & localization — Clone a speaker's voice for use in translated or localized audio content.
Audiobook production — Produce narration in a specific voice without booking studio time.
Personal voice preservation — Clone and preserve a unique voice for future use.
Developer integrations — Embed voice cloning into apps, platforms, and automated speech workflows.

Pro Tips

Use a clear, high-quality reference audio clip with minimal background noise for the most accurate clone.
A reference clip of 6–30 seconds with natural, expressive speech produces the best results.
Providing reference_text significantly improves cloning accuracy — always include it if you know the transcript.
For long text outputs, break content into natural sentence chunks for more controlled pacing.
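
One way to follow the chunking tip is to split on sentence boundaries and group whole sentences up to a size limit. The sketch below is a minimal example under assumptions: the function name `split_into_chunks` and the 300-character default are arbitrary choices, not platform recommendations, and the sentence splitter is a simple regular expression that will not handle abbreviations well.

```python
import re


def split_into_chunks(text, max_chars=300):
    """Group whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept whole as its own chunk.
    """
    # Split after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for chunk in split_into_chunks("One. Two! Three? Four.", max_chars=10):
        print(repr(chunk))
```

Each chunk is then submitted as its own request. Because texts under 100 characters are billed at the flat $0.005, splitting into many very short chunks can cost more than fewer, longer ones.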

Notes

Both text and audio are required fields.
Pricing is based on text length: a flat $0.005 for under 100 characters, and $0.00005 per character (applied to the whole text) for 100 characters or more.
Ensure audio URLs are publicly accessible if using a link rather than a direct upload.
Please ensure your content complies with WaveSpeed AI's usage policies.

Omnivoice Voice Clone API — Quick start

Grab a WaveSpeedAI API key, then call POST https://api.wavespeed.ai/api/v3/wavespeed-ai/omnivoice/voice-clone with your input as JSON. The endpoint returns a prediction id. Start polling the result endpoint around every 2 seconds, increase the interval for long-running tasks, and stop on any terminal status. On completed, read output values from data.outputs. Examples for Omnivoice Voice Clone below.

HTTP example

set -euo pipefail

: "${WAVESPEED_API_KEY:?Set WAVESPEED_API_KEY}"

REQUEST_BODY=$(cat <<'JSON'
{
    "text": "A clear example input",
    "audio": "https://interactive-examples.mdn.mozilla.net/media/cc0-audio/t-rex-roar.mp3",
    "speed": 1
}
JSON
)

# 1. Submit the prediction.
SUBMIT_RESPONSE=$(curl --silent --show-error --fail-with-body \
  -X POST "https://api.wavespeed.ai/api/v3/wavespeed-ai/omnivoice/voice-clone" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY" \
  -d "$REQUEST_BODY")

TASK=$(printf '%s' "$SUBMIT_RESPONSE" | jq 'if has("data") then .data else . end')
PREDICTION_ID=$(printf '%s' "$TASK" | jq -r '.id')
if [ -z "$PREDICTION_ID" ] || [ "$PREDICTION_ID" = "null" ]; then
  printf 'Submission response did not contain a prediction id\n' >&2
  exit 1
fi
RESULT_URL=$(printf '%s' "$TASK" | jq -r '.urls.get // empty')
if [ -z "$RESULT_URL" ]; then
  RESULT_URL="https://api.wavespeed.ai/api/v3/predictions/$PREDICTION_ID/result"
fi

# 2. Poll until the prediction finishes.
while true; do
  RESPONSE=$(curl --silent --show-error --fail-with-body "$RESULT_URL" \
    -H "Authorization: Bearer $WAVESPEED_API_KEY")
  RESULT=$(printf '%s' "$RESPONSE" | jq 'if has("data") then .data else . end')
  STATUS=$(printf '%s' "$RESULT" | jq -r '.status')
  case "$STATUS" in
    completed) printf '%s\n' "$RESULT" | jq '.outputs'; break ;;
    failed|cancelled|timeout) printf '%s\n' "$RESULT" | jq . >&2; exit 1 ;;
    created|processing) sleep 2 ;;
    *) printf 'Unexpected status: %s\n' "$STATUS" >&2; exit 1 ;;
  esac
done

Node.js example

const submitUrl = "https://api.wavespeed.ai/api/v3/wavespeed-ai/omnivoice/voice-clone";
const apiKey = process.env.WAVESPEED_API_KEY;
if (!apiKey) throw new Error('Set WAVESPEED_API_KEY');

async function requestJson(url, options = {}) {
  const response = await fetch(url, options);
  if (!response.ok) throw new Error(await response.text());
  return response.json();
}

// 1. Submit the prediction.
const body = await requestJson(submitUrl, {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${apiKey}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    "text": "A clear example input",
    "audio": "https://interactive-examples.mdn.mozilla.net/media/cc0-audio/t-rex-roar.mp3",
    "speed": 1
  }),
});
const task = body.data ?? body;
if (!task.id) throw new Error("Submission response did not contain a prediction id");
const resultUrl = task.urls?.get ||
  `https://api.wavespeed.ai/api/v3/predictions/${task.id}/result`;

// 2. Poll until the prediction finishes.
while (true) {
  const resultBody = await requestJson(resultUrl, {
    headers: { "Authorization": `Bearer ${apiKey}` },
  });
  const result = resultBody.data ?? resultBody;
  if (result.status === "completed") {
    console.log(result.outputs);
    break;
  }
  if (["failed", "cancelled", "timeout"].includes(result.status)) throw new Error(JSON.stringify(result));
  if (!["created", "processing"].includes(result.status)) throw new Error("Unexpected status: " + result.status);
  await new Promise(resolve => setTimeout(resolve, 2000));
}

Python example

import json
import os
import time
from urllib.request import Request, urlopen

api_key = os.environ["WAVESPEED_API_KEY"]
headers = {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}
payload = {
    "text": "A clear example input",
    "audio": "https://interactive-examples.mdn.mozilla.net/media/cc0-audio/t-rex-roar.mp3",
    "speed": 1
}

def request_json(url, data=None):
    request = Request(url, data=data, headers=headers, method="POST" if data else "GET")
    with urlopen(request) as response:
        return json.load(response)

# 1. Submit the prediction.
body = request_json("https://api.wavespeed.ai/api/v3/wavespeed-ai/omnivoice/voice-clone", json.dumps(payload).encode())
task = body.get("data", body)
if not task.get("id"):
    raise RuntimeError("Submission response did not contain a prediction id")
result_url = task.get("urls", {}).get("get") or f"https://api.wavespeed.ai/api/v3/predictions/{task['id']}/result"

# 2. Poll until the prediction finishes.
while True:
    result_body = request_json(result_url)
    result = result_body.get("data", result_body)
    status = result.get("status")
    if status == "completed":
        print(result.get("outputs", []))
        break
    if status in {"failed", "cancelled", "timeout"}:
        raise RuntimeError(result)
    if status not in {"created", "processing"}:
        raise RuntimeError(f"Unexpected status: {status}")
    time.sleep(2)

Omnivoice Voice Clone API — Frequently asked questions

What is the Omnivoice Voice Clone API?

Omnivoice Voice Clone is a WaveSpeedAI model exposed as a REST API on WaveSpeedAI. It clones a voice from a short audio sample of 3-10 seconds and supports zero-shot voice cloning in 600+ languages. The REST inference API is ready to use, with no cold starts and affordable pricing. You can call it programmatically or try it from the playground above.

How do I call the Omnivoice Voice Clone API?

POST your input parameters to the model's REST endpoint (shown in the API tab of this playground) with your WaveSpeedAI API key in the Authorization header. Submission returns a prediction ID. Poll the result endpoint starting around every 2 seconds, increase the interval for long-running tasks, and stop on any terminal status. The playground generates production-oriented Python, JavaScript, and cURL examples with timeouts, transient-error handling, and safe GET retries. Full request/response shape is documented at https://wavespeed.ai/docs/docs-api/wavespeed-ai/omnivoice-voice-clone.
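
The "start around every 2 seconds, then increase the interval" advice can be sketched as a capped backoff schedule. This is a minimal sketch under assumptions: the growth factor, the 15-second cap, and the 300-second overall limit are illustrative values, and `fetch_status` is a hypothetical callable that performs the GET on the result endpoint and returns the unwrapped prediction dict. The terminal statuses are the ones handled in the examples above.

```python
import time

TERMINAL_STATUSES = {"completed", "failed", "cancelled", "timeout"}


def poll_intervals(initial=2.0, factor=1.5, maximum=15.0):
    """Yield wait times in seconds: start at initial, grow by factor, cap at maximum."""
    delay = initial
    while True:
        yield min(delay, maximum)
        delay = min(delay * factor, maximum)


def wait_for_result(fetch_status, sleep=time.sleep, max_wait=300.0):
    """Call fetch_status() until it reports a terminal status.

    Raises TimeoutError once the planned waiting time would exceed max_wait.
    """
    waited = 0.0
    for delay in poll_intervals():
        result = fetch_status()
        if result.get("status") in TERMINAL_STATUSES:
            return result
        if waited + delay > max_wait:
            raise TimeoutError(f"No terminal status after {waited:.0f} seconds")
        sleep(delay)
        waited += delay
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays; in normal use the default `time.sleep` applies.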

How much does Omnivoice Voice Clone cost per run?

Omnivoice Voice Clone starts at $0.005 per run. That figure is the base price; the final charge scales with the length of the input text (see Pricing), so longer text costs more than a short one. The exact cost for your current input is shown live next to the Generate button before you submit, and the actual per-call charge is recorded on the prediction afterwards.

What inputs does Omnivoice Voice Clone accept?

Key inputs: `audio`, `reference_text`, `speed`, `text`. The full JSON schema (types, defaults, allowed values) is rendered above the Generate button and mirrored in the API reference at https://wavespeed.ai/docs/docs-api/wavespeed-ai/omnivoice-voice-clone.

How long does Omnivoice Voice Clone take to generate?

Median end-to-end generation time on WaveSpeedAI is around 6 seconds per request, based on recent successful runs. Queue time varies with global demand; live status is visible in the prediction record.

Can I use Omnivoice Voice Clone outputs commercially?

Commercial usage rights depend on the model's license, set by its provider (WaveSpeedAI). The license summary appears on the model card above; see WaveSpeedAI's Terms of Service for platform-level conditions.
