InfiniteTalk API
WaveSpeedAI InfiniteTalk — drive any character image with an audio file to produce a lip-synced, multi-shot talking video. No avatar library, no per-character training.
About the InfiniteTalk API
What InfiniteTalk does, how it fits in the WaveSpeedAI model lineup, and why teams reach for it.
InfiniteTalk is a video generation model from WaveSpeedAI, available through the WaveSpeedAI REST API. It drives any character image with an audio file to produce a lip-synced, multi-shot talking video, with no avatar library and no per-character training.
The InfiniteTalk family on WaveSpeedAI ships 8 REST endpoints covering digital-human workflows. Each variant carries its own pricing, parameters, and example outputs. Pick the one that matches your input modality and production constraints, or call several with the same API key to compose multi-step pipelines.
Run InfiniteTalk through the same API key, billing account, and rate-limit envelope you use for the other 1,000+ AI models on WaveSpeedAI. No separate vendor setup, no per-provider SDKs, no per-vendor rate-limit envelopes — one integration covers everything from text-to-image and text-to-video through audio synthesis, 3D generation, upscaling, and editing.
All InfiniteTalk API endpoints
8 endpoints available now on WaveSpeedAI — pick the variant that matches your workflow.

Video To Video Multi (Fast)
InfiniteTalk Fast Video-to-Video Multi converts a video and two audio inputs into multi-character talking or singing videos. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

Video To Video (Fast)
Audio-driven InfiniteTalk Fast turns one video plus audio into realistic talking or singing videos with lip-sync.

Video To Video Multi
InfiniteTalk Video-to-Video Multi converts a video and two audio inputs into multi-character talking or singing videos at up to 720p.

Multi (Fast)
InfiniteTalk Fast Multi converts a single image and two audio inputs into multi-character talking or singing videos.

Multi
InfiniteTalk Multi converts a single image and two audio inputs into multi-character talking or singing videos at up to 720p.

InfiniteTalk (Fast)
InfiniteTalk Fast converts one photo plus audio into audio-driven talking or singing avatar videos (image-to-video) of up to 10 minutes.

Video To Video
Audio-driven InfiniteTalk turns one video plus audio into realistic talking or singing videos with lip-sync in 480p or 720p.

InfiniteTalk
InfiniteTalk converts one photo plus audio into audio-driven talking or singing avatar videos (image-to-video) of up to 10 minutes. The 720p tier is priced at $0.30 per 5 seconds.
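The eight variants differ along three axes: image or video input, one or two audio tracks, and the standard or fast tier. The following is a minimal sketch of a lookup that maps those choices to the endpoint paths listed in the pricing table and FAQ. The function name `pick_endpoint` and its arguments are hypothetical helpers for illustration, not part of any WaveSpeedAI SDK.

```python
def pick_endpoint(source: str, multi: bool = False, fast: bool = False) -> str:
    """Return the InfiniteTalk model path for a given input combination.

    source: "image" (photo + audio) or "video" (video + audio).
    multi:  True for the two-audio, multi-character variants.
    fast:   True for the "fast" tier.
    """
    if source not in ("image", "video"):
        raise ValueError("source must be 'image' or 'video'")

    base = "wavespeed-ai/infinitetalk-fast" if fast else "wavespeed-ai/infinitetalk"
    if source == "video":
        suffix = "/video-to-video-multi" if multi else "/video-to-video"
    else:
        suffix = "/multi" if multi else ""
    return base + suffix


print(pick_endpoint("image"))                         # wavespeed-ai/infinitetalk
print(pick_endpoint("video", multi=True, fast=True))  # wavespeed-ai/infinitetalk-fast/video-to-video-multi
```

Keeping the choice in one function means a pipeline can switch tiers or input types without hard-coding paths in several places.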
How to use the InfiniteTalk API
Four steps from signup to a finished generation. Full Python, Node.js, and cURL examples are in the API section below.
1. Get an API key
Sign up for a WaveSpeedAI account and copy your API key from the dashboard. New accounts come with free starter credits, enough to run the playground a few dozen times before billing kicks in.
2. Submit a prediction
POST your input as JSON to https://api.wavespeed.ai/api/v3/wavespeed-ai/infinitetalk. The endpoint returns a prediction id immediately; generations are asynchronous, so you don't hold a connection open during inference.
3. Poll for completion
GET https://api.wavespeed.ai/api/v3/predictions/{request_id}/result every 1-2 seconds, substituting the prediction id from step 2 for {request_id}. The response includes a status field; keep polling until it changes from "queued" or "processing" to "completed".
4. Read the output URL
Once status is "completed", read the URL from data.outputs[0]. The URL points to your generated video on the WaveSpeedAI CDN.
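The submit, poll, and read steps can be sketched in plain Python with only the standard library. This is an illustration under stated assumptions rather than the official client: the helper names `http_json` and `run_prediction` are hypothetical, and the response fields `data.id` and `data.status` are assumed from the description above (only `data.outputs[0]` is named explicitly). The `send` argument exists so the HTTP layer can be swapped out in tests.

```python
import json
import time
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"


def http_json(method, url, api_key, body=None):
    """Send one JSON request with a Bearer token and return the decoded response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(url, data=data, method=method)
    request.add_header("Authorization", f"Bearer {api_key}")
    request.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def run_prediction(model, payload, api_key, send=http_json,
                   poll_interval=1.5, timeout=600.0):
    """Submit a prediction, poll until it finishes, and return the first output URL."""
    submitted = send("POST", f"{API_BASE}/{model}", api_key, payload)
    # Assumption: the prediction id is returned as data.id.
    prediction_id = submitted["data"]["id"]

    result_url = f"{API_BASE}/predictions/{prediction_id}/result"
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        data = send("GET", result_url, api_key)["data"]
        status = data["status"]  # Assumption: status lives next to outputs under data.
        if status == "completed":
            return data["outputs"][0]
        if status not in ("queued", "processing"):
            # Assumption: any other status (for example "failed") is terminal.
            raise RuntimeError(f"prediction ended with status {status!r}")
        time.sleep(poll_interval)
    raise TimeoutError(f"prediction {prediction_id} did not complete in {timeout}s")
```

Calling `run_prediction("wavespeed-ai/infinitetalk", your_input, api_key)` would then block until the video URL is available. The timeout guards against polling forever if a job stalls.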
What you can build with InfiniteTalk
Common workflows developers and creators use the InfiniteTalk API for.
Talking-head video from audio
Upload a character image + an audio track — get a lip-synced video of that character delivering the audio. Works for real photos, illustrations, and AI-generated characters.
Multilingual narrator videos
Localize the same character across multiple languages — same brand mascot, different voiceovers, consistent visual identity.
Podcast-to-video conversions
Turn podcast episodes into video with a host character — significantly cheaper than recorded video production and easier to update than re-recording.
Localized voiceover videos
For global content teams — keep the visual asset, swap the voiceover, and let InfiniteTalk re-sync lips to the new language.
Brand mascot videos
Animate any branded character (illustrated mascot, custom avatar, AI-generated character) speaking marketing copy — no per-character avatar training required.
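Each of these workflows comes down to pairing one character image with one audio track. Below is a hedged sketch of building that request body. The field names `image`, `audio`, and `resolution` are assumptions (confirm them on the endpoint's playground page), as is passing inputs as public URLs. The 480p and 720p values come from the endpoint descriptions, and `build_talking_head_input` is a hypothetical helper.

```python
import json
from urllib.parse import urlparse

ALLOWED_RESOLUTIONS = ("480p", "720p")  # resolutions named in the endpoint descriptions


def build_talking_head_input(image_url: str, audio_url: str, resolution: str = "480p") -> dict:
    """Validate the inputs and return a request body for an image + audio run.

    The keys "image", "audio", and "resolution" are assumed field names.
    """
    for label, value in (("image_url", image_url), ("audio_url", audio_url)):
        parsed = urlparse(value)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"{label} must be an http(s) URL, got {value!r}")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {ALLOWED_RESOLUTIONS}")
    return {"image": image_url, "audio": audio_url, "resolution": resolution}


body = build_talking_head_input(
    "https://example.com/mascot.png",
    "https://example.com/voiceover.mp3",
    resolution="720p",
)
print(json.dumps(body, indent=2))
```

For localization, the same image URL can be reused while only the audio URL changes per language.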
Tips for prompting InfiniteTalk
Practical advice for getting better outputs from InfiniteTalk — drawn from the patterns that work across video models in production pipelines.
Be specific about camera moves
Mention concrete cinematography vocabulary — orbit, dolly-in, push-in, pan-left, crane shot, handheld follow. Generic prompts produce static or arbitrary camera choices; named camera moves map directly to motion intent in the model's training data and dramatically improve shot quality.
Anchor character identity with reference images
If your prompt depends on a specific person, character, or product, upload a reference image alongside the prompt. Without a reference, identity drifts across frames and across shots — the same character ends up looking like a slightly different person each generation.
Describe lighting and time of day
Lighting cues like 'golden hour, soft warm directional light' or 'overcast diffused light, slate-grey sky' improve quality and consistency far more than vague quality modifiers. Lighting is one of the strongest priors the model conditions on.
Use negative prompts to suppress common failure modes
Useful negatives for video: 'frame flicker, motion blur, watermark, text artifacts, distorted hands, low resolution, jpeg compression'. Negative prompts cost nothing and noticeably reduce the rate of generations you'd otherwise re-roll.
Pick the shortest duration that captures your beat
Most prompts work best at 5-8 seconds. Longer clips amplify temporal inconsistencies (subject morphing, environment drift). If you need a 20-second sequence, generate three 6-8 second clips and edit them together — quality stays higher than one long generation.
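The arithmetic behind "three 6-8 second clips" can be written down as a small planner. This is only an illustrative sketch: `plan_clips` is a hypothetical helper, and the 8-second ceiling is taken from the upper end of the range suggested above.

```python
import math


def plan_clips(total_seconds: float, max_clip: float = 8.0) -> list[float]:
    """Split a target duration into the fewest equal clips no longer than max_clip."""
    if total_seconds <= 0 or max_clip <= 0:
        raise ValueError("durations must be positive")
    count = math.ceil(total_seconds / max_clip)
    return [round(total_seconds / count, 2)] * count


print(plan_clips(20))  # [6.67, 6.67, 6.67]
print(plan_clips(7))   # [7.0]
```

Equal-length clips keep every segment inside the recommended range instead of leaving a short leftover at the end.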
Match aspect ratio to platform up front
9:16 for TikTok / Reels / Shorts, 16:9 for landscape feeds and YouTube, 1:1 for post grids. Models train slightly differently per aspect ratio — cropping a 16:9 to 9:16 after the fact loses both fidelity and the composition the model intended.
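To put a number on the cropping loss, the sketch below computes how much of a frame survives a crop to a different aspect ratio. It is plain geometry with no API involved, and `retained_after_crop` is a hypothetical helper name.

```python
from fractions import Fraction


def retained_after_crop(src_w: int, src_h: int, dst_w: int, dst_h: int) -> Fraction:
    """Fraction of the source frame's area left after cropping to dst_w:dst_h."""
    src = Fraction(src_w, src_h)
    dst = Fraction(dst_w, dst_h)
    # Cropping to a narrower ratio trims width; cropping to a wider ratio trims height.
    return dst / src if dst < src else src / dst


kept = retained_after_crop(16, 9, 9, 16)
print(kept, f"= {float(kept):.1%}")  # 81/256 = 31.6%
```

Cropping a 16:9 render to 9:16 keeps under a third of the pixels, which is why choosing the target ratio before generating matters.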
InfiniteTalk API pricing
Pricing is per-output. The final charge scales with the parameters you set in each variant's playground (resolution, duration, output count, references).
| Endpoint | Type | Starting price |
|---|---|---|
| wavespeed-ai/infinitetalk-fast/video-to-video-multi | digital-human | $0.075 |
| wavespeed-ai/infinitetalk-fast/video-to-video | digital-human | $0.075 |
| wavespeed-ai/infinitetalk/video-to-video-multi | digital-human | $0.15 |
| wavespeed-ai/infinitetalk-fast/multi | digital-human | $0.075 |
| wavespeed-ai/infinitetalk/multi | digital-human | $0.15 |
| wavespeed-ai/infinitetalk-fast | digital-human | $0.075 |
| wavespeed-ai/infinitetalk/video-to-video | digital-human | $0.15 |
| wavespeed-ai/infinitetalk | digital-human | $0.15 |
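Because the table lists starting prices only, it can give a floor for a batch but not the final bill. A minimal sketch of that lower-bound estimate follows; it assumes the starting price is the minimum charge per run, and `minimum_spend` is a hypothetical helper. Actual charges scale with resolution, duration, and the other parameters mentioned above.

```python
from decimal import Decimal

# Starting prices in USD, copied from the pricing table.
STARTING_PRICE = {
    "wavespeed-ai/infinitetalk-fast/video-to-video-multi": Decimal("0.075"),
    "wavespeed-ai/infinitetalk-fast/video-to-video": Decimal("0.075"),
    "wavespeed-ai/infinitetalk/video-to-video-multi": Decimal("0.15"),
    "wavespeed-ai/infinitetalk-fast/multi": Decimal("0.075"),
    "wavespeed-ai/infinitetalk/multi": Decimal("0.15"),
    "wavespeed-ai/infinitetalk-fast": Decimal("0.075"),
    "wavespeed-ai/infinitetalk/video-to-video": Decimal("0.15"),
    "wavespeed-ai/infinitetalk": Decimal("0.15"),
}


def minimum_spend(runs: dict[str, int]) -> Decimal:
    """Lower-bound cost in USD for a batch given as {endpoint: number_of_runs}."""
    return sum(
        (STARTING_PRICE[endpoint] * count for endpoint, count in runs.items()),
        Decimal("0"),
    )


batch = {"wavespeed-ai/infinitetalk-fast": 40, "wavespeed-ai/infinitetalk": 10}
print(minimum_spend(batch))  # 4.500
```

`Decimal` is used instead of floats so that currency sums stay exact.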
Call the InfiniteTalk API
Sign up for an API key at wavespeed.ai/accesskey, then submit a prediction via REST. The playground generates ready-to-paste samples for any combination of inputs.
HTTP example
# 1. Submit a prediction (replace {} with your input JSON)
curl -X POST "https://api.wavespeed.ai/api/v3/wavespeed-ai/infinitetalk" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY" \
  -d '{}'
# 2. Poll the result until status = "completed" (replace {request_id} with the returned prediction id)
curl -X GET "https://api.wavespeed.ai/api/v3/predictions/{request_id}/result" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY"
# 3. Read the output URL from data.outputs[0].
Node.js example
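// npm install wavespeed
const WaveSpeed = require('wavespeed');
// await is not allowed at the top level of a CommonJS module, so wrap the call.
async function main() {
  const client = new WaveSpeed(); // reads WAVESPEED_API_KEY
  const result = await client.run("wavespeed-ai/infinitetalk", {}); // {} = your input fields
  console.log(result.outputs[0]); // → URL of the generated output
}
main().catch(console.error);
Python example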
# pip install wavespeed
import wavespeed
output = wavespeed.run(
    "wavespeed-ai/infinitetalk",
    {},  # your input fields
)
print(output["outputs"][0])  # → URL of the generated output
InfiniteTalk vs alternatives
When to pick InfiniteTalk over similar models on WaveSpeedAI.
InfiniteTalk vs stock avatar libraries (HeyGen-style)
Stock-avatar tools limit you to their library of pre-trained avatars. InfiniteTalk accepts any character image — your own brand mascot, an AI-generated character, an illustrated host — without per-character setup or training.
InfiniteTalk vs Wan 2.2 Speech-to-Video
Both are audio-driven. Speech-to-Video is part of the broader Wan 2.2 toolkit. InfiniteTalk is specialized for lip-sync quality, multi-shot composition, and identity preservation across longer audio.
InfiniteTalk vs ElevenLabs voice tools
ElevenLabs is voice-only: it generates or clones audio. InfiniteTalk is the video layer on top: pair an ElevenLabs voiceover with a character image to get a full lip-synced video.
InfiniteTalk API — Frequently asked questions
Pricing, license, integration — common questions about running InfiniteTalk on WaveSpeedAI.
What is the InfiniteTalk API?
InfiniteTalk is a WaveSpeedAI video generation model exposed as a REST API. It drives any character image with an audio file to produce a lip-synced, multi-shot talking video, with no avatar library and no per-character training. You can call it programmatically or try it in the playground.
How do I call the InfiniteTalk API?
Sign up for a WaveSpeedAI account, copy your API key from wavespeed.ai/accesskey, then POST to https://api.wavespeed.ai/api/v3/wavespeed-ai/infinitetalk with your input as JSON. The endpoint returns a prediction id; poll https://api.wavespeed.ai/api/v3/predictions/{request_id}/result with that id until status changes to "completed", then read the output URL from data.outputs[0]. Full Python / Node.js / cURL examples are above.
How much does the InfiniteTalk API cost?
InfiniteTalk starts at $0.075 per run. The exact cost scales with the parameters you set (resolution, duration, output count, references). The live cost preview next to the Generate button in the playground shows the exact price for your current input.
Which InfiniteTalk variants are available?
WaveSpeedAI hosts 8 InfiniteTalk endpoints: wavespeed-ai/infinitetalk-fast/video-to-video-multi, wavespeed-ai/infinitetalk-fast/video-to-video, wavespeed-ai/infinitetalk/video-to-video-multi, wavespeed-ai/infinitetalk-fast/multi, wavespeed-ai/infinitetalk/multi, wavespeed-ai/infinitetalk-fast, wavespeed-ai/infinitetalk/video-to-video, wavespeed-ai/infinitetalk. Each variant has its own playground page and pricing.
Can I use InfiniteTalk outputs commercially?
Commercial usage rights follow the WaveSpeedAI model license. Most WaveSpeedAI models permit commercial output use; see each model's playground page for the specific license summary, and WaveSpeedAI's Terms of Service for platform-level conditions.
Why use InfiniteTalk on WaveSpeedAI instead of going direct?
InfiniteTalk is a first-party WaveSpeedAI model, so the WaveSpeedAI API is the direct route. The same API key and billing account also cover 1,000+ other AI models from other providers, with no per-vendor SDK setup, no separate rate-limit envelopes, and no per-vendor integration rewrites.
About WaveSpeedAI
The team behind InfiniteTalk and the broader WaveSpeedAI model lineup.
WaveSpeedAI runs an inference platform that hosts 1,000+ AI models from every major provider — ByteDance, Google, OpenAI, Alibaba, Kuaishou, ElevenLabs, and dozens of independent labs — behind one API key, one billing account, and one rate-limit envelope. WaveSpeedAI also ships first-party models (Image / Video Upscalers, Watermark Removers, Animate, InfiniteTalk) tuned for production pipelines.
Start building with InfiniteTalk on WaveSpeedAI
Free starter credits on signup. One API key across 1,000+ AI models from WaveSpeedAI and other providers.