Sync Lipsync 3 Avatar Image to Talking Video API

Sync Lipsync 3 Avatar

Sync Lipsync 3 Avatar turns a single still image into a talking character video driven by an input audio track. It is designed for portrait avatars, illustrated characters, and animated-frame references that need natural lip synchronization.

Why Choose This?

Image-to-video avatar generation - Animate a still image into a talking video using only an image and an audio file.
Audio-driven timing - The generated video follows the duration and speech timing of the input audio.
Works with multiple visual styles - Supports realistic portraits, illustrations, and animated character frames.
Simple production workflow - No prompt or extra tuning is required; provide the reference image and audio track.
Standard video output - The generated video is returned as a URL in the standard WaveSpeed prediction response.

Parameters

Parameter	Required	Description
image	Yes	Input image URL. Use a clear face image in JPEG, PNG, or WebP format.
audio	Yes	Input audio URL. The output video follows the audio duration.
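
As a rough illustration of the two parameters above, the following sketch builds the request body and rejects obviously malformed input before anything is sent. The helper name `build_payload` and the extension check are hypothetical local conveniences, not part of the API; the service itself may accept URLs that this heuristic rejects (for example, signed URLs without a file extension).

```python
from urllib.parse import urlparse

# Image formats named in the parameter table.
ALLOWED_IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def build_payload(image_url: str, audio_url: str) -> dict:
    """Return the JSON body for the request, raising ValueError on bad input.

    Hypothetical helper: these checks are local heuristics, not API rules.
    """
    for name, url in (("image", image_url), ("audio", audio_url)):
        parsed = urlparse(url)
        # Both inputs are passed as URLs, so require an http(s) scheme and a host.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"{name} must be an http(s) URL, got {url!r}")

    # Heuristic: infer the image format from the extension of the URL path.
    image_path = urlparse(image_url).path.lower()
    if not any(image_path.endswith(ext) for ext in ALLOWED_IMAGE_EXTENSIONS):
        raise ValueError(f"image should be JPEG, PNG, or WebP, got {image_url!r}")

    return {"image": image_url, "audio": audio_url}
```

The returned dictionary has the same shape as the JSON body used in the request examples on this page.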

How to Use

Upload an image - Provide a clear portrait or character image with a visible face.
Upload audio - Provide the speech or singing audio that should drive the lip-sync.
Submit - Generate the talking avatar video.
Download - Use the returned video URL in your editing or publishing workflow.

Pricing

Pricing is $8.00 per minute of input audio, billed proportionally by exact audio duration.

Audio Duration	Price
10s	$1.33
11s	$1.47
30s	$4.00
60s	$8.00
90s	$12.00
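
The table rows can be reproduced with a short calculation. This is a minimal sketch that assumes the charge is the per-minute rate prorated by seconds and rounded to the nearest cent; the rounding mode is an assumption chosen to match the table, and the function name `estimate_cost` is illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP

PRICE_PER_MINUTE = Decimal("8.00")  # USD per minute of input audio


def estimate_cost(audio_seconds) -> Decimal:
    """Estimate the charge in USD for an audio duration given in seconds."""
    seconds = Decimal(str(audio_seconds))
    if seconds < 0:
        raise ValueError("audio duration cannot be negative")
    cost = PRICE_PER_MINUTE * seconds / Decimal(60)
    # Assumed rounding: nearest cent, halves rounded up.
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


for seconds in (10, 11, 30, 60, 90):
    print(f"{seconds}s -> ${estimate_cost(seconds)}")
```

Under these assumptions the loop prints the same prices as the table, which also shows why trimming leading or trailing silence lowers the charge.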

Best Use Cases

Talking avatars - Create presenter or character videos from a single image.
Localized narration - Generate avatar videos from translated or re-recorded voice tracks.
Character dialogue - Animate illustrated or stylized character frames for dialogue scenes.
Marketing clips - Produce short spokesperson videos from static brand assets.
Education and demos - Create simple explainers, tutorials, or training clips without filming.

Pro Tips

Use a clear face image with visible mouth and eyes.
Avoid heavy occlusion, extreme angles, or very low-resolution images.
Use clean audio with minimal background noise for better lip-sync.
Trim silence before upload if you want tighter timing and lower cost.
Make sure image and audio URLs are publicly accessible.

Lipsync 3 Avatar API — Quick start

Get a WaveSpeedAI API key, then send a POST request to https://api.wavespeed.ai/api/v3/sync/lipsync-3/avatar with your input as JSON. The endpoint returns a prediction ID; poll the prediction result endpoint until status becomes completed, then read the output URL from data.outputs[0]. Examples in cURL, Node.js, and Python follow.

HTTP example

# Submit the prediction
curl -X POST "https://api.wavespeed.ai/api/v3/sync/lipsync-3/avatar" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY" \
  -d '{
    "image": "https://example.com/your-input.jpg",
    "audio": "https://example.com/your-audio.mp3"
}'

# The response includes a prediction id. Replace {request_id} with it and poll for the result:
curl -X GET "https://api.wavespeed.ai/api/v3/predictions/{request_id}/result" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY"

# When status is "completed", read the output from data.outputs[0].
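
The same submit-and-poll flow can be sketched in Python using only the standard library, for cases where the SDK is not available. The endpoint paths and `data.outputs[0]` come from the cURL example; the location of the prediction id (`data.id`), the status field (`data.status`), and the `"failed"` status name are assumptions about the response shape and should be checked against the API reference. The function names are illustrative.

```python
import json
import time
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"


def _request_json(method, url, api_key, body=None):
    """Send one authenticated JSON request and return the decoded response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(url, data=data, method=method)
    request.add_header("Authorization", f"Bearer {api_key}")
    if data is not None:
        request.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_output(request_id, api_key, fetch=_request_json,
                    interval=2.0, timeout=600.0):
    """Poll the result endpoint until the prediction completes or fails."""
    url = f"{API_BASE}/predictions/{request_id}/result"
    deadline = time.monotonic() + timeout
    while True:
        data = fetch("GET", url, api_key)["data"]
        status = data["status"]  # assumed field location
        if status == "completed":
            return data["outputs"][0]
        if status == "failed":  # assumed name of the failure status
            raise RuntimeError(f"prediction failed: {data.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"prediction {request_id} did not finish in time")
        time.sleep(interval)


def generate(image_url, audio_url, api_key, fetch=_request_json):
    """Submit a prediction and block until the output video URL is ready."""
    submitted = fetch("POST", f"{API_BASE}/sync/lipsync-3/avatar", api_key,
                      {"image": image_url, "audio": audio_url})
    request_id = submitted["data"]["id"]  # assumed field location
    return wait_for_output(request_id, api_key, fetch=fetch)
```

To use it, call `generate(image_url, audio_url, os.environ["WAVESPEED_API_KEY"])` after importing `os`. The `fetch` parameter exists only so that the HTTP layer can be swapped out, for example with a stub in tests.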

Node.js example

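// npm install wavespeed
const WaveSpeed = require('wavespeed');

const client = new WaveSpeed(); // reads WAVESPEED_API_KEY from env

// CommonJS modules do not support top-level await, so wrap the call in an async function.
async function main() {
  const result = await client.run("sync/lipsync-3/avatar", {
    image: "https://example.com/your-input.jpg",
    audio: "https://example.com/your-audio.mp3"
  });

  console.log(result.outputs[0]); // → URL of the generated output
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});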

Python example

# pip install wavespeed
import wavespeed

output = wavespeed.run(
    "sync/lipsync-3/avatar",
    {
        "image": "https://example.com/your-input.jpg",
        "audio": "https://example.com/your-audio.mp3",
    },
)

print(output["outputs"][0])  # → URL of the generated output

Lipsync 3 Avatar API — Frequently asked questions

What is the Lipsync 3 Avatar API?

Lipsync 3 Avatar is a Sync model for talking-avatar generation, exposed as a REST API on WaveSpeedAI. It turns a single still image and an input audio track into a lip-synced talking character video with natural mouth movement and facial animation. It is offered as a ready-to-use REST inference API with no cold starts. You can call it programmatically or try it from the playground.

How do I call the Lipsync 3 Avatar API?

POST your input parameters to the model's REST endpoint (shown in the API tab of this playground) with your WaveSpeedAI API key in the Authorization header. Submission returns a prediction ID; poll the prediction endpoint until status flips to "completed", then read the output URL from the result. The playground generates a ready-to-paste code sample in Python, JavaScript, or cURL for whatever inputs you've set. Full request/response shape is documented at https://wavespeed.ai/docs/docs-api/sync/sync-lipsync-3-avatar.

How much does Lipsync 3 Avatar cost per run?

Lipsync 3 Avatar costs $8.00 per minute of input audio, billed proportionally by exact audio duration, so a 30-second clip costs $4.00 and a 90-second clip costs $12.00. The exact cost for your current input is shown live next to the Generate button before you submit, and the actual per-call charge is recorded on the prediction afterwards.

What inputs does Lipsync 3 Avatar accept?

Key inputs: `image`, `audio`. The full JSON schema (types, defaults, allowed values) is rendered above the Generate button and mirrored in the API reference at https://wavespeed.ai/docs/docs-api/sync/sync-lipsync-3-avatar.

How do I get started with the Lipsync 3 Avatar API?

Sign up for a free WaveSpeedAI account to claim starter credits, copy your API key from /accesskey, then call the endpoint shown in the API tab of the playground. The playground also auto-generates a code sample in Python, JavaScript, or cURL for the parameters you've set.

Can I use Lipsync 3 Avatar outputs commercially?

Commercial usage rights depend on the model's license, set by its provider (Sync). The license summary appears on the model card above; see WaveSpeedAI's Terms of Service for platform-level conditions.
