Sync Lipsync 3 Avatar Image to Talking Video API

Sync Lipsync 3 Avatar turns a single still image and an input audio track into a lip-synced talking character video, with natural mouth movement, facial animation, and stable avatar performance. It is served as a ready-to-use REST inference API with no cold starts and affordable pricing.

README

Sync Lipsync 3 Avatar

Sync Lipsync 3 Avatar turns a single still image into a talking character video driven by an input audio track. It is designed for portrait avatars, illustrated characters, and animated-frame references that need natural lip synchronization.

Why Choose This?

  • Image-to-video avatar generation
    Animate a still image into a talking video using only an image and an audio file.

  • Audio-driven timing
    The generated video follows the duration and speech timing of the input audio.

  • Works with multiple visual styles
    Supports realistic portraits, illustrations, and animated character frames.

  • Simple production workflow
    No prompt or extra tuning is required; provide the reference image and audio track.

  • Standard video output
    The generated video is returned as a URL in the standard WaveSpeed prediction response.

Parameters

| Parameter | Required | Description |
| --- | --- | --- |
| image | Yes | Input image URL. Use a clear face image in JPEG, PNG, or WebP format. |
| audio | Yes | Input audio URL. The output video follows the audio duration. |
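
The two parameters above can be checked before a request is sent. The following is a minimal sketch, not part of any official SDK: `build_payload` and `ALLOWED_IMAGE_EXTENSIONS` are hypothetical names, and judging the image format by the URL's file extension is an assumption that will reject valid URLs without an extension (for example, some signed URLs).

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions matching the formats listed in the parameter table
# (JPEG, PNG, WebP). Checking by extension is a heuristic assumption.
ALLOWED_IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def build_payload(image_url, audio_url):
    """Return the JSON body for the request, or raise ValueError."""
    for name, url in (("image", image_url), ("audio", audio_url)):
        parsed = urlparse(url)
        # Both inputs are passed as URLs, so require http(s) and a host.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"{name} must be an http(s) URL, got {url!r}")

    extension = PurePosixPath(urlparse(image_url).path).suffix.lower()
    if extension not in ALLOWED_IMAGE_EXTENSIONS:
        raise ValueError(f"image should be JPEG, PNG, or WebP, got {extension!r}")

    return {"image": image_url, "audio": audio_url}
```

Calling `build_payload("https://example.com/your-input.jpg", "https://example.com/your-audio.mp3")` returns a dictionary with the `image` and `audio` keys, ready to be serialized as the request body.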

How to Use

  1. Upload an image - Provide a clear portrait or character image with a visible face.
  2. Upload audio - Provide the speech or singing audio that should drive the lip-sync.
  3. Submit - Generate the talking avatar video.
  4. Download - Use the returned video URL in your editing or publishing workflow.

Pricing

Pricing is $8.00 per minute of input audio, billed proportionally by exact audio duration.

| Audio Duration | Price |
| --- | --- |
| 10s | $1.33 |
| 11s | $1.47 |
| 30s | $4.00 |
| 60s | $8.00 |
| 90s | $12.00 |
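
The proportional billing rule can be reproduced with a few lines of arithmetic. This is a sketch for estimating cost only: `estimate_cost` is a hypothetical helper, and rounding half up to the nearest cent is an assumption that happens to match the listed prices, not a documented billing rule.

```python
from decimal import Decimal, ROUND_HALF_UP

# Rate stated in the Pricing section: $8.00 per minute of input audio.
PRICE_PER_MINUTE = Decimal("8.00")


def estimate_cost(audio_seconds):
    """Estimate the charge in dollars for a given audio duration in seconds."""
    seconds = Decimal(str(audio_seconds))
    if seconds < 0:
        raise ValueError("audio duration cannot be negative")
    cost = PRICE_PER_MINUTE * seconds / Decimal(60)
    # Assumed rounding mode: half up, to whole cents.
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


for seconds in (10, 11, 30, 60, 90):
    print(f"{seconds}s -> ${estimate_cost(seconds)}")
```

For the authoritative figure, rely on the charge recorded on the prediction rather than this estimate.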

Best Use Cases

  • Talking avatars - Create presenter or character videos from a single image.
  • Localized narration - Generate avatar videos from translated or re-recorded voice tracks.
  • Character dialogue - Animate illustrated or stylized character frames for dialogue scenes.
  • Marketing clips - Produce short spokesperson videos from static brand assets.
  • Education and demos - Create simple explainers, tutorials, or training clips without filming.

Pro Tips

  • Use a clear face image with visible mouth and eyes.
  • Avoid heavy occlusion, extreme angles, or very low-resolution images.
  • Use clean audio with minimal background noise for better lip-sync.
  • Trim silence before upload if you want tighter timing and lower cost.
  • Make sure image and audio URLs are publicly accessible.
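
To see how much silence could be trimmed, the leading and trailing quiet portions of a file can be measured locally. The sketch below uses only the Python standard library and is limited to 16-bit PCM WAV input; `silence_bounds` is a hypothetical helper, and the default `threshold` of 500 is an arbitrary illustrative value that should be tuned for real recordings.

```python
import array
import sys
import wave


def silence_bounds(wav_file, threshold=500):
    """Return (lead_seconds, trail_seconds, total_seconds) for a 16-bit PCM WAV.

    A frame counts as silent when every channel sample has an absolute
    value at or below `threshold`.
    """
    with wave.open(wav_file, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM audio")
        channels = wav.getnchannels()
        rate = wav.getframerate()
        samples = array.array("h", wav.readframes(wav.getnframes()))

    # WAV data is little-endian; swap bytes on big-endian hosts.
    if sys.byteorder == "big":
        samples.byteswap()
    n_frames = len(samples) // channels

    def is_loud(frame_index):
        start = frame_index * channels
        return any(abs(s) > threshold for s in samples[start:start + channels])

    # Index of the first and last frame that exceed the threshold.
    first = next((i for i in range(n_frames) if is_loud(i)), n_frames)
    last = next((i for i in range(n_frames - 1, -1, -1) if is_loud(i)), -1)

    lead = first / rate
    trail = (n_frames - 1 - last) / rate if last >= 0 else 0.0
    return lead, trail, n_frames / rate
```

At $8.00 per minute, trimming `lead + trail` seconds would reduce the charge by roughly `(lead + trail) / 60 * 8.00` dollars.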

Lipsync 3 Avatar API — Quick start

Get a WaveSpeedAI API key, then send a POST request to https://api.wavespeed.ai/api/v3/sync/lipsync-3/avatar with your input as JSON. The endpoint returns a prediction ID; poll the prediction result endpoint until status is "completed", then read the output URL from data.outputs[0]. Examples for Lipsync 3 Avatar follow.

HTTP example
# Submit the prediction
curl -X POST "https://api.wavespeed.ai/api/v3/sync/lipsync-3/avatar" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY" \
  -d '{
    "image": "https://example.com/your-input.jpg",
    "audio": "https://example.com/your-audio.mp3"
}'

# Response includes a prediction id. Poll for the result:
curl -X GET "https://api.wavespeed.ai/api/v3/predictions/{request_id}/result" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY"

# When status is "completed", read the output from data.outputs[0].
Node.js example
// npm install wavespeed
const WaveSpeed = require('wavespeed');

const client = new WaveSpeed(); // reads WAVESPEED_API_KEY from env

// CommonJS modules have no top-level await, so run inside an async function.
async function main() {
  const result = await client.run("sync/lipsync-3/avatar", {
    image: "https://example.com/your-input.jpg",
    audio: "https://example.com/your-audio.mp3"
  });
  console.log(result.outputs[0]); // → URL of the generated output
}

main().catch(console.error);
Python example
# pip install wavespeed
import wavespeed

output = wavespeed.run(
    "sync/lipsync-3/avatar",
    {
        "image": "https://example.com/your-input.jpg",
        "audio": "https://example.com/your-audio.mp3",
    },
)

print(output["outputs"][0])  # → URL of the generated output
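
The polling step described in the quick start can also be written out by hand with the standard library. This is a sketch under stated assumptions rather than the SDK's implementation: `fetch_result` and `wait_for_output` are hypothetical helpers, the status is assumed to live at `data.status`, and the `"failed"` status and `error` field are guesses. Only the result URL, the `"completed"` status, and `data.outputs[0]` come from the text above.

```python
import json
import time
import urllib.request

# Result endpoint from the HTTP example above.
RESULT_URL = "https://api.wavespeed.ai/api/v3/predictions/{request_id}/result"


def fetch_result(request_id, api_key):
    """GET the prediction result and return the decoded JSON body."""
    request = urllib.request.Request(
        RESULT_URL.format(request_id=request_id),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_output(request_id, fetch, interval=2.0, timeout=600.0, sleep=time.sleep):
    """Poll `fetch(request_id)` until the prediction completes.

    Returns data.outputs[0]. `fetch` is injected so the loop can be
    exercised without network access.
    """
    deadline = time.monotonic() + timeout
    while True:
        data = fetch(request_id)["data"]
        status = data["status"]  # assumed location of the status field
        if status == "completed":
            return data["outputs"][0]
        if status == "failed":  # assumed failure status name
            raise RuntimeError(data.get("error", "prediction failed"))
        if time.monotonic() >= deadline:
            raise TimeoutError(f"prediction {request_id} did not complete in time")
        sleep(interval)
```

A typical call would be `wait_for_output(request_id, lambda rid: fetch_result(rid, api_key))`, where `request_id` is the prediction ID returned by the submit request and `api_key` is your WaveSpeedAI key.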

Lipsync 3 Avatar API — Frequently asked questions

What is the Lipsync 3 Avatar API?

Lipsync 3 Avatar is a Sync model for talking-avatar generation, exposed as a REST API on WaveSpeedAI. It turns a single still image and an input audio track into a lip-synced talking character video, with natural mouth movement and facial animation. You can call it programmatically or try it from the playground.

How do I call the Lipsync 3 Avatar API?

POST your input parameters to the model's REST endpoint (shown in the API tab of this playground) with your WaveSpeedAI API key in the Authorization header. Submission returns a prediction ID; poll the prediction endpoint until status flips to "completed", then read the output URL from the result. The playground generates a ready-to-paste code sample in Python, JavaScript, or cURL for whatever inputs you've set. Full request/response shape is documented at https://wavespeed.ai/docs/docs-api/sync/sync-lipsync-3-avatar.

How much does Lipsync 3 Avatar cost per run?

Lipsync 3 Avatar costs $8.00 per minute of input audio, billed proportionally by exact audio duration, so a 30-second clip costs $4.00 and a 90-second clip costs $12.00. The exact cost for your current input is shown live next to the Generate button before you submit, and the actual per-call charge is recorded on the prediction afterwards.

What inputs does Lipsync 3 Avatar accept?

Key inputs: `image`, `audio`. The full JSON schema (types, defaults, allowed values) is rendered above the Generate button and mirrored in the API reference at https://wavespeed.ai/docs/docs-api/sync/sync-lipsync-3-avatar.

How do I get started with the Lipsync 3 Avatar API?

Sign up for a free WaveSpeedAI account to claim starter credits, copy your API key from /accesskey, then call the endpoint shown in the API tab of the playground. The playground also auto-generates a code sample in Python, JavaScript, or cURL for the parameters you've set.

Can I use Lipsync 3 Avatar outputs commercially?

Commercial usage rights depend on the model's license, set by its provider (Sync). The license summary appears on the model card above; see WaveSpeedAI's Terms of Service for platform-level conditions.