Sync Lipsync 3 Avatar
Sync Lipsync 3 Avatar turns a single still image and an input audio track into a lip-synced talking character video with natural mouth movement, facial animation, and stable avatar performance. It is available as a ready-to-use REST inference API with no cold starts.
Features
Sync Lipsync 3 Avatar turns a single still image into a talking character video driven by an input audio track. It is designed for portrait avatars, illustrated characters, and animated-frame references that need natural lip synchronization.
Why Choose This?
- Image-to-video avatar generation - Animate a still image into a talking video using only an image and an audio file.
- Audio-driven timing - The generated video follows the duration and speech timing of the input audio.
- Works with multiple visual styles - Supports realistic portraits, illustrations, and animated character frames.
- Simple production workflow - No prompt or extra tuning is required; provide the reference image and audio track.
- Standard video output - The generated video is returned as a URL in the standard WaveSpeed prediction response.
Parameters
| Parameter | Required | Description |
|---|---|---|
| image | Yes | Input image URL. Use a clear face image in JPEG, PNG, or WebP format. |
| audio | Yes | Input audio URL. The output video follows the audio duration. |
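As a minimal sketch of how the two parameters fit together, the helper below assembles the JSON request body. The name `build_payload` and the http(s) check are illustrative assumptions, not part of the WaveSpeed API; the service itself decides which URLs and file formats it accepts.

```python
from urllib.parse import urlparse


def build_payload(image_url: str, audio_url: str) -> dict:
    """Return the JSON body with the two required fields, `image` and `audio`.

    Hypothetical helper: the http(s) check is only a local sanity check.
    """
    for name, value in (("image", image_url), ("audio", audio_url)):
        parsed = urlparse(value or "")
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"{name} must be an http(s) URL, got {value!r}")
    return {"image": image_url, "audio": audio_url}
```

Failing early on an empty or malformed URL avoids submitting a task that cannot succeed.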
How to Use
- Upload an image - Provide a clear portrait or character image with a visible face.
- Upload audio - Provide the speech or singing audio that should drive the lip-sync.
- Submit - Generate the talking avatar video.
- Download - Use the returned video URL in your editing or publishing workflow.
Pricing
Pricing is $8.00 per minute of input audio, billed proportionally by exact audio duration.
| Audio Duration | Price |
|---|---|
| 10s | $1.33 |
| 11s | $1.47 |
| 30s | $4.00 |
| 60s | $8.00 |
| 90s | $12.00 |
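The table rows follow from the per-minute rate: cost = seconds × $8.00 / 60, so 11s works out to $1.4667, shown as $1.47. The sketch below reproduces that arithmetic. The function name `estimate_cost` is hypothetical, and rounding half-up to the nearest cent is an assumption that happens to match the table, not a documented billing rule.

```python
from decimal import Decimal, ROUND_HALF_UP

PRICE_PER_MINUTE = Decimal("8.00")  # USD per minute of input audio, from the pricing text


def estimate_cost(audio_seconds) -> Decimal:
    """Estimate the charge in USD for an audio duration given in seconds.

    Assumption: the result is rounded half-up to the nearest cent.
    """
    seconds = Decimal(str(audio_seconds))
    if seconds < 0:
        raise ValueError("audio duration cannot be negative")
    cost = PRICE_PER_MINUTE * seconds / Decimal(60)
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Print an estimate for each duration listed in the pricing table.
for seconds in (10, 11, 30, 60, 90):
    print(f"{seconds}s -> ${estimate_cost(seconds)}")
```

Because billing is proportional to duration, trimming leading or trailing silence lowers the estimate directly.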
Best Use Cases
- Talking avatars - Create presenter or character videos from a single image.
- Localized narration - Generate avatar videos from translated or re-recorded voice tracks.
- Character dialogue - Animate illustrated or stylized character frames for dialogue scenes.
- Marketing clips - Produce short spokesperson videos from static brand assets.
- Education and demos - Create simple explainers, tutorials, or training clips without filming.
Pro Tips
- Use a clear face image with visible mouth and eyes.
- Avoid heavy occlusion, extreme angles, or very low-resolution images.
- Use clean audio with minimal background noise for better lip-sync.
- Trim silence before upload if you want tighter timing and lower cost.
- Make sure image and audio URLs are publicly accessible.
Authentication
For authentication details, please refer to the Authentication Guide.
API Endpoints
Submit Task & Query Result
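# Submit the task (the image and audio URLs below are placeholders; replace them with your own publicly accessible files)
curl --location --request POST "https://api.wavespeed.ai/api/v3/sync/lipsync-3/avatar" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
  "image": "https://example.com/portrait.png",
  "audio": "https://example.com/speech.mp3"
}'
# Get the result (requestId is the data.id value returned by the submit call)
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"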
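The same submit-then-query flow can be sketched in Python with only the standard library. The endpoint paths, the Bearer header, and the `data.id`, `data.status`, `data.outputs`, and `data.error` fields come from this page. The function names, the injectable `call` and `sleep` parameters, the 2-second polling interval, and the 300-poll cap are assumptions made for illustration, not documented behavior.

```python
import json
import time
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"


def call_api(method, url, api_key, body=None):
    """Send one JSON request with the Bearer header and return the parsed reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Authorization": f"Bearer {api_key}"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    request = urllib.request.Request(url, data=data, headers=headers, method=method)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))


def submit_task(image_url, audio_url, api_key, call=call_api):
    """POST the image and audio URLs and return the task ID (data.id)."""
    reply = call("POST", f"{API_BASE}/sync/lipsync-3/avatar", api_key,
                 {"image": image_url, "audio": audio_url})
    return reply["data"]["id"]


def wait_for_result(task_id, api_key, call=call_api, interval=2.0,
                    max_polls=300, sleep=time.sleep):
    """Poll the result endpoint until the task is completed or failed.

    interval and max_polls are arbitrary assumptions, not documented limits.
    """
    url = f"{API_BASE}/predictions/{task_id}/result"
    for _ in range(max_polls):
        data = call("GET", url, api_key)["data"]
        if data["status"] == "completed":
            return data["outputs"]
        if data["status"] == "failed":
            raise RuntimeError(data.get("error") or "task failed")
        sleep(interval)  # status is still created or processing
    raise TimeoutError(f"task {task_id} did not finish after {max_polls} polls")
```

A typical call reads the key from the `WAVESPEED_API_KEY` environment variable, passes it to `submit_task` together with the two URLs, and then hands the returned ID to `wait_for_result`, which returns the list of output video URLs.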
Parameters
Task Submission Parameters
Request Parameters
| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
| image | string | Yes | - | - | Input image URL. Use a clear face image in JPEG, PNG, or WebP format. |
| audio | string | Yes | - | - | Input audio URL. The output video follows the audio duration. |
Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data.id | string | Unique identifier for the prediction (the task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
Result Request Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| id | string | Yes | - | Task ID |
Result Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data | object | The prediction data object containing all details |
| data.id | string | Unique identifier of the prediction being retrieved (the task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
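To show how the result fields might be consumed, here is a small sketch that maps a result response onto a typed object. The `Prediction` class and its method names are hypothetical, and the sample response is made up to match the field list above; it is not captured from the live API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Prediction:
    """Typed view of the `data` object described in the table above."""
    id: str
    model: str
    status: str
    outputs: List[str] = field(default_factory=list)
    error: str = ""
    inference_ms: Optional[int] = None

    @property
    def is_terminal(self) -> bool:
        # completed and failed are the two states that end polling
        return self.status in ("completed", "failed")

    @classmethod
    def from_response(cls, response: dict) -> "Prediction":
        data = response["data"]
        return cls(
            id=data["id"],
            model=data.get("model", ""),
            status=data["status"],
            outputs=list(data.get("outputs") or []),
            error=data.get("error") or "",
            inference_ms=(data.get("timings") or {}).get("inference"),
        )


# Made-up sample shaped like the table; every value here is illustrative.
sample = {
    "code": 200,
    "message": "success",
    "data": {
        "id": "example-task-id",
        "model": "example-model-id",
        "outputs": ["https://example.com/output.mp4"],
        "urls": {"get": "https://example.com/result"},
        "status": "completed",
        "created_at": "2023-04-01T12:34:56.789Z",
        "error": "",
        "timings": {"inference": 1234},
    },
}
prediction = Prediction.from_response(sample)
```

Treating missing `outputs`, `error`, or `timings` as empty keeps the parser usable while a task is still in the created or processing state.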