Sync Lipsync 3 Avatar

Sync Lipsync 3 Avatar turns a single still image and an input audio track into a lip-synced talking character video with natural mouth movement, facial animation, and stable avatar performance. It is available through a ready-to-use REST inference API with no cold starts and per-minute pricing.

Features

Sync Lipsync 3 Avatar turns a single still image into a talking character video driven by an input audio track. It is designed for portrait avatars, illustrated characters, and animated-frame references that need natural lip synchronization.

Why Choose This?

Image-to-video avatar generation - Animate a still image into a talking video using only an image and an audio file.
Audio-driven timing - The generated video follows the duration and speech timing of the input audio.
Works with multiple visual styles - Supports realistic portraits, illustrations, and animated character frames.
Simple production workflow - No prompt or extra tuning is required; provide the reference image and audio track.
Standard video output - The generated video is returned as a URL in the standard WaveSpeed prediction response.

Parameters

Parameter	Required	Description
image	Yes	Input image URL. Use a clear face image in JPEG, PNG, or WebP format.
audio	Yes	Input audio URL. The output video follows the audio duration.

How to Use

Upload an image - Provide a clear portrait or character image with a visible face.
Upload audio - Provide the speech or singing audio that should drive the lip-sync.
Submit - Generate the talking avatar video.
Download - Use the returned video URL in your editing or publishing workflow.

Pricing

Pricing is $8.00 per minute of input audio, billed proportionally by exact audio duration.

Audio Duration	Price
10s	$1.33
11s	$1.47
30s	$4.00
60s	$8.00
90s	$12.00
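The table rows follow directly from the per-minute rate. As a minimal sketch, the estimate can be reproduced with the function below; `estimate_price` is a hypothetical helper name, and rounding half up to the nearest cent is an assumption that happens to match the rows shown, not a documented billing rule.

```python
from decimal import Decimal, ROUND_HALF_UP

# Rate taken from the pricing statement above: $8.00 per minute of input audio.
PRICE_PER_MINUTE = Decimal("8.00")


def estimate_price(audio_seconds):
    """Return the estimated price in USD for an audio duration given in seconds."""
    seconds = Decimal(str(audio_seconds))
    if seconds < 0:
        raise ValueError("audio duration must be non-negative")
    raw = PRICE_PER_MINUTE * seconds / Decimal(60)
    # Assumption: round half up to the nearest cent (consistent with the table rows).
    return raw.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Print estimates for the durations listed in the table.
for s in (10, 11, 30, 60, 90):
    print(f"{s}s -> ${estimate_price(s)}")
```

Because billing is proportional to duration, the same function can be used to see how much trimming leading or trailing silence would save before upload.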

Best Use Cases

Talking avatars - Create presenter or character videos from a single image.
Localized narration - Generate avatar videos from translated or re-recorded voice tracks.
Character dialogue - Animate illustrated or stylized character frames for dialogue scenes.
Marketing clips - Produce short spokesperson videos from static brand assets.
Education and demos - Create simple explainers, tutorials, or training clips without filming.

Pro Tips

Use a clear face image with visible mouth and eyes.
Avoid heavy occlusion, extreme angles, or very low-resolution images.
Use clean audio with minimal background noise for better lip-sync.
Trim silence before upload if you want tighter timing and lower cost.
Make sure image and audio URLs are publicly accessible.
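Some of these tips can be checked before submitting a task. The sketch below is a hypothetical pre-flight helper (`check_inputs` is not part of any WaveSpeed SDK) that only inspects the URL strings: it confirms both are absolute http(s) URLs and that the image path, if it has an extension, is one of the documented formats. It does not verify that the URLs are actually reachable, and it lets extensionless URLs through because the real format cannot be inferred from the string alone.

```python
import posixpath
from urllib.parse import urlparse

# Image formats listed in the Parameters table (JPEG, PNG, WebP).
ALLOWED_IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def check_inputs(image_url, audio_url):
    """Return a list of problems found; an empty list means nothing was flagged."""
    problems = []
    for name, url in (("image", image_url), ("audio", audio_url)):
        parsed = urlparse(url)
        # Local paths and file:// URLs cannot be fetched by a remote service.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"{name}: not an absolute http(s) URL")
    # Heuristic only: judge the image format by the file extension in the URL path.
    ext = posixpath.splitext(urlparse(image_url).path)[1].lower()
    if ext and ext not in ALLOWED_IMAGE_EXTENSIONS:
        problems.append(f"image: extension {ext} is not JPEG, PNG, or WebP")
    return problems
```

An empty result only means no obvious problem was found in the URL strings; image quality, occlusion, and audio noise still need a manual check.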

Authentication

For authentication details, please refer to the Authentication Guide.

API Endpoints

Submit Task & Query Result


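# Submit the task (replace the example.com placeholders with your own publicly accessible URLs)
curl --location --request POST "https://api.wavespeed.ai/api/v3/sync/lipsync-3/avatar" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
    "image": "https://example.com/your-image.png",
    "audio": "https://example.com/your-audio-file"
}'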

# Get the result (requestId is the task ID returned as data.id by the submit call)
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
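The two calls above can be combined into a submit-then-poll loop. The following is a minimal sketch using only the Python standard library, not an official client: the helper names, the 2-second poll interval, and the 10-minute wait limit are assumptions, and it assumes the task ID for the result endpoint is the `data.id` field of the submit response.

```python
import json
import time
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"


def build_submit_request(api_key, image_url, audio_url):
    """Build the POST request carrying the two required parameters."""
    body = json.dumps({"image": image_url, "audio": audio_url}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/sync/lipsync-3/avatar",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


def build_result_request(api_key, request_id):
    """Build the GET request that fetches the current state of a task."""
    return urllib.request.Request(
        f"{API_BASE}/predictions/{request_id}/result",
        method="GET",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def send(request):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def generate(api_key, image_url, audio_url, poll_seconds=2.0, max_wait_seconds=600.0):
    """Submit a task, then poll until it completes, fails, or the wait limit passes."""
    task = send(build_submit_request(api_key, image_url, audio_url))["data"]
    deadline = time.monotonic() + max_wait_seconds  # assumed limit, not documented
    while time.monotonic() < deadline:
        data = send(build_result_request(api_key, task["id"]))["data"]
        if data["status"] == "completed":
            return data["outputs"]  # list of generated video URLs
        if data["status"] == "failed":
            raise RuntimeError(data.get("error") or "task failed")
        time.sleep(poll_seconds)
    raise TimeoutError("task did not finish within the wait limit")
```

Under these assumptions, calling `generate(api_key, image_url, audio_url)` with a key read from the `WAVESPEED_API_KEY` environment variable would return the list of output URLs once the task reaches `completed`.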

Parameters

Task Submission Parameters

Request Parameters

Parameter	Type	Required	Default	Range	Description
image	string	Yes	-	-	Input image URL. Use a clear face image in JPEG, PNG, or WebP format.
audio	string	Yes	-	-	Input audio URL. The output video follows the audio duration.

Response Parameters

Parameter	Type	Description
code	integer	HTTP status code (e.g., 200 for success)
message	string	Status message (e.g., “success”)
data.id	string	Unique identifier for the prediction (the task ID)
data.model	string	Model ID used for the prediction
data.outputs	array	Array of URLs to the generated content (empty when status is not `completed`)
data.urls	object	Object containing related API endpoints
data.urls.get	string	URL to retrieve the prediction result
data.status	string	Status of the task: `created`, `processing`, `completed`, or `failed`
data.created_at	string	ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”)
data.error	string	Error message (empty if no error occurred)
data.timings	object	Object containing timing details
data.timings.inference	integer	Inference time in milliseconds

Result Request Parameters

Parameter	Type	Required	Default	Description
id	string	Yes	-	Task ID

Result Response Parameters

Parameter	Type	Description
code	integer	HTTP status code (e.g., 200 for success)
message	string	Status message (e.g., “success”)
data	object	The prediction data object containing all details
data.id	string	Unique identifier for the prediction (the task ID passed in the request)
data.model	string	Model ID used for the prediction
data.outputs	array	Array of URLs to the generated content (empty when status is not `completed`)
data.urls	object	Object containing related API endpoints
data.urls.get	string	URL to retrieve the prediction result
data.status	string	Status of the task: `created`, `processing`, `completed`, or `failed`
data.created_at	string	ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”)
data.error	string	Error message (empty if no error occurred)
data.timings	object	Object containing timing details
data.timings.inference	integer	Inference time in milliseconds
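To illustrate how the result fields fit together, here is a small sketch that reads a result response and returns the video URL. `first_output_url` is a hypothetical helper, and the sample payload is invented for illustration from the field names in the table above; it is not a captured API response.

```python
import json


def first_output_url(response):
    """Return the first output URL, or None while the task is still running."""
    data = response["data"]
    status = data["status"]
    if status == "failed":
        # data.error carries the failure message when one is available.
        raise RuntimeError(data.get("error") or "task failed without an error message")
    if status != "completed":
        # `created` or `processing`: outputs stay empty until completion.
        return None
    outputs = data.get("outputs") or []
    if not outputs:
        raise ValueError("task completed but returned no outputs")
    return outputs[0]


# Hypothetical payload with placeholder values, shaped after the documented fields.
sample = json.loads("""
{
  "code": 200,
  "message": "success",
  "data": {
    "id": "example-task-id",
    "status": "completed",
    "outputs": ["https://example.com/output.mp4"],
    "error": ""
  }
}
""")

print(first_output_url(sample))
```

Returning `None` for the non-terminal statuses keeps the helper usable inside a polling loop, where the caller simply retries until a URL or an exception comes back.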
