Sync React 1
Sync React-1 is a production-grade video-to-video lip-sync model. It maps any speech track to a target face, producing phoneme-accurate visemes and smooth timing while preserving identity, head pose, lighting, and background. Supports emotion and intensity control, multilingual speech, and long takes for talking-head content. Built for stable production use with a ready-to-use REST API, no cold starts, and predictable pricing.
Features
sync/react-1 (Audio-to-Video Lip Sync & Facial Animation)
sync/react-1 is a production-ready audio-driven video animation model that syncs a subject in a video to an input audio track. It supports selectable emotion control and multiple animation modes (lips / face / head) to help you generate natural, expressive results for short clips with minimal setup.
Why it stands out
- Audio-to-video sync for fast talking-head and reaction style outputs.
- Emotion presets (happy / sad / angry / disgusted / surprised / neutral) to steer overall expression.
- Multiple control modes so you can animate only lips, or drive broader face/head motion.
- Simple workflow and predictable pricing for quick iteration.
Capabilities
- Audio-driven lip sync for an input video
- Emotion-conditioned expression steering
- Mode control for different animation scopes:
  - lips: focus on mouth movement
  - face: include facial expression changes
  - head: include head motion cues (where supported by the model)
Parameters
| Parameter | Description |
|---|---|
| video* | Input video file or public URL. |
| audio* | Input audio file or public URL. |
| emotion | Expression preset: happy / sad / angry / disgusted / surprised / neutral. |
| model_mode | Animation scope: lips / face / head. |
Pricing
| Video Duration (s) | Total Price |
|---|---|
| 1 | $0.167 |
| 2 | $0.334 |
| 3 | $0.501 |
| 4 | $0.668 |
| 5 | $0.835 |
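Pricing scales linearly at $0.167 per second of video and, per the Notes section below, longer inputs are billed at the 5-second cap. A minimal cost-estimate sketch in shell, assuming exactly that rate and cap (both taken from this page, not from a billing API):
# Estimate the price of a clip, assuming $0.167/s billed up to a 5-second cap
DURATION=3   # clip length in seconds
awk -v d="$DURATION" 'BEGIN { if (d > 5) d = 5; printf "Estimated price: $%.3f\n", d * 0.167 }'
# DURATION=3 prints $0.501, matching the table above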
How to use
- Upload the video (best with a clear, front-facing subject).
- Upload the audio (speech, voiceover, or short dialogue).
- Choose emotion to steer expression tone.
- Choose model_mode (lips / face / head).
- Run the model and download the synced result.
Best Use Cases
- Short talking-head clips for creators and social media
- Dubbing and voiceover sync for character shots
- Expressive reaction clips with controlled emotion
- Rapid prototyping for dialogue-driven video concepts
Notes
- Best results: single subject, stable lighting, minimal motion blur, and a visible face.
- Use lips mode for the most conservative edits; use face/head when you want stronger performance and expression.
- Very long videos are billed at the 5-second cap, so trim to the segment you want to animate (see the trimming example below).
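If you only need part of a longer clip, ffmpeg is a quick way to trim before uploading; a minimal sketch, assuming ffmpeg is installed and the segment you want starts at the 2-second mark:
# Cut a 5-second segment starting at 00:00:02 without re-encoding
# (-c copy cuts on keyframes; drop it and re-encode if you need frame-accurate trims)
ffmpeg -ss 2 -i input.mp4 -t 5 -c copy trimmed.mp4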
More Digital Human Models
- wavespeed-ai/infinitetalk — Create realistic talking-head digital humans from a single portrait and audio, delivering stable lip sync and natural facial motion for voice-driven avatar videos.
- wavespeed-ai/infinitetalk/multi — Multi-person talking avatar generation that syncs multiple faces to audio with consistent expressions and timing, ideal for dialogues, interviews, and group scenes.
- kwaivgi/kling-v2-ai-avatar-pro — Pro-grade AI avatar video generation for high-fidelity digital humans with strong identity consistency and polished, production-ready results for marketing and creator content.
Authentication
For authentication details, please refer to the Authentication Guide.
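The examples below pass the key as a Bearer token read from the WAVESPEED_API_KEY environment variable; a minimal setup sketch for a Unix shell (the key value is a placeholder):
# Export your API key so the curl examples can reference ${WAVESPEED_API_KEY}
export WAVESPEED_API_KEY="your-api-key-here"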
API Endpoints
Submit Task & Query Result
# Submit the task (the video and audio values are placeholder URLs; replace them with your own media)
curl --location --request POST "https://api.wavespeed.ai/api/v3/sync/react-1" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
"video": "https://example.com/input.mp4",
"audio": "https://example.com/voiceover.wav",
"emotion": "neutral",
"model_mode": "face"
}'
# Get the result
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
Parameters
Task Submission Parameters
Request Parameters
| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
| video | string | Yes | - | - | Input video file (.mp4) |
| audio | string | Yes | - | - | Input audio file (.wav) |
| emotion | string | No | neutral | happy, sad, angry, disgusted, surprised, neutral | Emotion prompt for the generation (single word emotions only) |
| model_mode | string | No | face | lips, face, head | Edit region for the model (lips/face/head). When head is selected, the model generates natural talking-head movements along with emotion and lip sync |
Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data.id | string | Unique identifier for the prediction (task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
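For reference, a newly created submission might return an object shaped like the sketch below (all values are illustrative placeholders, not real output):
{
  "code": 200,
  "message": "success",
  "data": {
    "id": "<task-id>",
    "model": "sync/react-1",
    "outputs": [],
    "urls": {
      "get": "https://api.wavespeed.ai/api/v3/predictions/<task-id>/result"
    },
    "has_nsfw_contents": [],
    "status": "created",
    "created_at": "2023-04-01T12:34:56.789Z",
    "error": "",
    "timings": {}
  }
}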
Result Request Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| id | string | Yes | - | Task ID |
Result Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data | object | The prediction data object containing all details |
| data.id | string | Unique identifier for the prediction (the task ID used to query the result) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
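Once the task has completed, the result endpoint returns the same object with outputs populated; an illustrative sketch with placeholder values only:
{
  "code": 200,
  "message": "success",
  "data": {
    "id": "<task-id>",
    "model": "sync/react-1",
    "outputs": ["https://example.com/outputs/<task-id>.mp4"],
    "urls": {
      "get": "https://api.wavespeed.ai/api/v3/predictions/<task-id>/result"
    },
    "status": "completed",
    "created_at": "2023-04-01T12:34:56.789Z",
    "error": "",
    "timings": { "inference": 8123 }
  }
}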