Sync Lipsync 2 Pro

Playground

Try it on WavespeedAI!

Lipsync-2-pro creates studio-grade lip synchronization for video-to-video editing in minutes, not weeks. It is available through a ready-to-use REST inference API with strong performance, no cold starts, and affordable pricing.

Features

sync/lipsync-2-pro — Pro Audio-to-Lipsync

Lipsync-2-pro is a zero-shot model for generating realistic lip movements that match spoken audio. It works out of the box—no training or fine-tuning—and preserves the speaker’s natural style across languages, cameras, and video types. From live-action footage to animated or AI-generated faces, it brings broadcast-grade dubbing and dialogue editing into a simple API call.


What it does

  • Zero-shot lipsync – Just provide a video and an audio track; the model automatically re-animates the mouth to match the speech.
  • Style preservation – Learns each speaker’s characteristic timing and articulation and keeps that “signature delivery” even when the language changes.
  • Cross-domain support – Works on real humans, 2D/3D animation, and synthetic/AI avatars.
  • Flexible workflows – Use it for dubbing, rewriting individual lines in post, or re-animating entire performances.

Key capabilities

  • Expressiveness control – Internally balances how “subtle vs. animated” the lipsync is, so results can match calm talking-head news, energetic vlogs, or stylised characters.
  • Active speaker detection – In multi-person scenes, the system can focus lipsync on the currently speaking face instead of blindly animating everyone.
  • High-fidelity animation – Preserves identity, lighting, background and facial structure; only the mouth and local expressions are changed.
  • Record once, edit forever – Change lines after shooting without new takes; keep the original performance and camera work.
  • AI dubbing for any video – Pair with TTS or translation models to localise existing content into new languages.

Parameters

  • video* Input video to be re-synced (URL or upload). Works best with relatively stable talking-head or upper-body shots.

  • audio* Target speech (URL or upload). The new lip motion will follow this track.

  • sync_mode Controls how audio and video lengths are aligned:

    • cut_off – Trim to the shorter of audio or video (default, safest).
    • loop – Loop the shorter track until the longer one finishes.
    • bounce – Ping-pong the video when looping (forward–back–forward…).
    • silence – Pad missing audio with silence.
    • remap – Time-warp to better match durations.

Output: a new MP4 video with updated lipsync.
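The alignment options above can be sketched as a small helper that suggests a sync_mode from the two input durations. This is a hypothetical heuristic, not part of the API; the right choice is ultimately content-dependent:

```python
def suggest_sync_mode(video_s: float, audio_s: float, tolerance: float = 0.25) -> str:
    """Suggest a sync_mode for a given video/audio duration pair.

    Heuristic only: remap when the mismatch is small, silence when the
    audio runs short, cut_off (the safe default) otherwise.
    """
    diff = audio_s - video_s
    if abs(diff) <= tolerance:
        return "remap"    # near-equal lengths: a gentle time-warp suffices
    if diff < 0:
        return "silence"  # audio shorter than video: pad with silence
    return "cut_off"      # audio longer: trim to the shorter input
```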


How to use

  1. Upload or paste URLs for video and audio.
  2. Choose an appropriate sync_mode based on whether you want trimming, looping, or padding.
  3. Submit the job and wait for processing to complete.
  4. Download the result and review; if needed, adjust audio, sync_mode or source video and re-run.
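The steps above can be sketched as a minimal Python client. The endpoint and field names follow the API section below; the payload builder also validates sync_mode locally before submission:

```python
import json
import urllib.request

API_URL = "https://api.wavespeed.ai/api/v3/sync/lipsync-2-pro"
SYNC_MODES = {"cut_off", "loop", "bounce", "silence", "remap"}

def build_payload(video_url: str, audio_url: str, sync_mode: str = "cut_off") -> dict:
    """Assemble the request body for a lipsync-2-pro task."""
    if sync_mode not in SYNC_MODES:
        raise ValueError(f"sync_mode must be one of {sorted(SYNC_MODES)}")
    return {"video": video_url, "audio": audio_url, "sync_mode": sync_mode}

def submit(payload: dict, api_key: str) -> bytes:
    """POST the task; returns the raw JSON response body."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```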

Pricing

Billing is based solely on audio length.

  • Base rate: $0.08 per second of audio
| Audio length (seconds) | Price (USD) |
| --- | --- |
| 5 | $0.40 |
| 10 | $0.80 |
| 15 | $1.20 |
| 30 | $2.40 |
| 60 | $4.80 |

You can estimate other costs by multiplying the audio duration (in seconds) by $0.08/s; charges scale linearly with length.
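That estimate can be written as a one-line helper. It uses a cents-based rate to avoid float rounding surprises; the constant is the $0.08/s base rate quoted above:

```python
RATE_CENTS_PER_SECOND = 8  # $0.08 per second of audio

def estimate_cost_usd(audio_seconds: float) -> float:
    """Estimated charge in USD for a clip of the given audio length."""
    return round(audio_seconds * RATE_CENTS_PER_SECOND) / 100
```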


More Models to Try

  • WaveSpeedAI / InfiniteTalk WaveSpeedAI’s single-avatar talking-head model that turns one photo plus audio into smooth, lip-synced digital presenter videos for tutorials, marketing, and social content.

  • WaveSpeedAI / InfiniteTalk Multi Multi-avatar version of InfiniteTalk that drives several characters in one scene from separate audio tracks, ideal for dialog-style explainers, interviews, and role-play videos.

  • Kwaivgi / Kling V2 AI Avatar Standard Cost-effective Kling-based AI avatar model that generates natural talking-face videos from a single reference image and voice track, suitable for everyday content and customer support.

  • Kwaivgi / Kling V2 AI Avatar Pro Higher-fidelity Kling V2 avatar model for premium digital humans, offering smoother motion, better lip-sync, and more stable faces for commercials, brand spokespeople, and product demos.

Authentication

For authentication details, please refer to the Authentication Guide.

API Endpoints

Submit Task & Query Result


# Submit the task
curl --location --request POST "https://api.wavespeed.ai/api/v3/sync/lipsync-2-pro" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
    "video": "<video_url>",
    "audio": "<audio_url>",
    "sync_mode": "cut_off"
}'

# Get the result
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
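The query step above can be sketched in Python as a polling loop. The endpoint mirrors the curl call; the polling interval and timeout are arbitrary choices, not API requirements:

```python
import json
import time
import urllib.request

RESULT_URL = "https://api.wavespeed.ai/api/v3/predictions/{request_id}/result"

def poll_result(request_id: str, api_key: str,
                interval_s: float = 2.0, timeout_s: float = 300.0) -> dict:
    """Poll the result endpoint until the task completes or fails."""
    url = RESULT_URL.format(request_id=request_id)
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        req = urllib.request.Request(
            url, headers={"Authorization": f"Bearer {api_key}"}
        )
        with urllib.request.urlopen(req) as resp:
            body = json.loads(resp.read())
        if body["data"]["status"] in ("completed", "failed"):
            return body
        time.sleep(interval_s)
    raise TimeoutError(f"task {request_id} did not finish in {timeout_s}s")
```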

Parameters

Task Submission Parameters

Request Parameters

| Parameter | Type | Required | Default | Range | Description |
| --- | --- | --- | --- | --- | --- |
| video | string | Yes | - | - | The video to be used for generation |
| audio | string | Yes | - | - | The audio to be used for generation |
| sync_mode | string | No | cut_off | bounce, loop, cut_off, silence, remap | Defines how to handle duration mismatches between video and audio inputs. See the Media Content Tips guide (https://docs.sync.so/compatibility-and-tips/media-content-tips#sync-mode-options) for a brief overview of each option. |

Response Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., "success") |
| data.id | string | Unique identifier for the prediction (task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty until status is completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., "2023-04-01T12:34:56.789Z") |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |

Result Request Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| id | string | Yes | - | Task ID |

Result Response Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., "success") |
| data | object | The prediction data object containing all details |
| data.id | string | Unique identifier for the prediction (the task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty until status is completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., "2023-04-01T12:34:56.789Z") |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
© 2025 WaveSpeedAI. All rights reserved.