Sync Lipsync 3

sync/lipsync-3

Sync Lipsync 3 synchronizes lip movements in any video to supplied audio using zero-shot lip-sync technology. It supports multiple sync modes for handling duration mismatches and works with live-action footage, 3D characters, and AI-generated avatars. It is available through a ready-to-use REST inference API with no cold starts and per-second pricing.

Your request will cost $0.134 per run.

With $10 you can run this model approximately 74 times.

README

LipSync-3

LipSync-3 is Sync's advanced lip synchronization model. Upload a video and an audio track — the model automatically syncs the speaker's lip movements to the new audio with high accuracy and natural motion. Supports multiple sync modes to handle length mismatches between video and audio.

Why Choose This?

  • High-accuracy lip synchronization: Precisely maps audio phonemes to lip movements for natural, believable sync across a wide range of speakers and languages.

  • Flexible sync mode control: Choose how the model handles video-audio length mismatches (loop, bounce, cut_off, silence, or remap) to fit your use case.

  • Broad video compatibility: Works on talking head videos, interviews, presentations, and any footage with visible facial movement.

  • Simple two-input workflow: Provide a video and an audio file, with no manual keyframing, masking, or technical setup required.

Parameters

| Parameter | Required | Description |
| --- | --- | --- |
| video | Yes | Input video to apply lip sync to (URL or file upload). |
| audio | Yes | Audio track to sync the lip movements to (URL, file upload, or microphone recording). |
| sync_mode | No | How to handle video-audio length mismatches. Options: bounce, loop, cut_off (default), silence, remap. |
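
The parameters above can be assembled into a request body before calling the API. The following is a minimal sketch, not the service's official client: the field names come from the parameter table, while the helper name `build_payload` and the example URLs are hypothetical. The REST endpoint and authentication scheme are not described on this page, so the sketch stops at building and printing the JSON body.

```python
import json

# Allowed values taken from the parameter table; cut_off is the documented default.
ALLOWED_SYNC_MODES = {"bounce", "loop", "cut_off", "silence", "remap"}


def build_payload(video: str, audio: str, sync_mode: str = "cut_off") -> dict:
    """Validate the three documented parameters and return a JSON-ready dict.

    build_payload is a hypothetical helper, not part of any official SDK.
    """
    if not video:
        raise ValueError("video is required")
    if not audio:
        raise ValueError("audio is required")
    if sync_mode not in ALLOWED_SYNC_MODES:
        raise ValueError(f"sync_mode must be one of {sorted(ALLOWED_SYNC_MODES)}")
    return {"video": video, "audio": audio, "sync_mode": sync_mode}


# Placeholder URLs for illustration only.
payload = build_payload(
    "https://example.com/speaker.mp4",
    "https://example.com/voiceover.wav",
)
print(json.dumps(payload, indent=2))
```

Validating `sync_mode` locally catches a misspelled mode before a billable request is sent.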

Sync Mode Options

  • cut_off — Cuts the output at whichever is shorter, video or audio.
  • loop — Loops the video to match the length of the audio.
  • bounce — Plays the video forward then backward repeatedly to match audio length.
  • silence — Pads the shorter input with silence or a still frame to match the longer one.
  • remap — Remaps the video timing to match the audio duration.
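
To make the mode descriptions concrete, here is a rough sketch of the output duration each mode would imply. It is only one reading of the descriptions above, not confirmed service behavior: in particular, it assumes that loop, bounce, and remap always produce output as long as the audio, even when the video is the longer input. The function name `expected_output_duration` is hypothetical.

```python
def expected_output_duration(video_s: float, audio_s: float,
                             sync_mode: str = "cut_off") -> float:
    """Estimate output length in seconds from the documented mode descriptions."""
    if video_s <= 0 or audio_s <= 0:
        raise ValueError("durations must be positive")
    if sync_mode == "cut_off":
        # Output is cut at whichever input is shorter.
        return min(video_s, audio_s)
    if sync_mode == "silence":
        # The shorter input is padded to match the longer one.
        return max(video_s, audio_s)
    if sync_mode in ("loop", "bounce", "remap"):
        # Assumption: the video is repeated or retimed to the audio length.
        return audio_s
    raise ValueError(f"unknown sync_mode: {sync_mode}")


# A 10 s video with a 25 s voiceover under each mode.
for mode in ("cut_off", "loop", "bounce", "silence", "remap"):
    print(mode, expected_output_duration(10.0, 25.0, mode))
```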

How to Use

  1. Upload your video — provide the talking head or speaker video via URL or drag-and-drop.
  2. Upload your audio — provide the replacement audio track via URL, file upload, or microphone recording.
  3. Select sync_mode (optional) — choose how to handle length differences between the video and audio.
  4. Submit — generate, preview, and download your lip-synced video.

Pricing

$0.134 per second of input video.
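
As a worked example of the per-second rate, the sketch below estimates the cost of a clip and how many seconds of video a budget covers. It assumes billing is strictly proportional to input video duration; whether partial seconds are rounded up is not stated here. The function names are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

PRICE_PER_SECOND = Decimal("0.134")  # USD per second of input video, from this page


def estimate_cost(video_seconds) -> Decimal:
    """Return the estimated cost in USD, rounded to a tenth of a cent."""
    seconds = Decimal(str(video_seconds))
    if seconds < 0:
        raise ValueError("video_seconds must be non-negative")
    return (seconds * PRICE_PER_SECOND).quantize(
        Decimal("0.001"), rounding=ROUND_HALF_UP
    )


def seconds_within_budget(budget_usd) -> int:
    """Return the whole seconds of input video a budget covers."""
    return int(Decimal(str(budget_usd)) // PRICE_PER_SECOND)


print(estimate_cost(30))          # cost of a 30-second clip
print(seconds_within_budget(10))  # seconds covered by $10
```

Under this assumption a 30-second clip costs $4.02, and $10 covers 74 seconds of input video.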

Best Use Cases

  • Dubbing & Localization — Replace the original audio with a translated voiceover and sync lip movements to match.
  • Voice Replacement — Swap out a speaker's audio while maintaining natural facial animation.
  • AI Avatar Video — Generate talking head videos with custom audio for virtual presenters and digital avatars.
  • Content Repurposing — Update or correct existing video audio without reshooting.
  • Accessibility — Create lip-synced versions of content for accessibility and localization workflows.

Pro Tips

  • Use clean, well-lit talking head footage with a clearly visible face for the most accurate sync results.
  • Minimize background noise in your audio track — cleaner audio produces better phoneme mapping.
  • If your audio is shorter than your video, use cut_off to trim the output to the audio length, or silence to pad the audio with silence so the full video is kept.
  • For seamless looping content, bounce or loop modes work well when the video is shorter than the audio.
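
The difference between loop and bounce is easiest to see as frame index sequences. This is an illustrative sketch only: it assumes bounce does not repeat the first and last frames at each turnaround, which this page does not specify, and both function names are hypothetical.

```python
def loop_indices(n_frames: int, target: int) -> list[int]:
    """Restart from frame 0 each time the source video ends."""
    if n_frames < 1:
        raise ValueError("n_frames must be at least 1")
    return [i % n_frames for i in range(target)]


def bounce_indices(n_frames: int, target: int) -> list[int]:
    """Play forward, then backward, without repeating the end frames."""
    if n_frames < 1:
        raise ValueError("n_frames must be at least 1")
    if n_frames == 1:
        return [0] * target
    period = 2 * (n_frames - 1)  # one forward-and-back cycle
    out = []
    for i in range(target):
        p = i % period
        out.append(p if p < n_frames else period - p)
    return out


# A 4-frame source stretched to 10 output frames.
print(loop_indices(4, 10))    # [0, 1, 2, 3, 0, 1, 2, 3, 0, 1]
print(bounce_indices(4, 10))  # [0, 1, 2, 3, 2, 1, 0, 1, 2, 3]
```

Loop jumps from the last frame back to the first, so it looks seamless only if the clip starts and ends in a similar pose. Bounce never jumps, at the cost of playing motion in reverse.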

Notes

  • Both video and audio are required fields.
  • Pricing is based on the duration of the input video at $0.134 per second.
  • Ensure video and audio URLs are publicly accessible if using links rather than direct uploads.
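
A quick local pre-flight check can catch links that are clearly not reachable from outside your network. The sketch below is a heuristic under stated assumptions: it only inspects the URL itself (scheme, localhost, private or loopback IP literals) and makes no network request, so a URL that passes may still require login or be blocked. The function name `looks_publicly_reachable` is hypothetical.

```python
import ipaddress
from urllib.parse import urlparse


def looks_publicly_reachable(url: str) -> bool:
    """Heuristic check: True if nothing in the URL rules out public access."""
    try:
        parsed = urlparse(url)
        host = parsed.hostname
    except ValueError:
        return False  # malformed URL
    if parsed.scheme not in ("http", "https") or not host:
        return False  # file paths, missing hosts, other schemes
    if host == "localhost" or host.endswith(".local"):
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name: cannot be judged without a network lookup.
        return True
    return ip.is_global  # rejects private, loopback, and link-local addresses


for candidate in (
    "https://example.com/speaker.mp4",
    "http://192.168.1.20/speaker.mp4",
    "file:///home/me/speaker.mp4",
):
    print(candidate, looks_publicly_reachable(candidate))
```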