WaveSpeed.ai

Kwaivgi Kling Lipsync

kwaivgi/kling-lipsync/text-to-video

Kling TextToVideo by Kwaivgi creates videos with lifelike lip movements that sync to the input text for natural speaking visuals. It is available as a ready-to-use REST inference API with no cold starts and affordable pricing.

Your request will cost $0.14 per run.

With $10 you can run this model approximately 71 times.

README

Kling Lipsync Text-to-Video

Make any face speak your words with AI-powered lip synchronization. Upload a video, enter your text, choose a voice, and Kling Lipsync will generate realistic lip movements matched to the synthesized speech. It is suited to dubbing, content localization, and creative projects.

Why It Looks Great

  • Realistic lip sync: AI-generated mouth movements accurately match the spoken audio for natural-looking results.
  • Multiple voice options: Choose from a variety of voice characters to match your content style.
  • Bilingual support: Generate speech in English (en) or Chinese (zh).
  • Adjustable speed: Control the speaking pace with the voice speed parameter.
  • Text-driven workflow: Simply type what you want the character to say — no audio recording needed.

Parameters

Parameter        Required   Description
video            Yes        Source video with a visible face (upload or public URL).
text             Yes        The text you want the character to speak.
voice_id         Yes        Voice character selection (e.g., genshin_klee2).
voice_language   No         Language for speech synthesis: en (English) or zh (Chinese). Default: en.
voice_speed      No         Speaking speed multiplier. Default: 1.
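
As a rough illustration of how the parameters above fit together, the following Python sketch assembles a request body and applies the documented defaults. The function name build_payload and the positivity check on voice_speed are assumptions of this sketch; the page does not state a valid range for the speed multiplier.

```python
from typing import Any, Dict

# The two language codes listed in the parameter table.
ALLOWED_LANGUAGES = {"en", "zh"}


def build_payload(video: str, text: str, voice_id: str,
                  voice_language: str = "en",
                  voice_speed: float = 1.0) -> Dict[str, Any]:
    """Assemble a request body using the parameter names from the table."""
    # The three required fields must be non-empty strings.
    for name, value in (("video", video), ("text", text), ("voice_id", voice_id)):
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"{name} is required")
    if voice_language not in ALLOWED_LANGUAGES:
        raise ValueError("voice_language must be 'en' or 'zh'")
    # Assumption: the multiplier must be positive (range not documented here).
    if voice_speed <= 0:
        raise ValueError("voice_speed must be positive")
    return {
        "video": video,
        "text": text,
        "voice_id": voice_id,
        "voice_language": voice_language,
        "voice_speed": voice_speed,
    }


payload = build_payload(
    "https://example.com/face.mp4",  # placeholder URL
    "Hello, welcome to the demo.",
    "genshin_klee2",                 # example voice from the table
)
```

Omitting voice_language and voice_speed yields the defaults listed in the table (en and 1).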

How to Use

  1. Upload your video — drag and drop or paste a public URL. Ensure the face is clearly visible.
  2. Enter your text — type the words you want the character to speak.
  3. Select voice_id — choose a voice character that fits your content.
  4. Choose language — select en for English or zh for Chinese.
  5. Adjust speed (optional) — modify voice_speed to speak faster or slower.
  6. Run — click the button to generate.
  7. Download — preview and save your lip-synced video.
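
The same steps can in principle be driven through the REST inference API rather than the web form. The sketch below only prepares the HTTP request and does not send it. The endpoint URL, the Bearer-token header, and the make_request helper are assumptions made for illustration; check the provider's API reference for the actual endpoint, authentication scheme, and response format.

```python
import json
import urllib.request

# Assumed endpoint, derived from the model path shown on this page.
ASSUMED_ENDPOINT = (
    "https://api.wavespeed.ai/api/v3/kwaivgi/kling-lipsync/text-to-video"
)


def make_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) a JSON POST request for one generation."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        ASSUMED_ENDPOINT,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumption: the API uses Bearer-token authentication.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = make_request("YOUR_API_KEY", {
    "video": "https://example.com/face.mp4",  # step 1: public video URL
    "text": "Hello, welcome to the demo.",    # step 2: text to speak
    "voice_id": "genshin_klee2",              # step 3: voice character
    "voice_language": "en",                   # step 4: language
    "voice_speed": 1,                         # step 5: optional speed
})
```

Once the endpoint and authentication are confirmed, the prepared request could be passed to urllib.request.urlopen to run the generation (step 6).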

Pricing

Flat rate per generation.

Output      Cost
Per video   $0.14
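
Because the rate is flat, budgeting is simple arithmetic. The short Python sketch below (helper names are illustrative) uses Decimal to avoid floating-point rounding and reproduces the figure of roughly 71 runs for $10.

```python
from decimal import Decimal

COST_PER_RUN = Decimal("0.14")  # flat rate per generated video, in USD


def runs_for_budget(budget_usd: str) -> int:
    """Number of whole generations a budget covers."""
    return int(Decimal(budget_usd) // COST_PER_RUN)


def cost_of_runs(count: int) -> Decimal:
    """Total cost in USD for a given number of generations."""
    return COST_PER_RUN * count


print(runs_for_budget("10"))  # 71
print(cost_of_runs(71))       # 9.94
```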

Best Use Cases

  • Content Localization — Dub videos into different languages while maintaining natural lip movements.
  • Social Media & Entertainment — Create fun talking videos, memes, and viral content.
  • E-learning & Training — Generate instructional videos with consistent narration.
  • Marketing & Advertising — Produce multilingual ad variants from a single video shoot.
  • Character Animation — Bring static or animated characters to life with synchronized speech.

Pro Tips for Best Results

  • Use videos with clear, front-facing shots of the face for the most accurate lip sync.
  • Keep text length appropriate for the video duration — shorter clips work best with concise messages.
  • Match the voice character to the visual appearance for more believable results.
  • Test different voice_speed values to find the natural pacing for your content.
  • For multilingual projects, generate separate versions with appropriate voice_language settings.
  • Ensure good lighting on the face in the source video for cleaner lip tracking.
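
The tips on testing voice_speed values and generating separate versions per language can be combined into a small batch of request bodies. This is a minimal sketch: the build_variants helper, the speed values, the Chinese script, and the placeholder voice ID for zh are all hypothetical, and the page does not state which speed range is accepted.

```python
from itertools import product
from typing import Dict, List, Sequence


def build_variants(video: str, scripts: Dict[str, str],
                   voices: Dict[str, str],
                   speeds: Sequence[float]) -> List[dict]:
    """One request body per (language, speed) combination.

    scripts maps a language code to the text to speak;
    voices maps a language code to the voice_id to use.
    """
    variants = []
    for (language, text), speed in product(scripts.items(), speeds):
        variants.append({
            "video": video,
            "text": text,
            "voice_id": voices[language],
            "voice_language": language,
            "voice_speed": speed,
        })
    return variants


variants = build_variants(
    "https://example.com/face.mp4",  # placeholder URL
    scripts={"en": "Welcome to the course.", "zh": "欢迎来到本课程。"},
    voices={"en": "genshin_klee2", "zh": "YOUR_ZH_VOICE_ID"},  # zh ID is a placeholder
    speeds=[0.9, 1.0, 1.1],  # illustrative values only
)
```

Each variant is billed as a separate generation, so a sweep of two languages and three speeds costs six runs.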

Notes

  • If using a URL for the video, ensure it is publicly accessible. A preview thumbnail confirms successful loading.
  • The face must be clearly visible throughout the video for accurate lip synchronization.
  • Processing time may vary based on video length and current queue load.
  • Best results are achieved with videos where the subject is speaking or has a neutral expression.
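
Before submitting a video URL, a quick local check can catch links that clearly cannot be public, such as file paths, localhost, or private network addresses. The sketch below is a heuristic with a hypothetical helper name. It does not resolve DNS or fetch the file, so a passing result does not guarantee that the service can load the video.

```python
import ipaddress
from urllib.parse import urlparse


def looks_publicly_reachable(url: str) -> bool:
    """Heuristic pre-check; does not contact the network."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    host = parsed.hostname
    if host == "localhost":
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal, so treat it as a domain name (unverified).
        return True
    # Reject loopback, private, and other non-global addresses.
    return address.is_global


print(looks_publicly_reachable("https://example.com/face.mp4"))    # True
print(looks_publicly_reachable("http://192.168.1.5/face.mp4"))     # False
print(looks_publicly_reachable("file:///home/user/face.mp4"))      # False
```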