Kwaivgi Kling Lipsync Text To Video

Kling Lipsync Text To Video is a text-to-video model that generates lifelike lip movements synchronized with the given text input.

Features

The lip-sync feature synchronizes the lip movements of characters in videos generated by Kling AI with locally recorded or online-generated dubbing or singing audio, so that the characters appear to be really speaking or singing. With this endpoint, the speech is synthesized from the text input using the selected voice.

Natural, Closely Matched Lip Movements:
The characters' lip movements synchronize precisely with the audio and follow movement trajectories based on each character's facial features and physiological structure, which makes the video look more natural and realistic.

Clear Facial Muscle Texture:
Changes in lip movement drive the facial muscles, which adjust in real time and show the stretching and contraction that accompany speech. The result is a coordinated visual effect that adds to the realism and immersion of the video.

Vivid and Lifelike Imagery:
Areas outside the face remain consistent with the original video. The generation process does not interfere with non-target areas, so the continuity and appearance of the original footage are preserved as far as possible.

Authentication

For authentication details, please refer to the Authentication Guide.

API Endpoints

Submit Task & Query Result


# Submit the task
curl --location --request POST "https://api.wavespeed.ai/api/v3/kwaivgi/kling-lipsync/text-to-video" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
    "video": "https://replicate.delivery/xezq/ipjGAn65es3cfkPhe7g3IvYQqDfbHeCfBof1ujrrMvb3rQAXKA/tmptucl_ok3.mp4",
    "text": "Kling lipsync on WaveSpeedAI is an AI-powered model that generates realistic lip movements from text input.",
    "voice_id": "genshin_klee2",
    "voice_language": "en",
    "voice_speed": 1
}'

# Get the result (set requestId to the data.id value returned by the submit call)
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
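The two curl calls above can also be issued from Python. The following is a minimal sketch using only the standard library, not an official client: the helper names `build_submit_request`, `build_result_request`, and `send` are hypothetical, and only the URLs and headers are taken from the curl example.

```python
import json
import urllib.request

# Base URL and submit path copied from the curl example.
API_BASE = "https://api.wavespeed.ai/api/v3"
SUBMIT_PATH = "/kwaivgi/kling-lipsync/text-to-video"


def build_submit_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build the POST request that submits a lip-sync task (hypothetical helper)."""
    return urllib.request.Request(
        API_BASE + SUBMIT_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def build_result_request(api_key: str, task_id: str) -> urllib.request.Request:
    """Build the GET request that queries a submitted task (hypothetical helper)."""
    return urllib.request.Request(
        f"{API_BASE}/predictions/{task_id}/result",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send a request and decode the JSON body. This performs a network call."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A typical flow would be to call `send(build_submit_request(key, payload))`, read `data.id` from the returned body, and then pass that ID to `build_result_request`. Keeping request construction separate from sending makes the requests easy to inspect without touching the network.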

Parameters

Task Submission Parameters

Request Parameters

| Parameter | Type | Required | Default | Range | Description |
| --- | --- | --- | --- | --- | --- |
| video | string | Yes | - | - | The URL of the video file for generating synchronized lip movements. Video files support .mp4/.mov, file size does not exceed 100MB, video length does not exceed 10s and is not shorter than 2s, only 720p and 1080p are supported, length and width dimensions should both be between 720px and 1920px. |
| text | string | Yes | Kling lipsync on WaveSpeedAI is an AI-powered model that generates realistic lip movements from text input. | - | Text Content for Lip-Sync Video Generation. Max 120 characters. |
| voice_id | string | Yes | genshin_klee2 | genshin_vindi2, zhinen_xuesheng, AOT, ai_shatang, genshin_klee2, genshin_kirara, ai_kaiya, oversea_male1, ai_chenjiahao_712, girlfriend_4_speech02, chat1_female_new-3, chat_0407_5-1, cartoon-boy-07, uk_boy1, cartoon-girl-01, PeppaPig_platform, ai_huangzhong_712, ai_huangyaoshi_712, ai_laoguowang_712, chengshu_jiejie, you_pingjing, calm_story1, uk_man2, laopopo_speech02, heainainai_speech02 | Voice ID to use for speech synthesis |
| voice_language | string | No | en | zh, en | The voice language corresponding to the Voice ID |
| voice_speed | number | No | 1 | 0.8 ~ 2.0 | Speech rate for Text to Video generation |
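Checking a request against these limits before submitting can save a failed round trip. The sketch below is one possible client-side check, not something the API provides: `validate_payload` and `check_video` are hypothetical helpers, the limits are copied from the table above, and reading 100MB as 100 * 1024 * 1024 bytes is an assumption. `check_video` expects the caller to have obtained the video's dimensions, duration, and size by other means.

```python
from typing import List

# Values copied from the Request Parameters table.
VOICE_IDS = {
    "genshin_vindi2", "zhinen_xuesheng", "AOT", "ai_shatang", "genshin_klee2",
    "genshin_kirara", "ai_kaiya", "oversea_male1", "ai_chenjiahao_712",
    "girlfriend_4_speech02", "chat1_female_new-3", "chat_0407_5-1",
    "cartoon-boy-07", "uk_boy1", "cartoon-girl-01", "PeppaPig_platform",
    "ai_huangzhong_712", "ai_huangyaoshi_712", "ai_laoguowang_712",
    "chengshu_jiejie", "you_pingjing", "calm_story1", "uk_man2",
    "laopopo_speech02", "heainainai_speech02",
}
VOICE_LANGUAGES = {"zh", "en"}
MAX_TEXT_CHARS = 120
MIN_SPEED, MAX_SPEED = 0.8, 2.0


def validate_payload(payload: dict) -> List[str]:
    """Return a list of problems; an empty list means the payload looks valid."""
    problems = []
    # video, text, and voice_id are the required string parameters.
    for name in ("video", "text", "voice_id"):
        value = payload.get(name)
        if not isinstance(value, str) or not value:
            problems.append(f"{name} is required and must be a non-empty string")
    text = payload.get("text")
    if isinstance(text, str) and len(text) > MAX_TEXT_CHARS:
        problems.append(f"text has {len(text)} characters; the maximum is {MAX_TEXT_CHARS}")
    voice_id = payload.get("voice_id")
    if isinstance(voice_id, str) and voice_id and voice_id not in VOICE_IDS:
        problems.append(f"unknown voice_id: {voice_id}")
    # Optional parameters fall back to their documented defaults.
    language = payload.get("voice_language", "en")
    if language not in VOICE_LANGUAGES:
        problems.append("voice_language must be zh or en")
    speed = payload.get("voice_speed", 1)
    is_number = isinstance(speed, (int, float)) and not isinstance(speed, bool)
    if not is_number or not MIN_SPEED <= speed <= MAX_SPEED:
        problems.append(f"voice_speed must be a number between {MIN_SPEED} and {MAX_SPEED}")
    return problems


def check_video(width_px: int, height_px: int, duration_s: float, size_bytes: int) -> List[str]:
    """Check video metadata, supplied by the caller, against the documented limits."""
    problems = []
    if not (720 <= width_px <= 1920 and 720 <= height_px <= 1920):
        problems.append("width and height must both be between 720px and 1920px")
    if not 2 <= duration_s <= 10:
        problems.append("duration must be between 2s and 10s")
    if size_bytes > 100 * 1024 * 1024:  # assumption: 1 MB = 1024 * 1024 bytes
        problems.append("file size must not exceed 100MB")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue at once. The server remains the authority on what it accepts.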

Response Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data.id | string | Unique identifier for the prediction, Task Id |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
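After submission, the two fields a client usually needs are `data.id` and `data.urls.get`. The sketch below shows one way to pull them out of the decoded response body. `parse_submit_response` is a hypothetical helper, and `SAMPLE_SUBMIT_RESPONSE` is an illustrative body assembled from the field list above with made-up values, not captured API output.

```python
from typing import Tuple

# Illustrative body built from the documented fields; all values are made up.
SAMPLE_SUBMIT_RESPONSE = {
    "code": 200,
    "message": "success",
    "data": {
        "id": "example-task-id",
        "status": "created",
        "outputs": [],
        "urls": {
            "get": "https://api.wavespeed.ai/api/v3/predictions/example-task-id/result"
        },
        "error": "",
    },
}


def parse_submit_response(body: dict) -> Tuple[str, str]:
    """Return (task_id, result_url) from a decoded submit response."""
    if body.get("code") != 200:
        # Surface the documented code and message fields in the error.
        raise RuntimeError(f"submission failed: {body.get('code')} {body.get('message')}")
    data = body["data"]
    return data["id"], data["urls"]["get"]
```

The returned task ID is the value to use as `requestId` in the result query.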

Result Query Parameters

Result Request Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| id | string | Yes | - | Task ID |

Result Response Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data | object | The prediction data object containing all details |
| data.id | string | Unique identifier for the prediction, the ID of the prediction to get |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |

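Because `data.status` moves through created and processing before reaching completed or failed, a client generally has to query the result more than once. The following is a minimal polling sketch under stated assumptions: `wait_for_outputs` is a hypothetical helper, the 2-second interval and 60-attempt limit are arbitrary choices rather than documented values, and `fetch_result` stands for any callable that performs the result query and returns the decoded JSON body.

```python
import time
from typing import Callable, List


def wait_for_outputs(
    fetch_result: Callable[[], dict],
    interval_s: float = 2.0,   # assumption: not a documented polling interval
    max_attempts: int = 60,    # assumption: not a documented limit
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Poll until the task completes and return data.outputs."""
    for _ in range(max_attempts):
        data = fetch_result()["data"]
        status = data["status"]
        if status == "completed":
            return data["outputs"]
        if status == "failed":
            # data.error carries the error message for failed tasks.
            raise RuntimeError(data.get("error") or "task failed")
        # created or processing: wait before asking again.
        sleep(interval_s)
    raise TimeoutError(f"task did not finish after {max_attempts} attempts")
```

Passing the fetch and sleep functions in as arguments keeps the loop independent of any HTTP library and lets it be exercised with canned responses.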
© 2025 WaveSpeedAI. All rights reserved.