Kwaivgi Kling V2 AI Avatar Pro

Kling V2 AI Avatar Pro generates high-quality AI avatar videos with clean detail, stable motion, and strong identity consistency, making it a good fit for profiles, intros, and social content. It is served through a ready-to-use REST inference API with no cold starts and is billed by audio duration.

Features

Kling-v2-ai-avatar-pro — Talking Avatar from Image + Audio

kling-v2-ai-avatar-pro turns a single portrait into a lip-synced talking-head video driven by your own audio. Upload a clear face image, provide a narration or dialogue track, and the model generates a vertical HD avatar clip that speaks and moves naturally on camera.

🌟 Highlights

Audio-driven performance – Uses your uploaded audio as-is (no TTS), keeping timing, pauses and emotion.
Photo-real talking avatar – Animates the face, eyes and head while preserving the identity from the reference image.
One-shot setup – Just an image + audio; no need for video capture or motion recording.
Portrait-ready output – Produces social-ready vertical video that fits Reels, TikTok, Shorts and story formats.
Prompt-guided styling (optional) – Use prompt to hint at camera feel or mood (e.g. “soft studio lighting, subtle head movement, gentle smile”).

🔧 Parameters

audio* – Required. The voice track that drives lip-sync and timing (URL or upload).
image* – Required. A clear, front-facing portrait of the person to animate.
prompt – Optional text describing style, expression or camera feel. If omitted, the model uses a neutral talking-head style.

Tip: Use a well-lit, unobstructed face (no heavy motion blur, minimal occlusion) for best identity preservation.
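
As a rough illustration of how the three parameters fit together, the sketch below assembles a request body in Python. The helper name build_payload and the example.com URLs are hypothetical, and it assumes that image and audio are passed as URL strings and that leaving out the prompt key is equivalent to omitting the prompt.

```python
import json
from typing import Optional


def build_payload(image: str, audio: str, prompt: Optional[str] = None) -> dict:
    """Assemble the JSON body: image and audio are required, prompt is optional."""
    if not image:
        raise ValueError("image is required")
    if not audio:
        raise ValueError("audio is required")
    payload = {"image": image, "audio": audio}
    if prompt:
        # Assumption: omitting the key falls back to the neutral talking-head style.
        payload["prompt"] = prompt
    return payload


# Placeholder URLs for illustration only, not real assets.
body = build_payload(
    image="https://example.com/portrait.jpg",
    audio="https://example.com/narration.mp3",
    prompt="soft studio lighting, subtle head movement, gentle smile",
)
print(json.dumps(body, indent=2))
```

Validating the two required fields before sending avoids paying a network round trip for a request that is missing an input.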

🚀 How to Use

Upload audio
- Clean mono/stereo track, with minimal background noise.
- Make sure the final edited length matches what you want in the video.
Upload image
- Front or 3/4 view, eyes visible, face not cropped.
- The avatar’s identity and pose come from this image.
(Optional) Add a prompt
- Guide expression or style, e.g.:
  - “confident presenter in a tech promo, subtle head nods”
  - “friendly customer service tone, warm expression”
Run the model
- The video length is automatically derived from the audio duration.
- Download the generated talking-head clip and drop it into your editor or directly onto social platforms.

💰 Pricing

Billing is based on audio duration, with a minimum of 5 seconds.

Audio length (s)	Billed seconds	Price (USD)
0–5	5	0.56
10	10	1.12
20	20	2.24
30	30	3.36
60	60	6.72

Any clip shorter than 5 seconds is still billed as 5 seconds.
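
Every row of the table works out to 0.112 USD per billed second (0.56 / 5), so a cost estimate can be sketched as below. Two things here are assumptions rather than documented behavior: that durations between the listed rows are priced at the same per-second rate, and that fractional seconds are rounded up. The function name estimate_price is hypothetical.

```python
import math
from decimal import Decimal

RATE_PER_SECOND = Decimal("0.112")  # derived from the table: 0.56 USD / 5 s
MIN_BILLED_SECONDS = 5              # clips shorter than this are billed as 5 s


def estimate_price(audio_seconds: float) -> Decimal:
    """Estimate the charge in USD for an audio track of the given length."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    # Assumption: partial seconds round up to the next whole second.
    billed_seconds = max(MIN_BILLED_SECONDS, math.ceil(audio_seconds))
    return RATE_PER_SECOND * billed_seconds


for seconds in (3, 10, 20, 30, 60):
    print(f"{seconds} s of audio: about {estimate_price(seconds)} USD")
```

Decimal is used instead of float so the results compare exactly against the table values.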

🧠 Tips for Best Results

Edit your audio first – Remove mistakes, long silences and background noise before upload.
Match tone to use case – Calm, even delivery for corporate avatars; more expressive reads for ads or UGC.
Keep framing consistent – Use images with similar head size and framing across a campaign for a unified look.
Test a few portraits – Small changes in the reference image (lighting, angle) can noticeably change the avatar’s feel.

More Avatar Tools

See the Avatar Tools collection for related models:

infinitetalk – WaveSpeedAI’s Infinitetalk generates lip-synced talking-head avatar videos from your scripts or audio, ideal for virtual presenters and explainer content.
Infinitetalk-multi – WaveSpeedAI’s Infinitetalk-Multi extends the avatar pipeline to multi-speaker / multi-segment scenarios, making it easier to script dialogues, panel shots, or batch avatar content.
Omni-Human – ByteDance’s Omni-Human 1.5 creates high-fidelity digital humans from images and audio, suitable for realistic virtual hosts, brand ambassadors, and training avatars.

Authentication

For authentication details, please refer to the Authentication Guide.

API Endpoints

Submit Task & Query Result


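# Submit the task (the image and audio URLs below are placeholders; replace them with your own)
curl --location --request POST "https://api.wavespeed.ai/api/v3/kwaivgi/kling-v2-ai-avatar-pro" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
    "image": "https://example.com/portrait.jpg",
    "audio": "https://example.com/narration.mp3",
    "prompt": "soft studio lighting, subtle head movement"
}'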

# Get the result
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
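
For callers who prefer Python over curl, the two calls above might be mirrored with the standard library as sketched here. The endpoint URLs and headers come from the curl commands; the helper names (submit_request, result_request, send, submit_task) and the 60-second timeout are assumptions made for this example, not part of any official client.

```python
import json
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"
MODEL_PATH = "kwaivgi/kling-v2-ai-avatar-pro"


def submit_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build the POST request that submits a task (mirrors the first curl call)."""
    return urllib.request.Request(
        f"{API_BASE}/{MODEL_PATH}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def result_request(api_key: str, request_id: str) -> urllib.request.Request:
    """Build the GET request that fetches a task's result (second curl call)."""
    return urllib.request.Request(
        f"{API_BASE}/predictions/{request_id}/result",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def send(request: urllib.request.Request) -> dict:
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


def submit_task(api_key: str, payload: dict) -> str:
    """Submit a task and return its ID, read from data.id in the response."""
    return send(submit_request(api_key, payload))["data"]["id"]
```

Nothing in this block touches the network until send or submit_task is called, for example submit_task(os.environ["WAVESPEED_API_KEY"], payload) followed by send(result_request(api_key, task_id)).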

Parameters

Task Submission Parameters

Request Parameters

Parameter	Type	Required	Default	Range	Description
image	string	Yes	-	-	The portrait image of the person to animate.
audio	string	Yes	-	-	The audio track that drives lip-sync and timing.
prompt	string	No	-	-	Optional text describing style, expression, or camera feel.

Response Parameters

Parameter	Type	Description
code	integer	HTTP status code (e.g., 200 for success)
message	string	Status message (e.g., “success”)
data.id	string	Unique identifier for the prediction (the task ID)
data.model	string	Model ID used for the prediction
data.outputs	array	Array of URLs to the generated content (empty when status is not `completed`)
data.urls	object	Object containing related API endpoints
data.urls.get	string	URL to retrieve the prediction result
data.has_nsfw_contents	array	Array of boolean values indicating NSFW detection for each output
data.status	string	Status of the task: `created`, `processing`, `completed`, or `failed`
data.created_at	string	ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”)
data.error	string	Error message (empty if no error occurred)
data.timings	object	Object containing timing details
data.timings.inference	integer	Inference time in milliseconds

Result Request Parameters

Parameter	Type	Required	Default	Description
id	string	Yes	-	Task ID

Result Response Parameters

Parameter	Type	Description
code	integer	HTTP status code (e.g., 200 for success)
message	string	Status message (e.g., “success”)
data	object	The prediction data object containing all details
data.id	string	Unique identifier for the prediction (the task ID that was queried)
data.model	string	Model ID used for the prediction
data.outputs	array	Array of URLs to the generated content (empty when status is not `completed`)
data.urls	object	Object containing related API endpoints
data.urls.get	string	URL to retrieve the prediction result
data.status	string	Status of the task: `created`, `processing`, `completed`, or `failed`
data.created_at	string	ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”)
data.error	string	Error message (empty if no error occurred)
data.timings	object	Object containing timing details
data.timings.inference	integer	Inference time in milliseconds
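
Because data.outputs stays empty until data.status reaches `completed`, a client typically has to poll the result endpoint. The loop below is a minimal sketch of that pattern: it takes any callable that returns the decoded result response, so it can be exercised here with simulated responses. The function name wait_for_outputs, the 2-second interval, and the 150-poll limit are assumptions, not documented values.

```python
import time
from typing import Callable, List


def wait_for_outputs(
    fetch_result: Callable[[], dict],
    poll_interval: float = 2.0,
    max_polls: int = 150,
) -> List[str]:
    """Poll until the task completes, then return the URLs in data.outputs."""
    for _ in range(max_polls):
        data = fetch_result()["data"]
        status = data["status"]
        if status == "completed":
            return data["outputs"]
        if status == "failed":
            raise RuntimeError(data.get("error") or "task failed")
        # "created" and "processing" mean the task is still running.
        time.sleep(poll_interval)
    raise TimeoutError("task did not finish within the polling budget")


# Simulated responses so the sketch runs without network access.
fake_responses = iter([
    {"code": 200, "message": "success",
     "data": {"status": "processing", "outputs": [], "error": ""}},
    {"code": 200, "message": "success",
     "data": {"status": "completed",
              "outputs": ["https://example.com/avatar.mp4"], "error": ""}},
])
urls = wait_for_outputs(lambda: next(fake_responses), poll_interval=0)
print(urls)
```

In real use, fetch_result would wrap a GET to the result endpoint for a specific task ID.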
