Kling V2 AI Avatar Pro | AI Digital Human API

Kling-v2-ai-avatar-pro — Talking Avatar from Image + Audio

kling-v2-ai-avatar-pro turns a single portrait into a lip-synced talking-head video driven by your own audio. Upload a clear face image, provide a narration or dialogue track, and the model generates a vertical HD avatar clip that speaks and moves naturally on camera.

🌟 Highlights

Audio-driven performance – Uses your uploaded audio as-is (no TTS), keeping timing, pauses and emotion.
Photo-real talking avatar – Animates the face, eyes and head while preserving the identity from the reference image.
One-shot setup – Just an image + audio; no need for video capture or motion recording.
Portrait-ready output – Produces social-ready vertical video that fits Reels, TikTok, Shorts and story formats.
Prompt-guided styling (optional) – Use prompt to hint at camera feel or mood (e.g. “soft studio lighting, subtle head movement, gentle smile”).

🔧 Parameters

audio* – Required. The voice track that drives lip-sync and timing (URL or upload).
image* – Required. A clear, front-facing portrait of the person to animate.
prompt – Optional text describing style, expression or camera feel. If omitted, the model uses a neutral talking-head style.

Tip: Use a well-lit, unobstructed face (no heavy motion blur, minimal occlusion) for best identity preservation.
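The three parameters above map directly onto the JSON request body. The following is a minimal sketch of a payload builder, assuming inputs are passed as publicly reachable http(s) URLs (as in the quick-start examples on this page). The helper name `build_payload` is hypothetical and not part of any WaveSpeedAI SDK.

```python
from typing import Optional
from urllib.parse import urlparse


def build_payload(image_url: str, audio_url: str, prompt: Optional[str] = None) -> dict:
    """Build the JSON body for kling-v2-ai-avatar-pro (hypothetical helper).

    Assumption: both inputs are given as http(s) URLs.
    """
    for name, value in (("image", image_url), ("audio", audio_url)):
        parsed = urlparse(value)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"{name} must be an http(s) URL, got {value!r}")

    payload = {"image": image_url, "audio": audio_url}
    # prompt is optional; omit the key entirely when no prompt is given.
    if prompt:
        payload["prompt"] = prompt
    return payload
```

Catching a missing or malformed required input locally avoids a round trip to the API for a request that cannot succeed.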

🚀 How to Use

Upload audio

Clean mono/stereo track, with minimal background noise.
Make sure the final edited length matches what you want in the video.

Upload image

Front or 3/4 view, eyes visible, face not cropped.
The avatar’s identity and pose come from this image.

(Optional) Add a prompt

Guide expression or style, e.g.:
“confident presenter in a tech promo, subtle head nods”
“friendly customer service tone, warm expression”

Run the model

The video length is automatically derived from the audio duration.
Download the generated talking-head clip and drop it into your editor or directly onto social platforms.
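Because the video length follows the audio duration, it can help to measure the track before uploading. The sketch below uses Python's standard `wave` module, so it only handles uncompressed PCM WAV files; MP3 or other compressed formats would need a third-party library. The function name `wav_duration_seconds` is illustrative.

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        # Duration = number of audio frames / frames per second.
        return wav.getnframes() / wav.getframerate()
```

The result is a reasonable preview of the clip length and of the seconds that will be billed.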

💰 Pricing

Billing is based on audio duration, with a minimum of 5 seconds.

Audio length (s)	Billed seconds	Price (USD)
0–5	5	0.56
10	10	1.12
20	20	2.24
30	30	3.36
60	60	6.72

Any clip shorter than 5 seconds is still billed as 5 seconds.
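Every row in the table works out to 0.112 USD per billed second. A rough estimator under that reading is sketched below. How fractional seconds are billed is not stated on this page, so rounding up to the next whole second is an assumption, and `estimate_cost` is a hypothetical helper.

```python
import math
from decimal import Decimal

PRICE_PER_SECOND = Decimal("0.112")  # derived from the table: 0.56 USD / 5 s
MIN_BILLED_SECONDS = 5               # clips under 5 s are billed as 5 s


def estimate_cost(audio_seconds: float) -> Decimal:
    """Estimate the charge in USD for a given audio duration."""
    if audio_seconds < 0:
        raise ValueError("audio_seconds must not be negative")
    # Assumption: fractional seconds round up to the next whole second.
    billed = max(MIN_BILLED_SECONDS, math.ceil(audio_seconds))
    return PRICE_PER_SECOND * billed
```

For example, `estimate_cost(3)` equals 0.56 and `estimate_cost(60)` equals 6.72, matching the table. The live price shown next to the Generate button remains the authoritative figure.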

🧠 Tips for Best Results

Edit your audio first – Remove mistakes, long silences and background noise before upload.
Match tone to use case – Calm, even delivery for corporate avatars; more expressive reads for ads or UGC.
Keep framing consistent – Use images with similar head size and framing across a campaign for a unified look.
Test a few portraits – Small changes in the reference image (lighting, angle) can noticeably change the avatar’s feel.

More Avatar Tools

See the Avatar Tools collection for related models:

infinitetalk – WaveSpeedAI’s Infinitetalk generates lip-synced talking-head avatar videos from your scripts or audio, ideal for virtual presenters and explainer content.
Infinitetalk-multi – WaveSpeedAI’s Infinitetalk-Multi extends the avatar pipeline to multi-speaker / multi-segment scenarios, making it easier to script dialogues, panel shots, or batch avatar content.
Omni-Human – Omni-Human 1.5 creates high-fidelity digital humans from images and audio, suitable for realistic virtual hosts, brand ambassadors, and training avatars.

Kling v2 Ai Avatar Pro API — Quick start

Grab a WaveSpeedAI API key, then call POST https://api.wavespeed.ai/api/v3/kwaivgi/kling-v2-ai-avatar-pro with your input as JSON. The endpoint returns a prediction id. Poll the result endpoint, starting at roughly one request every 2 seconds and increasing the interval for long-running tasks, and stop on any terminal status (completed, failed, cancelled or timeout in the examples below). When the status is completed, read the output values from data.outputs. The examples below show the full submit-and-poll flow for Kling v2 Ai Avatar Pro.

cURL example (bash, requires curl and jq)

#!/usr/bin/env bash
# Exit on errors, unset variables and failed pipeline stages.
set -euo pipefail

: "${WAVESPEED_API_KEY:?Set WAVESPEED_API_KEY}"

REQUEST_BODY=$(cat <<'JSON'
{
    "image": "https://interactive-examples.mdn.mozilla.net/media/cc0-images/painted-hand-298-332.jpg",
    "audio": "https://interactive-examples.mdn.mozilla.net/media/cc0-audio/t-rex-roar.mp3"
}
JSON
)

# 1. Submit the prediction.
SUBMIT_RESPONSE=$(curl --silent --show-error --fail-with-body \
  -X POST "https://api.wavespeed.ai/api/v3/kwaivgi/kling-v2-ai-avatar-pro" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY" \
  -d "$REQUEST_BODY")

TASK=$(printf '%s' "$SUBMIT_RESPONSE" | jq 'if has("data") then .data else . end')
PREDICTION_ID=$(printf '%s' "$TASK" | jq -r '.id')
if [ -z "$PREDICTION_ID" ] || [ "$PREDICTION_ID" = "null" ]; then
  printf 'Submission response did not contain a prediction id\n' >&2
  exit 1
fi
RESULT_URL=$(printf '%s' "$TASK" | jq -r '.urls.get // empty')
if [ -z "$RESULT_URL" ]; then
  RESULT_URL="https://api.wavespeed.ai/api/v3/predictions/$PREDICTION_ID/result"
fi

# 2. Poll until the prediction finishes.
while true; do
  RESPONSE=$(curl --silent --show-error --fail-with-body "$RESULT_URL" \
    -H "Authorization: Bearer $WAVESPEED_API_KEY")
  RESULT=$(printf '%s' "$RESPONSE" | jq 'if has("data") then .data else . end')
  STATUS=$(printf '%s' "$RESULT" | jq -r '.status')
  case "$STATUS" in
    completed) printf '%s\n' "$RESULT" | jq '.outputs'; break ;;
    failed|cancelled|timeout) printf '%s\n' "$RESULT" | jq . >&2; exit 1 ;;
    created|processing) sleep 2 ;;
    *) printf 'Unexpected status: %s\n' "$STATUS" >&2; exit 1 ;;
  esac
done

Node.js example

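// Run as an ES module (e.g. a .mjs file) on Node.js 18+: the script uses top-level await and the built-in fetch.
const submitUrl = "https://api.wavespeed.ai/api/v3/kwaivgi/kling-v2-ai-avatar-pro";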
const apiKey = process.env.WAVESPEED_API_KEY;
if (!apiKey) throw new Error('Set WAVESPEED_API_KEY');

async function requestJson(url, options = {}) {
  const response = await fetch(url, options);
  if (!response.ok) throw new Error(await response.text());
  return response.json();
}

// 1. Submit the prediction.
const body = await requestJson(submitUrl, {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${apiKey}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    "image": "https://interactive-examples.mdn.mozilla.net/media/cc0-images/painted-hand-298-332.jpg",
    "audio": "https://interactive-examples.mdn.mozilla.net/media/cc0-audio/t-rex-roar.mp3"
  }),
});
const task = body.data ?? body;
if (!task.id) throw new Error("Submission response did not contain a prediction id");
const resultUrl = task.urls?.get ||
  `https://api.wavespeed.ai/api/v3/predictions/${task.id}/result`;

// 2. Poll until the prediction finishes.
while (true) {
  const resultBody = await requestJson(resultUrl, {
    headers: { "Authorization": `Bearer ${apiKey}` },
  });
  const result = resultBody.data ?? resultBody;
  if (result.status === "completed") {
    console.log(result.outputs);
    break;
  }
  if (["failed", "cancelled", "timeout"].includes(result.status)) throw new Error(JSON.stringify(result));
  if (!["created", "processing"].includes(result.status)) throw new Error("Unexpected status: " + result.status);
  await new Promise(resolve => setTimeout(resolve, 2000));
}

Python example

import json
import os
import time
from urllib.request import Request, urlopen

api_key = os.environ["WAVESPEED_API_KEY"]
headers = {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}
payload = {
    "image": "https://interactive-examples.mdn.mozilla.net/media/cc0-images/painted-hand-298-332.jpg",
    "audio": "https://interactive-examples.mdn.mozilla.net/media/cc0-audio/t-rex-roar.mp3"
}

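def request_json(url, data=None):
    # POST when a body is supplied, otherwise GET.
    request = Request(url, data=data, headers=headers, method="POST" if data else "GET")
    # The 30-second timeout is an arbitrary choice that keeps a stalled connection from hanging the script.
    with urlopen(request, timeout=30) as response:
        return json.load(response)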

# 1. Submit the prediction.
body = request_json("https://api.wavespeed.ai/api/v3/kwaivgi/kling-v2-ai-avatar-pro", json.dumps(payload).encode())
task = body.get("data", body)
if not task.get("id"):
    raise RuntimeError("Submission response did not contain a prediction id")
result_url = task.get("urls", {}).get("get") or f"https://api.wavespeed.ai/api/v3/predictions/{task['id']}/result"

# 2. Poll until the prediction finishes.
while True:
    result_body = request_json(result_url)
    result = result_body.get("data", result_body)
    status = result.get("status")
    if status == "completed":
        print(result.get("outputs", []))
        break
    if status in {"failed", "cancelled", "timeout"}:
        raise RuntimeError(result)
    if status not in {"created", "processing"}:
        raise RuntimeError(f"Unexpected status: {status}")
    time.sleep(2)
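The quick start advises increasing the polling interval for long-running tasks, while the examples above sleep a fixed 2 seconds. One way to do that is a capped geometric backoff, sketched below. The growth factor and the 15-second cap are illustrative assumptions, not values documented by WaveSpeedAI, and `poll_intervals` is a hypothetical helper.

```python
from itertools import islice
from typing import Iterator


def poll_intervals(start: float = 2.0, factor: float = 1.5, cap: float = 15.0) -> Iterator[float]:
    """Yield sleep durations that grow geometrically from start up to cap."""
    delay = start
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


# First few delays with the default settings.
print(list(islice(poll_intervals(), 7)))
# → [2.0, 3.0, 4.5, 6.75, 10.125, 15.0, 15.0]
```

To use it in the Python example, create `intervals = poll_intervals()` before the loop and replace `time.sleep(2)` with `time.sleep(next(intervals))`.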

Kling v2 Ai Avatar Pro API — Frequently asked questions

What is the Kling v2 Ai Avatar Pro API?

Kling v2 Ai Avatar Pro is a Kuaishou model for talking-avatar generation, exposed as a REST API on WaveSpeedAI. It generates AI avatar videos with clean detail, stable motion, and strong identity consistency, which suits profiles, intros, and social content. WaveSpeedAI hosts it as a ready-to-use inference endpoint with no cold starts. You can call it programmatically or try it from the playground above.

How do I call the Kling v2 Ai Avatar Pro API?

POST your input parameters to the model's REST endpoint (shown in the API tab of this playground) with your WaveSpeedAI API key in the Authorization header. Submission returns a prediction ID. Poll the result endpoint starting around every 2 seconds, increase the interval for long-running tasks, and stop on any terminal status. The playground generates production-oriented Python, JavaScript, and cURL examples with timeouts, transient-error handling, and safe GET retries. Full request/response shape is documented at https://wavespeed.ai/docs/docs-api/kwaivgi/kwaivgi-kling-v2-ai-avatar-pro.
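The transient-error handling and safe GET retries mentioned above can be approximated with a small wrapper. This is a sketch only, not the code the playground generates. The set of status codes treated as transient and the delay schedule are assumptions, and `retry_get` and `is_transient` are hypothetical names. It is meant for the GET polling call only, since retrying the POST could presumably submit a second prediction.

```python
import time
from typing import Callable, TypeVar
from urllib.error import HTTPError, URLError

T = TypeVar("T")


def is_transient(error: Exception) -> bool:
    """Assumption: rate limits, 5xx responses and network errors are retryable."""
    if isinstance(error, HTTPError):  # check first: HTTPError subclasses URLError
        return error.code in {429, 500, 502, 503, 504}
    return isinstance(error, (URLError, TimeoutError))


def retry_get(fetch: Callable[[], T], attempts: int = 3, base_delay: float = 1.0,
              sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fetch(), retrying transient failures with exponential delays."""
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception as error:
            # Give up on the last attempt or on non-transient errors.
            if attempt == attempts - 1 or not is_transient(error):
                raise
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

With the Python quick-start example, the polling call would become `retry_get(lambda: request_json(result_url))`.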

How much does Kling v2 Ai Avatar Pro cost per run?

Kling v2 Ai Avatar Pro starts at $0.56 per run, which covers the 5-second billing minimum. Beyond that, the charge scales with audio duration at the rate shown in the Pricing table ($0.112 per billed second, e.g. $1.12 for 10 seconds and $6.72 for 60 seconds). The exact cost for your current input is shown live next to the Generate button before you submit, and the actual per-call charge is recorded on the prediction afterwards.

What inputs does Kling v2 Ai Avatar Pro accept?

Key inputs: `prompt`, `image`, `audio`. The full JSON schema (types, defaults, allowed values) is rendered above the Generate button and mirrored in the API reference at https://wavespeed.ai/docs/docs-api/kwaivgi/kwaivgi-kling-v2-ai-avatar-pro.

How long does Kling v2 Ai Avatar Pro take to generate?

Median end-to-end generation time on WaveSpeedAI is around 283 seconds per request, based on recent successful runs. Queue time varies with global demand; live status is visible in the prediction record.

Can I use Kling v2 Ai Avatar Pro outputs commercially?

Commercial usage rights depend on the model's license, set by its provider (Kuaishou). The license summary appears on the model card above; see WaveSpeedAI's Terms of Service for platform-level conditions.
