image-to-video

Kling AI Avatar

kwaivgi/kling-v1-ai-avatar-standard

Kling AI Avatar generates AI video avatars for digital identity and content creation. On-demand video is billed at $0.25 per 5 seconds. The model is available through a ready-to-use REST API with no cold starts.

Each run costs at least $0.25, which is the 5-second minimum charge.

For $10 you can run this model approximately 40 times at that minimum length.

Kwaivgi Kling v1 AI Avatar Standard: Audio-Driven Talking Portrait

Turn a single portrait into a natural talking-head video driven by your audio. The Standard tier focuses on clean lip-sync and stable identity at a budget-friendly rate—great for explainers, support avatars, internal training, and product demos.

Highlights

  • Phoneme-aligned lip-sync with natural eye blinks and head motion
  • Identity-preserving generation from one image
  • Works with real recordings or TTS audio
  • Optional prompt to nudge framing, background vibe, or style
  • Fast, reliable outputs suitable for everyday production

Parameters

  • audio (required): speech track; duration determines the clip length
  • image (required): clear, front-facing portrait (URL or upload)
  • prompt (optional): short guidance for mood, background, or framing
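The three parameters above can be assembled into a request body along the following lines. This is a minimal sketch, not the service's documented request format: the field names `audio`, `image`, and `prompt` come from the list above, while the helper name `build_avatar_payload`, the JSON shape, and the example URLs are assumptions made for illustration.

```python
import json
from typing import Dict, Optional


def build_avatar_payload(audio: str, image: str, prompt: Optional[str] = None) -> Dict[str, str]:
    """Assemble the input fields for one avatar request (hypothetical helper).

    `audio` and `image` are required; `prompt` is left out when not given.
    """
    if not audio:
        raise ValueError("audio is required")
    if not image:
        raise ValueError("image is required")

    payload = {"audio": audio, "image": image}
    if prompt:
        # Optional short guidance for mood, background, or framing.
        payload["prompt"] = prompt
    return payload


# Placeholder URLs, not real assets.
example = build_avatar_payload(
    audio="https://example.com/voice.wav",
    image="https://example.com/portrait.png",
    prompt="soft office background, medium close-up",
)
print(json.dumps(example, indent=2))
```

Leaving `prompt` out of the body when it is empty keeps the request limited to the two required fields.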

Recommended inputs

  • Portrait: even lighting, minimal occlusion, 512 px or larger
  • Audio: clean voice, 16–48 kHz, avoid heavy music/reverb
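The measurable parts of these recommendations can be checked before upload. The sketch below is illustrative only: it assumes "512 px or larger" refers to the shorter side of the portrait, and the function name `check_recommended_inputs` is hypothetical. Lighting, occlusion, and audio cleanliness still need a manual check.

```python
from typing import List

MIN_PORTRAIT_SIDE_PX = 512      # from "512 px or larger" (shorter side is an assumption)
MIN_SAMPLE_RATE_HZ = 16_000     # from "16–48 kHz"
MAX_SAMPLE_RATE_HZ = 48_000


def check_recommended_inputs(width_px: int, height_px: int, sample_rate_hz: int) -> List[str]:
    """Return a list of warnings; an empty list means the numeric checks pass."""
    warnings = []
    if min(width_px, height_px) < MIN_PORTRAIT_SIDE_PX:
        warnings.append(
            f"portrait is {width_px}x{height_px}; the shorter side is under {MIN_PORTRAIT_SIDE_PX} px"
        )
    if not MIN_SAMPLE_RATE_HZ <= sample_rate_hz <= MAX_SAMPLE_RATE_HZ:
        warnings.append(
            f"audio sample rate {sample_rate_hz} Hz is outside the 16-48 kHz range"
        )
    return warnings


for warning in check_recommended_inputs(400, 600, 8_000):
    print("warning:", warning)
```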

How to Use

  1. Upload the audio file or paste its URL.
  2. Upload the portrait image or paste its URL.
  3. (Optional) Add a brief prompt to describe background tone or framing.
  4. Press Run and download the generated avatar video.

Tips

  • Trim long silences to reduce cost and tighten pacing.
  • Keep headroom consistent across images if you plan a series.
  • Use a high-quality mic or TTS for crisp consonants and better lip-sync.

Pricing

Price per second: $0.05

Billing rules

  1. Minimum charge: 5 seconds.
  2. Maximum billable length: 600 seconds (10 minutes) → $30.00 cap.
  3. Currency rounding: totals are rounded to the nearest cent.
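The billing rules above can be turned into a quick cost estimate. The sketch below assumes that fractional seconds are billed proportionally and that "nearest cent" means half-up rounding; neither detail is stated on this page, and `estimate_cost` is a hypothetical helper name. It also treats 600 seconds purely as a billing cap and does not say whether longer audio is accepted.

```python
from decimal import Decimal, ROUND_HALF_UP

PRICE_PER_SECOND = Decimal("0.05")
MIN_BILLABLE_SECONDS = Decimal("5")
MAX_BILLABLE_SECONDS = Decimal("600")


def estimate_cost(audio_seconds: float) -> Decimal:
    """Estimate the charge in USD for a clip driven by audio of the given length."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")

    seconds = Decimal(str(audio_seconds))
    # Apply the 5-second minimum and the 600-second maximum billable length.
    billable = min(max(seconds, MIN_BILLABLE_SECONDS), MAX_BILLABLE_SECONDS)
    # Round the total to the nearest cent (half-up is an assumption).
    return (billable * PRICE_PER_SECOND).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


for length in (3, 5, 42, 600, 900):
    print(f"{length:>4} s -> ${estimate_cost(length)}")
```

Under these assumptions a 3-second clip is charged the $0.25 minimum, a 42-second clip comes to $2.10, and anything at or beyond 600 seconds comes to $30.00.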