Kling Models

Kuaishou's Kling delivers state-of-the-art AI video generation with superior realism.

All Models

52 models
motion-control

kwaivgi/kling-v2.6-pro/motion-control

Kling 2.6 Pro Motion Control turns reference motion clips (dance, action, gesture) into smooth, realistic animations. Upload a character image (or source video) and a motion video; the model transfers the movement while preserving identity and temporal consistency. Ready-to-use REST API with fast response, native-audio option, no cold starts, and affordable pricing.

image-to-video

kwaivgi/kling-v3.0-pro/image-to-video

Kling 3.0 Pro delivers top-tier image-to-video generation with smooth motion, cinematic visuals, accurate prompt adherence, and native audio for ready-to-share clips. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

motion-control

kwaivgi/kling-v3.0-std/motion-control

Kling 3.0 Standard Motion Control transfers motion from reference videos to animate still images. Upload a character image and a motion clip (dance, action, gesture), and the model extracts the movement to generate smooth, realistic video. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

text-to-video

kwaivgi/kling-v3.0-pro/text-to-video

Kling 3.0 Pro delivers top-tier text-to-video generation with smooth motion, cinematic visuals, accurate prompt adherence, and native audio for ready-to-share clips. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

image-to-video

kwaivgi/kling-v3.0-std/image-to-video

Kling 3.0 Standard delivers high-quality image-to-video generation with smooth motion, cinematic visuals, accurate prompt adherence, and native audio for ready-to-share clips. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

text-to-video

kwaivgi/kling-v3.0-std/text-to-video

Kling 3.0 Standard delivers high-quality text-to-video generation with smooth motion, cinematic visuals, accurate prompt adherence, and native audio for ready-to-share clips. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

image-to-image

kwaivgi/kling-image-v3/edit

Kling V3 Edit is an AI model for editing and transforming images via text prompts, enabling precise modifications with natural-language instructions. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

text-to-image

kwaivgi/kling-image-v3/text-to-image

Kling V3.0 is Kuaishou's latest AI image generation model with superior text-to-image capabilities, delivering high-quality visuals with accurate prompt adherence. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video

kwaivgi/kling-v2.6-std/image-to-video

Kling 2.6 Standard offers cost-effective image-to-video generation with smooth motion, cinematic visuals, and accurate prompt adherence. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

text-to-video

kwaivgi/kling-v2.6-std/text-to-video

Kling 2.6 Standard offers cost-effective text-to-video generation with smooth motion, cinematic visuals, and strong prompt adherence. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

motion-control

kwaivgi/kling-v2.6-std/motion-control

Kling 2.6 Standard Motion Control transfers motion from reference videos to animate still images. Upload a character image and a motion clip (dance, action, gesture), and the model extracts the movement to generate smooth, realistic video. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

image-to-video

kwaivgi/kling-v2.5-turbo-std/image-to-video

Kling 2.5 Turbo Std delivers image-to-video generation with fluid motion, cinematic visuals, and precise prompt adherence at 25% lower pricing than Kling 2.1 Std. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

image-to-image

kwaivgi/kling-image-o1

Kling Omni Image O1 is Kuaishou's multi-modal image generation model with MVL (Multi-modal Visual Language) technology. It supports up to 10 reference images for feature consistency, precise detail editing (add/remove/modify), style control, and series content creation, which suits IP character design, comic panels, and brand merchandise. Ready-to-use REST API, best performance, no cold starts, affordable pricing.

image-to-video

kwaivgi/kling-video-o1-std/image-to-video

Kling Omni Video O1 Image-to-Video (Standard) turns static images into dynamic, high-quality videos while preserving subject identity and visual/temporal consistency. It adds natural motion, realistic physics, and smooth scene dynamics, and supports flexible clip durations when reference frames are provided. Built for stable production use and cost efficiency with a ready-to-use REST API, fast response, no cold starts, and predictable pricing.

image-to-video

kwaivgi/kling-video-o1-std/reference-to-video

Kling Omni Video O1 (Standard) Reference-to-Video generates creative videos using character, prop, or scene references from multiple viewpoints. Extracts subject features and creates new video content while maintaining identity consistency across frames. Ready-to-use REST API, best performance, no coldstarts, affordable pricing.

video-to-video

kwaivgi/kling-video-o1-std/video-edit

Kling Omni Video O1 Video-Edit (Standard) enables natural-language video edits: remove or replace objects, swap backgrounds, restyle scenes, change weather/lighting, and apply localized 3–10s transformations with strong temporal consistency. Built for stable production use with a ready-to-use REST API, no cold starts, and predictable pricing.

image-to-video

kwaivgi/kling-video-o1/reference-to-video

Kling Omni Video O1 Reference-to-Video generates creative videos using character, prop, or scene references from multiple viewpoints. Extracts subject features and creates new video content while maintaining identity consistency across frames. Ready-to-use REST API, best performance, no cold starts, affordable pricing.

text-to-video

kwaivgi/kling-video-o1/text-to-video

Kling Omni Video O1 is Kuaishou's first unified multi-modal video model with MVL (Multi-modal Visual Language) technology. Text-to-Video mode generates cinematic videos from text prompts with subject consistency, natural physics simulation, and precise semantic understanding. Ready-to-use REST API, best performance, no coldstarts, affordable pricing.

video-to-video

kwaivgi/kling-video-o1/video-edit

Kling Omni Video O1 Video-Edit enables conversational video editing through natural language commands. Remove objects, change backgrounds, modify styles, adjust weather/lighting, and transform scenes with simple text instructions like 'remove pedestrians' or 'change daytime to dusk'. Ready-to-use REST API, best performance, no coldstarts, affordable pricing.

video-to-video

kwaivgi/kling-video-o1/video-edit-fast

Kling Omni Video O1 Video-Edit Fast enables conversational video editing through natural language commands. Remove objects, change backgrounds, modify styles, adjust weather/lighting, and transform scenes with simple text instructions like 'remove pedestrians' or 'change daytime to dusk'. Ready-to-use REST API, best performance, no cold starts, affordable pricing.

image-to-video

kwaivgi/kling-v2.5-turbo-pro/image-to-video

Kling 2.5 Turbo Pro converts images to cinematic videos with fluid motion, dynamic effects, and precise prompt-driven motion for seamless transitions. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

text-to-video

kwaivgi/kling-v2.5-turbo-pro/text-to-video

Kling 2.5 Turbo Pro is a Text-to-Video model that delivers cinematic visuals, fluid motion, and precise prompt-to-motion responsiveness. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video

kwaivgi/kling-video-o1/image-to-video

Kling Omni Video O1 Image-to-Video transforms static images into dynamic cinematic videos using MVL (Multi-modal Visual Language) technology. Maintains subject consistency while adding natural motion, physics simulation, and seamless scene dynamics. Ready-to-use REST API, best performance, no coldstarts, affordable pricing.

video-dubbing

kwaivgi/kling-video-to-audio

Kling Video-to-Audio auto-generates or extracts matching sound effects and audio tracks from video using KlingAI's audio generation model. Ready-to-use REST API, best performance, no coldstarts, affordable pricing.

digital-human

kwaivgi/kling-v1-ai-avatar-standard

Kling AI Avatar produces stunning AI-generated video avatars for digital identity and content creation, with on-demand video billed at $0.25 per 5 seconds. Ready-to-use REST API, no coldstarts, affordable pricing.

image-to-video

kwaivgi/kling-effects

Kling Effects creates 5-second videos from a single image in styles from futuristic to realistic for social and product demos. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video

kwaivgi/kling-elements

Kling Elements creates custom AI elements from reference images for video generation. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video

kwaivgi/kling-v2.1-i2v-master

Kling 2.1 Master is a premium image-to-video endpoint delivering fluid motion, cinematic visuals, and precise prompt-driven control. Ready-to-use REST API, best performance, no coldstarts, affordable pricing.

text-to-video

kwaivgi/kling-v2.1-t2v-master

Kling v2.1 creates cinematic 5-10s videos at 720p or 1080p from a single image or text prompt with improved motion fidelity and coherence. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video

kwaivgi/kling-v2.0-i2v-master

Kling 2.0 Master elevates image-to-video with improved prompt understanding, richer character motion, better visuals, and a Multi-Elements Editor. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

text-to-video

kwaivgi/kling-v2.0-t2v-master

Kling 2.0 Master is a Text-to-Video model featuring a Multi-Elements Editor, improved prompt understanding, and refined character motion. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

text-to-audio

kwaivgi/kling-text-to-audio

Kling Text-to-Audio turns text prompts into custom sound effects for videos, games, and multimedia using KlingAI's audio model. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-image

kwaivgi/kling-v1/ai-multi-shot

Kling V1 AI Multi-Shot delivers top-tier image-to-image generation with cinematic visuals, accurate prompt adherence, and multi-shot consistency for ready-to-share images. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video

kwaivgi/kling-v2.1-i2v-pro

Kling 2.1 Pro converts images to professional cinematic videos with enhanced fidelity, precise camera moves and dynamic motion control. Ready-to-use REST inference API, top performance, no coldstarts, affordable pricing.

image-to-video

kwaivgi/kling-v2.6-pro/image-to-video

Kling 2.6 Pro delivers top-tier image-to-video generation with smooth motion, cinematic visuals, accurate prompt adherence, and native audio for ready-to-share clips. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

text-to-video

kwaivgi/kling-v2.6-pro/text-to-video

Kling 2.6 Pro delivers top-tier text-to-video generation with smooth motion, cinematic visuals, strong prompt adherence, and native audio for ready-to-share clips. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

motion-control

kwaivgi/kling-v3.0-pro/motion-control

Kling 3.0 Pro Motion Control transfers motion from reference videos to animate still images. Upload a character image and a motion clip (dance, action, gesture), and the model extracts the movement to generate smooth, realistic video. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

digital-human

kwaivgi/kling-lipsync/audio-to-video

Kling LipSync converts audio into talking head video by generating lifelike lip movements perfectly synced to the input audio. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

digital-human

kwaivgi/kling-lipsync/text-to-video

Kling LipSync Text-to-Video by Kwaivgi creates videos with lifelike lip movements that precisely sync to the input text for natural speaking visuals. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

audio-to-audio

kwaivgi/kling-v2.6/create-voice

Kling 2.6 Create Voice generates a custom voice from an uploaded audio file. The resulting voice can be used with the voice control feature in V2.6 video generation. The audio should be clean and noise-free, contain a single voice, and be between 5 and 30 seconds long. Built for stable production use with a ready-to-use REST API, no cold starts, and predictable pricing.
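
The length requirement above can be checked locally before uploading. This is a minimal sketch, assuming the sample is an uncompressed WAV file. The function names are hypothetical and not part of any Kling API, and the check covers only the 5 to 30 second rule, not noise level or speaker count.

```python
import wave

MIN_SECONDS = 5.0   # lower bound stated in the model description
MAX_SECONDS = 30.0  # upper bound stated in the model description


def wav_duration_seconds(path: str) -> float:
    """Return the playback length of a WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_valid_voice_sample(path: str) -> bool:
    """Check only the duration rule; audio cleanliness is not inspected."""
    return MIN_SECONDS <= wav_duration_seconds(path) <= MAX_SECONDS
```

Compressed formats such as MP3 would need a different reader; the standard-library `wave` module handles PCM WAV only.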

image-to-video

kwaivgi/kling-v1.6-multi-i2v-pro

Kling 1.6 Multi Pro boosts image-to-video generation by 195% vs Kling 1.5, with improved prompt understanding, physics and visuals. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video

kwaivgi/kling-v1.6-multi-i2v-standard

Kling v1.6 Image-to-Video delivers 195% better results than Kling 1.5, with better prompt understanding, physics, and visual effects. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video

kwaivgi/kling-v2.1-i2v-pro/start-end-frame

Kling v2.1 I2V Pro Start-End Frame generates cinematic Image-to-Video clips with precise start/end frame control, enhanced visual fidelity, and dynamic camera motion. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

text-to-video

kwaivgi/kling-video-o1-std/text-to-video

Kling Omni Video O1 (Standard) is Kuaishou's first unified multi-modal video model with MVL (Multi-modal Visual Language) technology. Text-to-Video mode generates cinematic videos from text prompts with subject consistency, natural physics simulation, and precise semantic understanding. Ready-to-use REST API, best performance, no coldstarts, affordable pricing.

digital-human

kwaivgi/kling-v1-ai-avatar-pro

Kling AI Avatar Pro converts audio into talking video portraits; pricing is $1 for the first 5s then $0.20/s up to 600s. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.
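
The pricing rule above can be worked through as arithmetic. This is a rough sketch, assuming the first 5 seconds are a flat $1 regardless of how much of that block is used and that fractional seconds are rounded up to whole seconds; neither assumption is confirmed by the listing, and the function name is hypothetical.

```python
import math

FIRST_BLOCK_SECONDS = 5
FIRST_BLOCK_CENTS = 100   # $1 for the first 5 s
PER_SECOND_CENTS = 20     # $0.20 for each second after the first 5 s
MAX_SECONDS = 600         # stated upper limit


def avatar_pro_cost_cents(duration_seconds: float) -> int:
    """Estimate the cost in cents for one Avatar Pro video."""
    if duration_seconds <= 0 or duration_seconds > MAX_SECONDS:
        raise ValueError("duration must be greater than 0 and at most 600 seconds")
    billed = math.ceil(duration_seconds)  # assumption: round up to whole seconds
    extra = max(0, billed - FIRST_BLOCK_SECONDS)
    return FIRST_BLOCK_CENTS + extra * PER_SECOND_CENTS


print(avatar_pro_cost_cents(12) / 100)   # 2.4
print(avatar_pro_cost_cents(600) / 100)  # 120.0
```

Working in integer cents avoids floating-point drift when many estimates are summed.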

digital-human

kwaivgi/kling-v2-ai-avatar-pro

Kling V2 AI Avatar Pro generates high-quality AI avatar videos with clean detail, stable motion, and strong identity consistency—ideal for profiles, intros, and social content. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

digital-human

kwaivgi/kling-v2-ai-avatar-standard

Kling AI Avatar generates high-quality AI avatar videos for profiles, intros, and social content, delivering clean detail and cinematic motion with reliable prompt adherence. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

text-to-audio

kwaivgi/kling-v1-tts

Kling V1 TTS creates natural-sounding audio and supports KlingAI image, video, sound effect, virtual model, and custom AI workflows. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video

kwaivgi/kling-v1.6-i2v-pro

Kling v1.6 i2v Pro boosts image-to-video output 195% over Kling 1.5 with improved prompt understanding, physics and visual effects for realistic output. Ready-to-use REST inference API, no coldstarts, affordable pricing.

image-to-video

kwaivgi/kling-v1.6-i2v-standard

Kling 1.6 is an Image-to-Video model with 195% improvement over 1.5, with improved prompt understanding, physics and visual effects. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

text-to-video

kwaivgi/kling-v1.6-t2v-standard

Kling v1.6 boosts image-to-video quality by 195% over v1.5, with improved prompt understanding and richer physics-driven visual effects. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-video

kwaivgi/kling-v2.1-i2v-standard

Kling v2.1 by Kuaishou makes 5–10s 720p/1080p videos from an image or text prompt with improved motion fidelity and visual coherence. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

Kling Models

Cinematic Video Quality

  1. Advanced rendering for natural motion, lighting, and atmospheric realism
  2. High-fidelity color reproduction and detailed textures

Semantic Understanding & Prompt Control

  1. Precise interpretation of visual intent
  2. Strong alignment between text prompts and generated video output

Unified Multimodal Architecture (v3.0)

  1. All-in-one model that merges text-to-video, image-to-video, reference-based generation, and audio synthesis under a single native training framework
  2. Eliminates the need for separate tools and post-production patching — the entire creative lifecycle from generation to refinement is handled in one stream

AI Director — Multi-Shot Storytelling (v3.0)

  1. Automatically interprets scene coverage and shot patterns from a single prompt
  2. Generates structured, rhythmic sequences with professional camera transitions in one generation cycle
  3. Supports up to 6 distinct camera cuts per video

Native Audio-Visual Co-Generation (v3.0)

  1. Character dialogue, environmental sound, and music produced simultaneously with video
  2. Character-specific voice referencing for multi-speaker scenes with accurate spatial attribution
  3. Supports bilingual/multilingual dialogue within the same scene

Continuous Version Evolution

  1. Spans model generations from v1.6 to v3.0
  2. Each update improves realism, generation speed, and creative flexibility — with v3.0 introducing unified multimodal architecture and native audio-visual co-generation

Outstanding Value (v2.5 Turbo Std)

  1. Higher visual quality at significantly lower cost
  2. 25% cheaper than the Kling 2.1 Standard model

Professional-Grade Performance (v2.5 Turbo Std)

  1. Matches the response quality of the 2.5 Turbo Pro text model
  2. Although output is 720p, the visual detail remains rich and suitable for most creative and commercial use cases

Kling Model Lineup

Kling v3.0 Series

The most advanced generation in the Kling family. Built on a unified multimodal architecture that consolidates video generation, image creation, and audio synthesis into a single "all-in-one" model — a paradigm shift from task-specific clip generation to full narrative-level video production.

Key Capabilities:

  1. 15-second generation with custom duration control (3–15s) — the longest native generation in Kling's history
  2. Multi-Shot AI Director with up to 6 camera cuts per video, including automatic shot transitions driven by prompt-based scene direction
  3. Native audio-visual sync — dialogue, music, and sound effects co-generated alongside video in a single pass
  4. Elements 3.0 subject consistency — locks character identity across shots, camera angles, and scene transitions with multi-image and video-based references
  5. Multi-language lip-sync in Chinese, English, Japanese, Korean, and Spanish with dialect and accent support
  6. Physics-aware generation with improved handling of complex physical interactions and reduced artifacts
  7. Native text rendering for precise lettering in signage, captions, and advertising layouts
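
The limits in the list above (3 to 15 seconds, at most 6 camera cuts, five lip-sync languages) can be checked on the client side before a job is submitted. This is an illustrative sketch only: `ShotPlan`, its field names, and `validate` are hypothetical and do not reflect the actual request schema of any Kling endpoint.

```python
from dataclasses import dataclass
from typing import List, Optional

MIN_DURATION_S = 3
MAX_DURATION_S = 15
MAX_CAMERA_CUTS = 6
LIP_SYNC_LANGUAGES = {"Chinese", "English", "Japanese", "Korean", "Spanish"}


@dataclass(frozen=True)
class ShotPlan:
    """Hypothetical container for the settings of one v3.0 generation."""
    duration_seconds: int
    camera_cuts: int = 0
    dialogue_language: Optional[str] = None


def validate(plan: ShotPlan) -> List[str]:
    """Return a list of human-readable problems; empty means the plan fits the stated limits."""
    problems = []
    if not MIN_DURATION_S <= plan.duration_seconds <= MAX_DURATION_S:
        problems.append("duration must be between 3 and 15 seconds")
    if not 0 <= plan.camera_cuts <= MAX_CAMERA_CUTS:
        problems.append("at most 6 camera cuts are supported")
    if plan.dialogue_language is not None and plan.dialogue_language not in LIP_SYNC_LANGUAGES:
        problems.append("lip-sync language is not in the listed set")
    return problems


print(validate(ShotPlan(duration_seconds=10, camera_cuts=4, dialogue_language="English")))  # []
```

Returning a list of problems rather than raising on the first one lets a caller report every violated limit at once.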

Pro Models

  1. kling-v3.0-pro/text-to-video — Flagship T2V model with maximum visual fidelity, cinematic motion control, native audio generation, and multi-shot AI Director capabilities. Ideal for high-end creative and commercial production
  2. kling-v3.0-pro/image-to-video — Premium I2V synthesis with superior subject consistency, detailed texture preservation, and audio-visual co-generation. Best for projects requiring precise character identity maintenance across sequences

Standard Models

  1. kling-v3.0-std/text-to-video — Cost-efficient T2V generation with smooth motion, strong prompt adherence, and reliable audio-visual output. Optimized for high-volume creative workflows
  2. kling-v3.0-std/image-to-video — Fast, affordable I2V conversion with consistent detail retention and natural dynamics. Ideal for everyday content creation and rapid prototyping

Omni Models (Coming Soon)

  1. kling-v3.0-omni — Reference-heavy variant featuring enhanced Elements 3.0 with video-character reference (visual + audio capture), multi-shot storyboard workflows, and the strongest subject consistency in the lineup. Designed for serialized storytelling, brand-consistent content, and enterprise production pipelines

Motion Control Models

  1. Kling V3.0 Std Motion Control — Next-gen motion transfer from reference videos, enabling precise character animation with enhanced identity preservation and flexible orientation modes at competitive pricing
  2. Kling V3.0 Pro Motion Control — Premium motion transfer with superior visual fidelity, delivering professional-grade character animation with enhanced detail preservation for production-quality output

Kling v2.6 Pro Series

Pro Models

  1. kling-v2.6-pro/motion-control — Fine-grained motion guidance for controllable, stable video generation
  2. kling-v2.6-pro/image-to-video — High-fidelity, fast I2V rendering with strong detail preservation
  3. kling-v2.6-pro/text-to-video — Professional-grade T2V outputs with coherent motion and temporal consistency

Kling v2.6 Std Series

Standard Models

  1. kling-v2.6-std/text-to-video — Efficient T2V generation with smooth motion and reliable visual quality at lower cost
  2. kling-v2.6-std/image-to-video — Fast, cost-effective I2V synthesis with consistent detail retention and natural dynamics
  3. kling-v2.6-std/motion-control — Cost-effective motion transfer from reference videos, enabling controlled character animation with stable identity preservation

Kling v2.5 Turbo Series

Pro Models

  1. kling-v2.5-turbo-pro/image-to-video — High-fidelity, fast I2V rendering
  2. kling-v2.5-turbo-pro/text-to-video — Professional-grade T2V outputs with strong frame coherence

Standard Turbo Models

  1. kling-v2.5-turbo-std/image-to-video — Lightweight, fast, and visually refined

Kling v2.1 Series

Image-to-Video Models

  1. kling-v2.1-i2v-standard — Balanced performance for everyday content creation
  2. kling-v2.1-i2v-pro — Stronger scene continuity and semantic modeling
  3. kling-v2.1-i2v-pro/start-end-frame — Start–end guided synthesis for narrative video creation
  4. kling-v2.1-i2v-master — Flagship-level realism and cinematic tone

Text-to-Video Master Model

  1. kling-v2.1-t2v-master — Unmatched motion control for expressive text-driven video

Kling v2.0 Series

Image-to-Video Master Model

  1. kling-v2.0-i2v-master — Superior detail control and enhanced realism

Text-to-Video Master Model

  1. kling-v2.0-t2v-master — Optimized lighting, depth perception, and semantic accuracy

Kling v1.6 Series

Image-to-Video (I2V) Models

  1. kling-v1.6-i2v-standard — Efficient baseline model with stable, realistic motion
  2. kling-v1.6-i2v-pro — Enhanced motion realism, texture detail, and dynamic fidelity

Text-to-Video (T2V) Model

  1. kling-v1.6-t2v-standard — Strong text-to-video consistency and expressive visual output

Multi-Frame I2V Models

  1. kling-v1.6-multi-i2v-standard — Improved transition smoothness and temporal coherence
  2. kling-v1.6-multi-i2v-pro — Cinematic multi-frame synthesis for advanced storytelling

Specialized Kling Tools

Kling Effects & Enhancement Tools

  1. kling-effects — Natural motion effects, creative transitions, and style blending

Kling Lipsync Models

  1. kling-lipsync/audio-to-video — Voice-driven, perfectly aligned talking-face videos
  2. kling-lipsync/text-to-video — Script-to-lipsync generation for digital humans
  3. kwaivgi/kling-v2-ai-avatar-standard — Affordable, single-image talking avatars for everyday explainers, training clips, and social content
  4. kwaivgi/kling-v2-ai-avatar-pro — High-fidelity, studio-quality digital humans with richer motion, expressions, and lip-sync for premium productions

Audio and Speech Tools

  1. kling-v1-tts — Clear and natural text-to-speech for video narration
  2. kwaivgi/kling-video-to-audio — Auto-generated or extracted sound effects and music