Kling 2.6 Pro | Text To Video with Sound | WaveSpeedAI

kwaivgi/kling-v2.6-pro/text-to-video

Kling 2.6 Pro: top-tier text-to-video generation with fluid motion, cinematic visuals, precise prompt adherence, and native audio generation, at a competitive price.

README

Kling 2.6 Audio Model - Text to Video

Kling 2.6 introduces audio–video co-generation for the very first time, pairing superb visuals with natural, native audio to deliver a fully coherent, immersive experience — more than just a video clip.

Model Highlights

  • Audio–video co-generation, a first for the Kling series.
  • Voices that sync with the on-screen character: natural, native, and immersive.
  • Coherent picture and sound in a single output, which opens up richer narrative possibilities than a silent clip.
  • Natural voiceovers, matching sound effects, and immersive ambient sound generated together with the visuals.

Aspect Ratios

Kling 2.6 supports the following aspect ratios:

  • 16:9 – Standard landscape video (YouTube, web, most displays)
  • 9:16 – Vertical video (TikTok, Reels, Shorts, Stories)
  • 1:1 – Square video (feeds, social posts, ads)
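
As a rough illustration, the platform guidance above can be written as a lookup table. This is only a sketch: the platform labels and the function name `aspect_ratio_for` are hypothetical and not part of any Kling or WaveSpeedAI API. Only the three ratio strings come from the list above.

```python
# Platform-to-ratio mapping taken from the list above.
# The platform keys are hypothetical labels chosen for this example.
ASPECT_RATIO_BY_PLATFORM = {
    "youtube": "16:9",
    "web": "16:9",
    "tiktok": "9:16",
    "reels": "9:16",
    "shorts": "9:16",
    "stories": "9:16",
    "feed": "1:1",
    "social_post": "1:1",
    "ad": "1:1",
}


def aspect_ratio_for(platform: str, default: str = "16:9") -> str:
    """Return the suggested aspect ratio for a platform label, or the default."""
    return ASPECT_RATIO_BY_PLATFORM.get(platform.strip().lower(), default)
```

For example, `aspect_ratio_for("TikTok")` returns `"9:16"`, and an unrecognized label falls back to landscape.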

Use Cases

  • Marketing and launch videos with native, character-synced voiceover.
  • Storytelling and narrative content where visuals and audio must feel like one.
  • Product explainers and demo videos with both strong visuals and natural narration.
  • Cinematic social media content with immersive ambient sound and effects.

Price

  • Without audio, 5s video: $0.35
  • Without audio, 10s video: $0.70
  • With audio, 5s video: $0.70
  • With audio, 10s video: $1.40
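
The price list above lends itself to a small cost estimator. The sketch below copies the four listed prices into a table and assumes that only the 5s and 10s durations are billable. The function names are made up for illustration. Prices are stored in cents so that the arithmetic stays exact.

```python
# Prices in US cents, copied from the price list above,
# keyed by (with_audio, duration_seconds).
PRICE_CENTS = {
    (False, 5): 35,
    (False, 10): 70,
    (True, 5): 70,
    (True, 10): 140,
}


def unit_price_cents(duration_s: int, with_audio: bool) -> int:
    """Look up the price of one run, rejecting unlisted combinations."""
    try:
        return PRICE_CENTS[(with_audio, duration_s)]
    except KeyError:
        raise ValueError(f"no listed price for duration {duration_s}s") from None


def estimate_cost_usd(duration_s: int, with_audio: bool, runs: int = 1) -> float:
    """Total cost in USD for a number of identical runs."""
    if runs < 0:
        raise ValueError("runs must be non-negative")
    return unit_price_cents(duration_s, with_audio) * runs / 100


def runs_within_budget(budget_usd: float, duration_s: int, with_audio: bool) -> int:
    """How many whole runs fit into a budget given in USD."""
    budget_cents = round(budget_usd * 100)
    return budget_cents // unit_price_cents(duration_s, with_audio)
```

With these numbers, a $10 budget covers 28 silent 5s clips (1000 // 35) or 7 clips of 10s with audio (1000 // 140).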

How to Use

  1. Describe your desired scene, characters, motion, and audio mood (voiceover tone, ambience, SFX).
  2. Choose an aspect ratio (16:9, 9:16, or 1:1) and configure video duration (e.g., 5s or 10s).
  3. Run generation to create a coherent audio–video result ("See the Sound, Hear the Visual").
  4. Refine and regenerate as needed for alternative versions or platforms.
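
Steps 1 and 2 above amount to assembling a prompt plus a few settings. A minimal sketch of that assembly follows, with validation against the options this page lists. The field names (`prompt`, `aspect_ratio`, `duration`, `sound`) and the function `build_request` are assumptions made for illustration. They are not confirmed by this page, so check the provider's API reference for the real parameter names. The sketch also assumes that only the two priced durations (5s and 10s) are accepted.

```python
import json

# Options listed on this page; treating them as the full allowed set is an assumption.
ALLOWED_ASPECT_RATIOS = ("16:9", "9:16", "1:1")
ALLOWED_DURATIONS = (5, 10)


def build_request(prompt: str, aspect_ratio: str = "16:9",
                  duration: int = 5, sound: bool = True) -> dict:
    """Validate the settings and return a request body (hypothetical field names)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect ratio: {aspect_ratio}")
    if duration not in ALLOWED_DURATIONS:
        raise ValueError(f"unsupported duration: {duration}")
    return {
        "prompt": prompt.strip(),
        "aspect_ratio": aspect_ratio,
        "duration": duration,
        "sound": sound,
    }


if __name__ == "__main__":
    body = build_request(
        "A lighthouse keeper climbs a spiral staircase at dusk; "
        "calm male voiceover, distant waves, creaking wood.",
        aspect_ratio="9:16",
        duration=10,
    )
    # Print the JSON that would be sent; no network call is made here.
    print(json.dumps(body, indent=2))
```

Keeping validation separate from the network call makes step 4 cheap: regenerating for another platform only means calling `build_request` again with a different `aspect_ratio`.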