Kling V3.0 Pro Image-to-Video
Kling V3.0 Pro Image-to-Video is Kuaishou's highest-quality image-to-video model. Upload a reference image and describe the motion — the model generates cinematic-grade video with superior visual fidelity, optional synchronized sound, voice support, and start-to-end frame guidance.
Why Choose This?
-
Pro-tier quality
Superior visual detail, motion smoothness, and cinematic rendering compared to Standard.
-
Start-end frame guidance
Optional end image for controlled transitions between two frames.
-
Sound generation
Optional synchronized sound effects generated alongside the video.
-
Voice list support
Add up to 2 custom voice entries for character dialogue.
-
CFG scale control
Fine-tune the balance between prompt adherence and creative freedom.
Parameters
| Parameter | Required | Description |
|---|
| prompt | No | Text description of the desired motion and action |
| negative_prompt | No | Elements to exclude from generation |
| image | Yes | Start frame image to animate (URL or upload) |
| end_image | No | End frame image for guided transitions |
| duration | No | Video length: 5 or 10 seconds (default: 5) |
| cfg_scale | No | Prompt adherence strength (default: 0.5) |
| sound | No | Generate synchronized sound (default: disabled) |
| voice_list | No | Custom voice entries, up to 2 (click "+ Add Item") |
How to Use
- Upload your image — provide the reference image to animate.
- Write your prompt (optional) — describe the motion, camera movement, and action.
- Upload end image (optional) — provide an end frame for guided transitions.
- Add negative prompt (optional) — specify what you want to avoid.
- Set duration — 5 seconds or 10 seconds.
- Adjust cfg_scale (optional) — higher for stricter prompt following, lower for more freedom.
- Enable sound (optional) — generate synchronized audio with the video.
- Add voices (optional) — add up to 2 voice entries for dialogue.
- Run — submit and download your video.
Pricing
| Duration | Sound Off | Sound On |
|---|
| 5s | $0.24 | $0.36 |
| 10s | $0.48 | $0.72 |
Billing Rules
- Sound multiplier: disabled = 1×, enabled = 1.5×
Best Use Cases
- High-End Photo Animation — Bring images to life with maximum visual quality.
- Scene Transitions — Use start and end frames for smooth cinematic transitions.
- Professional Marketing — Premium promotional videos with sound from product photos.
- Short Films — Create film-quality animated scenes with dialogue.
- Brand Content — Polished visual storytelling from still imagery.
Pro Tips
- Use clear, descriptive prompts with specific motion and cinematography details.
- Add an end_image for controlled transitions between two visual states.
- Enable sound for a complete video experience with synchronized audio.
- Use negative prompts to avoid artifacts (e.g., "blurry, low quality, distorted").
- Lower cfg_scale for more creative variation, higher for strict prompt adherence.
- Use Standard tier for drafts and testing, Pro for final production.
Notes
- Image is the only required field; prompt is optional but recommended.
- Duration options are 5 or 10 seconds only.
- Voice list supports a maximum of 2 entries.
- Sound generation increases cost by 1.5×.
- Ensure uploaded image URLs are publicly accessible.
Related Models
- Kling V3.0 Pro Text-to-Video — Pro quality text-to-video generation.
- Kling V3.0 Standard Image-to-Video — Standard tier at lower cost.
- Kling V3.0 Standard Text-to-Video — Standard tier text-to-video.