Kling Video O3 Pro Reference-to-Video
Kling Video O3 Pro Reference-to-Video generates video from reference images, with optional video guidance. Upload reference images to establish character identity and appearance, optionally provide a reference video for motion guidance, and describe the scene in a prompt. The model produces a video in which the referenced characters keep a consistent identity across frames.
Why Choose This?
- O3 Pro quality: The highest visual fidelity and motion realism in the Kling family.
- Multi-reference images: Upload up to 7 reference images (or up to 4 with a reference video).
- Video-guided generation: Optional reference video for motion and scene guidance.
- Keep original sound: Preserve the audio from the reference video in the output.
- Sound generation: Optional AI-generated sound effects when no reference video is provided.
Parameters
| Parameter | Required | Description |
|---|---|---|
| prompt | Yes | Text description of the video scene and motion |
| video | No | Reference video for motion guidance |
| images | No | Reference images: up to 4 with video, up to 7 without |
| keep_original_sound | No | Keep audio from the reference video (default: enabled) |
| sound | No | Generate AI audio (only when no reference video, default: disabled) |
| aspect_ratio | No | Output ratio: 16:9 (default), 9:16, 1:1 |
| duration | No | Video length: 3-15 seconds (default: 5) |
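The constraints in the table above can be checked before a request is submitted. The following is a minimal sketch, not the service's client: `build_payload` is a hypothetical helper, the payload is assumed to be a plain dictionary keyed by the parameter names in the table, and `duration` is assumed to be a whole number of seconds.

```python
from typing import List, Optional

ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1"}


def build_payload(
    prompt: str,
    images: Optional[List[str]] = None,
    video: Optional[str] = None,
    keep_original_sound: bool = True,
    sound: bool = False,
    aspect_ratio: str = "16:9",
    duration: int = 5,
) -> dict:
    """Hypothetical helper: validate inputs against the documented limits."""
    if not prompt.strip():
        raise ValueError("prompt is required")

    images = list(images or [])
    # Documented limit: up to 4 images with a reference video, up to 7 without.
    max_images = 4 if video else 7
    if len(images) > max_images:
        raise ValueError(f"at most {max_images} reference images allowed")

    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect_ratio: {aspect_ratio}")
    if not 3 <= duration <= 15:
        raise ValueError("duration must be between 3 and 15 seconds")
    if video and sound:
        raise ValueError("sound is only available without a reference video")

    payload = {"prompt": prompt, "aspect_ratio": aspect_ratio, "duration": duration}
    if images:
        payload["images"] = images
    if video:
        # With a reference video, audio is governed by keep_original_sound.
        payload["video"] = video
        payload["keep_original_sound"] = keep_original_sound
    else:
        payload["sound"] = sound
    return payload
```

Sending only the audio flag that applies to the chosen mode is a design choice of this sketch; the service may accept both fields.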
How to Use
- Write your prompt — describe the scene, characters, and action.
- Upload reference video (optional) — provide a video for motion guidance.
- Upload reference images (optional) — add character or scene references.
- Configure audio — keep original sound from video, or enable AI sound generation.
- Select aspect ratio — match your target platform.
- Set duration — choose any length from 3 to 15 seconds.
- Run — submit and download your video.
Pricing
| Duration | Images only | Images only + AI sound | With reference video |
|---|---|---|---|
| 3s | $0.72 | $0.90 | $1.08 |
| 5s | $1.20 | $1.50 | $1.80 |
| 10s | $2.40 | $3.00 | $3.60 |
| 15s | $3.60 | $4.50 | $5.40 |
Billing Rules
- Base rate: $1.20 per 5 seconds ($0.24 per second)
- With reference video: 1.5× multiplier
- With AI sound (no video): 1.25× multiplier
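The billing rules above reproduce every cell of the pricing table. As a rough sketch, `estimate_cost` below is a hypothetical helper that applies them; it assumes the price scales linearly per second for durations the table does not list, which the source does not state explicitly.

```python
from decimal import Decimal

BASE_RATE_PER_SECOND = Decimal("1.20") / 5  # $1.20 per 5 seconds
VIDEO_MULTIPLIER = Decimal("1.5")           # with a reference video
SOUND_MULTIPLIER = Decimal("1.25")          # AI sound, no reference video


def estimate_cost(duration: int, has_video: bool = False, sound: bool = False) -> Decimal:
    """Hypothetical helper: estimate the price in USD for one generation."""
    if not 3 <= duration <= 15:
        raise ValueError("duration must be between 3 and 15 seconds")
    if has_video:
        # AI sound is unavailable with a reference video, so only 1.5x applies.
        multiplier = VIDEO_MULTIPLIER
    elif sound:
        multiplier = SOUND_MULTIPLIER
    else:
        multiplier = Decimal("1")
    cost = BASE_RATE_PER_SECOND * duration * multiplier
    return cost.quantize(Decimal("0.01"))


print(estimate_cost(10, sound=True))      # 3.00
print(estimate_cost(15, has_video=True))  # 5.40
```

`Decimal` is used instead of `float` so that the results match the listed prices exactly.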
Best Use Cases
- Character Consistency — Generate videos with identity-consistent characters.
- Video Remixing — Use reference video for motion guidance with new characters.
- Marketing & Ads — Create promotional videos featuring specific people or products.
- Storytelling — Produce narrative scenes with consistent character appearance.
- Long-Form Scenes — Up to 15 seconds for extended scene development.
Pro Tips
- Use multiple reference images from different angles for better identity preservation.
- When using a reference video, the image limit is 4; without a video, you can use up to 7.
- Leave keep_original_sound enabled (the default) to preserve audio from your reference video.
- Sound generation is only available when no reference video is provided.
- Use shorter durations (3-5s) for testing, longer (10-15s) for final production.
- Match aspect ratio to your platform: 16:9 for YouTube, 9:16 for TikTok/Reels, 1:1 for Instagram.
Notes
- Only prompt is required; other parameters are optional.
- Duration supports any value from 3 to 15 seconds.
- Reference images limit: up to 4 with video, up to 7 without.
- When a reference video is provided, sound generation is unavailable; use keep_original_sound to control the output audio instead.
- Ensure uploaded image and video URLs are publicly accessible.
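Whether a URL is reachable from the service can only be confirmed by fetching it, but obviously non-public URLs can be caught offline. The sketch below is a heuristic pre-check under that assumption; `looks_public` is a hypothetical helper, and a `True` result does not guarantee that the file is actually accessible.

```python
import ipaddress
from urllib.parse import urlparse


def looks_public(url: str) -> bool:
    """Heuristic: reject URLs that cannot be reachable from the public internet."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False  # e.g. file:// paths or bare local filenames
    host = parsed.hostname
    if not host:
        return False
    if host == "localhost" or host.endswith(".local"):
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: a DNS name, which cannot be judged offline.
        return True
    # Loopback, private, and link-local addresses are not globally routable.
    return ip.is_global


for candidate in (
    "https://example.com/ref.png",
    "http://localhost:8000/clip.mp4",
    "http://192.168.1.10/ref.png",
):
    print(candidate, looks_public(candidate))
```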
Related Models
- Kling Video O3 Pro Image-to-Video — O3 Pro quality single image to video.
- Kling V3.0 Pro Image-to-Video — V3.0 Pro quality at lower cost.
- Kling V3.0 Pro Text-to-Video — Pro quality text-to-video.