Kwaivgi Kling Video O3 Std Reference To Video
Kling Omni Video O3 (Standard) Reference-to-Video generates creative videos using character, prop, or scene references from multiple viewpoints. It extracts subject features and creates new video content while maintaining identity consistency across frames, with optional audio generation. Available through a ready-to-use REST API with no cold starts and per-second pricing.
Features
Kling Omni Video O3 — Reference-to-Video (Standard)
Kling Omni Video O3 (Standard) is Kuaishou’s advanced unified multi-modal video model, optimized for production workloads. The Reference-to-Video mode creates new video content based on subject references — maintaining character, prop, and scene identity while generating entirely new creative scenarios. Supports optional audio generation.
Key Capabilities
Multi-Reference Subject Creation
Build subjects from multiple reference viewpoints:
- Extract features from character, prop, or scene images
- Maintain consistent identity in generated videos
- Create new scenarios with familiar subjects
Subject Consistency Technology
Advanced feature extraction ensures:
- Stable character appearance across all frames
- Consistent clothing, accessories, and props
- Maintained facial features and expressions
- Coherent scene elements and backgrounds
Creative Freedom
Generate entirely new content while preserving identity:
- New poses and actions
- Different scenes and environments
- Various camera angles and movements
- Fresh creative scenarios
Audio Support
Optionally generate synchronized audio or keep original sound from reference videos.
Core Features
- Identity Lock — Subject features remain consistent throughout video
- Multi-Angle Support — Use references from various viewpoints
- Scene Flexibility — Place subjects in new environments
- Motion Control — Guide actions with text prompts
- Audio Options — Keep original sound or generate new audio
- Cost Optimized — Standard tier for production workloads
How to Use
- Upload Reference Images/Video: Provide one or more images of your subject (character, object, or scene), or a reference video.
- Describe the Scenario: Write a prompt for the new video content. Example: “The character walking through a futuristic city at night, neon lights reflecting on wet streets”
- Set Parameters: Choose duration (3-15s), aspect ratio, and audio options.
- Generate: Receive a video with your subject in the new scenario.
Pricing
| Reference Type | Price per Second |
|---|---|
| Image Reference | $0.084 |
| Video Reference | $0.126 |
$0.084/s for image reference only; $0.126/s when using video reference (1.5x multiplier).
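As a rough illustration of the pricing table, the sketch below estimates the cost of a single generation. It assumes billing is strictly linear in the requested duration, which the table implies but does not state outright; `estimate_cost` and the constant names are hypothetical helpers, not part of the API.

```python
from decimal import Decimal

# Per-second rates copied from the pricing table (USD).
IMAGE_REFERENCE_RATE = Decimal("0.084")
VIDEO_REFERENCE_RATE = Decimal("0.126")


def estimate_cost(duration_s: int, has_video_reference: bool) -> Decimal:
    """Estimate the price in USD, assuming linear per-second billing."""
    if not 3 <= duration_s <= 15:
        raise ValueError("duration must be between 3 and 15 seconds")
    rate = VIDEO_REFERENCE_RATE if has_video_reference else IMAGE_REFERENCE_RATE
    return rate * duration_s


if __name__ == "__main__":
    print(estimate_cost(5, has_video_reference=False))  # 0.420
    print(estimate_cost(10, has_video_reference=True))  # 1.260
```

`Decimal` is used instead of `float` so that the per-second rates multiply without binary rounding noise.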
Pro Tips
- Use multiple reference angles for better identity capture
- Provide clear, high-resolution reference images
- Describe actions and environments clearly in prompts
- Works best for characters, products, and distinct objects
Note
- If the request includes a reference video, at most 4 reference images are accepted (down from 7 without a video).
Authentication
For authentication details, please refer to the Authentication Guide.
API Endpoints
Submit Task & Query Result
# Submit the task
curl --location --request POST "https://api.wavespeed.ai/api/v3/kwaivgi/kling-video-o3-std/reference-to-video" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
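"prompt": "The character walking through a futuristic city at night, neon lights reflecting on wet streets",
"keep_original_sound": true,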
"sound": false,
"aspect_ratio": "16:9",
"duration": 5
}'
# Get the result
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
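The same two calls can be sketched in Python using only the standard library. The URLs and headers mirror the curl commands; the helper names (`build_submit_request`, `build_result_request`, `send`) and the 30-second socket timeout are illustrative assumptions, and reading the task ID from `data.id` follows the response table in this document.

```python
import json
import os
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"
MODEL_PATH = "kwaivgi/kling-video-o3-std/reference-to-video"


def build_submit_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build the POST request that submits a generation task."""
    return urllib.request.Request(
        f"{API_BASE}/{MODEL_PATH}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def build_result_request(request_id: str, api_key: str) -> urllib.request.Request:
    """Build the GET request that queries a task's result."""
    return urllib.request.Request(
        f"{API_BASE}/predictions/{request_id}/result",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def send(request: urllib.request.Request) -> dict:
    """Send a request and decode the JSON body (timeout is an assumed value)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    key = os.environ["WAVESPEED_API_KEY"]
    payload = {
        "prompt": "The character walking through a futuristic city at night",
        "aspect_ratio": "16:9",
        "duration": 5,
    }
    submitted = send(build_submit_request(payload, key))
    task_id = submitted["data"]["id"]
    print(send(build_result_request(task_id, key))["data"]["status"])
```

Separating request construction from sending keeps the network call in one place, so the request shape can be inspected without contacting the API.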
Parameters
Task Submission Parameters
Request Parameters
| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
| prompt | string | Yes | - | - | The positive prompt for the generation. |
| video | string | No | - | - | The reference video URL. |
| images | array | No | [] | - | Reference images. At most 4 when a reference video is supplied; at most 7 otherwise. |
| keep_original_sound | boolean | No | true | - | Whether to keep the original sound from the reference video. |
| sound | boolean | No | false | - | Whether to generate audio for the video. |
| aspect_ratio | string | No | 16:9 | 16:9, 9:16, 1:1 | The aspect ratio of the generated video. |
| duration | integer | No | 5 | 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 | The duration of the generated media in seconds (3-15). |
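The constraints in the table above can be checked on the client before a request is sent. The following is a minimal sketch: `build_payload` is a hypothetical helper, and it assumes that `images` holds URL strings (the table only states this for `video`). The defaults and limits are taken from the table.

```python
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1"}


def build_payload(prompt, images=None, video=None, keep_original_sound=True,
                  sound=False, aspect_ratio="16:9", duration=5):
    """Validate inputs against the documented limits and return the request body."""
    images = list(images or [])
    if not prompt:
        raise ValueError("prompt is required")
    # A reference video lowers the image limit from 7 to 4.
    max_images = 4 if video else 7
    if len(images) > max_images:
        raise ValueError(f"at most {max_images} reference images are allowed")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ALLOWED_ASPECT_RATIOS)}")
    if not isinstance(duration, int) or not 3 <= duration <= 15:
        raise ValueError("duration must be an integer from 3 to 15")
    payload = {
        "prompt": prompt,
        "images": images,
        "keep_original_sound": keep_original_sound,
        "sound": sound,
        "aspect_ratio": aspect_ratio,
        "duration": duration,
    }
    if video:
        payload["video"] = video  # only sent when a reference video is supplied
    return payload
```

For example, `build_payload("A robot dancing", images=["https://example.com/a.png"])` returns a body with the documented defaults filled in, while passing five images together with a `video` raises `ValueError`.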
Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data.id | string | Unique identifier for the prediction (the task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
Result Request Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| id | string | Yes | - | Task ID |
Result Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data | object | The prediction data object containing all details |
| data.id | string | Unique identifier of the queried prediction (the task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
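Because `data.status` moves through created, processing, and then completed or failed, a client typically polls the result endpoint until a terminal status appears. The sketch below shows one way to do that. `wait_for_outputs` is a hypothetical helper, `fetch` stands for any callable that returns the decoded result response, and the 2-second interval and 10-minute timeout are assumed values, not documented recommendations.

```python
import time
from typing import Callable


def wait_for_outputs(fetch: Callable[[], dict], poll_interval_s: float = 2.0,
                     timeout_s: float = 600.0, sleep=time.sleep) -> list:
    """Poll until the task completes and return data.outputs."""
    deadline = time.monotonic() + timeout_s
    while True:
        data = fetch()["data"]
        status = data["status"]
        if status == "completed":
            return data["outputs"]
        if status == "failed":
            # data.error carries the message when the task fails.
            raise RuntimeError(data.get("error") or "task failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task still '{status}' after {timeout_s} seconds")
        sleep(poll_interval_s)  # status is created or processing; try again later
```

Passing the fetcher in as a parameter keeps the loop independent of any particular HTTP client and makes it straightforward to exercise with canned responses.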