Kwaivgi Kling Video O1 Std Image To Video
Kling Omni Video O1 Image-to-Video (Standard) turns static images into dynamic, high-quality videos while preserving subject identity and visual and temporal consistency. It adds natural motion, realistic physics, and smooth scene dynamics, and supports flexible clip durations when reference frames are provided. Built for stable production use and cost efficiency, it comes with a ready-to-use REST API, fast response, no cold starts, and predictable pricing.
Features
Kling Omni Video O1 - Text-to-Video (Standard)
Kling Omni Video O1 (Standard) is Kuaishou’s unified multi-modal video generation model, optimized for cost efficiency and stable production use. The Text-to-Video mode transforms natural language prompts into high-quality videos with coherent motion, strong scene understanding, and cinematic results.
Key Capabilities
Intelligent Text-to-Video Generation
Generate videos directly from text descriptions:
- Converts natural language into dynamic video scenes
- Understands actions, environments, and visual styles
- Produces smooth, temporally consistent motion
Scene and Motion Coherence
Advanced video reasoning ensures:
- Logical object interactions and movement flow
- Stable scene structure across frames
- Consistent lighting, colors, and atmosphere
Multi-Modal Prompt Understanding
Use descriptive prompts to control:
- Subject appearance and actions
- Camera movement and framing
- Mood, style, and scene dynamics
Core Features
- Text-Driven Video Synthesis: from prompt to video in one step
- Temporal Consistency: stable visuals across the entire sequence
- Cinematic Motion: natural movement and camera dynamics
- Standard Optimization: balanced quality, speed, and cost
- Adaptive Duration Control: video length adapts based on input conditions
  - When last_image is provided, supports flexible durations from 3 to 10 seconds
  - Without last_image, generation is limited to 5s or 10s for optimal stability
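The duration rule above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the official SDK: the names `allowed_durations` and `check_duration` are hypothetical, and the only facts taken from this page are the 3 to 10 second range with last_image and the 5s or 10s limit without it.

```python
from typing import Optional, Tuple

# Durations in seconds, taken from the rule stated above.
FLEXIBLE_DURATIONS: Tuple[int, ...] = tuple(range(3, 11))  # 3..10 with last_image
FIXED_DURATIONS: Tuple[int, ...] = (5, 10)                 # without last_image


def allowed_durations(last_image: Optional[str]) -> Tuple[int, ...]:
    """Return the durations permitted for the given last_image value."""
    return FLEXIBLE_DURATIONS if last_image else FIXED_DURATIONS


def check_duration(duration: int, last_image: Optional[str] = None) -> int:
    """Return duration unchanged if it is allowed, otherwise raise ValueError."""
    allowed = allowed_durations(last_image)
    if duration not in allowed:
        raise ValueError(f"duration must be one of {allowed}, got {duration}")
    return duration
```

Validating locally like this avoids spending a request on a combination the service would presumably reject, such as a 7-second clip without last_image.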
How to Use
- Enter Your Text Prompt: Describe the scene, subject, and actions in natural language.
- Refine with Details (Optional): Add style, camera motion, or environment cues. Example: “A futuristic city at night, neon lights reflecting on wet streets, slow cinematic camera pan”
- Set Parameters: Choose video duration and whether to use start and end frames for generation.
- Generate: Receive a coherent, dynamic video generated entirely from text.
Pricing
| duration | price |
|---|---|
| per second | $0.084 |
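For budgeting, the per-second price can be turned into a quick estimate. This sketch assumes billing is exactly the listed rate multiplied by the requested duration, with no minimum charge or rounding; the function name `estimate_cost_usd` and the `clips` parameter are hypothetical.

```python
from decimal import Decimal

# Per-second price from the pricing table above.
PRICE_PER_SECOND_USD = Decimal("0.084")


def estimate_cost_usd(duration_seconds: int, clips: int = 1) -> Decimal:
    """Estimate the cost of `clips` videos of `duration_seconds` each.

    Assumption: cost scales linearly with duration at the listed rate.
    """
    if duration_seconds <= 0 or clips <= 0:
        raise ValueError("duration_seconds and clips must be positive")
    return PRICE_PER_SECOND_USD * duration_seconds * clips
```

Under that assumption a 5-second clip comes to $0.42 and a 10-second clip to $0.84. Decimal is used instead of float so that totals for large batches stay exact.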
Pro Tips
- Use clear, descriptive prompts for best results
- Specify actions and camera movement for more dynamic videos
- Combine environment and motion details for cinematic quality
- Ideal for large-scale generation and cost-sensitive use cases
Kling O1 series models
- kwaivgi/kling-video-o1-std (Video Edit): Edit videos with natural-language instructions for precise, context-aware changes like object removal, scene adjustments, and style refinement while preserving motion consistency.
- kwaivgi/kling-video-o1-std (Reference to Video): Generate new videos guided by a reference video to match its style, identity, or motion patterns, ideal for consistent visual storytelling and content iteration.
- kwaivgi/kling-video-o1-std (Text to Video): Create videos directly from text prompts with strong prompt adherence and cinematic motion, great for rapid prototyping, ads, and creative concept exploration.
Authentication
For authentication details, please refer to the Authentication Guide.
API Endpoints
Submit Task & Query Result
# Submit the task
curl --location --request POST "https://api.wavespeed.ai/api/v3/kwaivgi/kling-video-o1-std/image-to-video" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
"prompt": "<your prompt>",
"image": "<first-frame image>",
"duration": 5
}'
# Get the result
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
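The same submit-then-query flow can be sketched in Python using only the standard library. The two URLs and the `data.id`, `data.status`, `data.outputs`, and `data.error` fields come from this page; the polling interval, the poll limit, and the helper names `_call` and `generate_video` are assumptions made for illustration, not documented behavior.

```python
import json
import time
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"
MODEL_PATH = "kwaivgi/kling-video-o1-std/image-to-video"


def _call(method, url, api_key, body=None):
    """Send one request with a Bearer token and return the decoded JSON body."""
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if body is not None:
        headers["Content-Type"] = "application/json"
        data = json.dumps(body).encode("utf-8")
    request = urllib.request.Request(url, data=data, headers=headers, method=method)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def generate_video(payload, api_key, call=_call, poll_seconds=2.0, max_polls=150):
    """Submit a task, then poll the result endpoint until it completes or fails.

    poll_seconds and max_polls are illustrative defaults, not documented limits.
    """
    submitted = call("POST", f"{API_BASE}/{MODEL_PATH}", api_key, payload)
    task_id = submitted["data"]["id"]
    result_url = f"{API_BASE}/predictions/{task_id}/result"
    for _ in range(max_polls):
        data = call("GET", result_url, api_key)["data"]
        if data["status"] == "completed":
            return data["outputs"]
        if data["status"] == "failed":
            raise RuntimeError(data.get("error") or "task failed")
        time.sleep(poll_seconds)
    raise TimeoutError(f"task {task_id} did not finish after {max_polls} polls")
```

A call might look like `generate_video({"prompt": "...", "image": "...", "duration": 5}, os.environ["WAVESPEED_API_KEY"])` after importing `os`. The `call` parameter exists so the HTTP layer can be swapped for a stub in tests.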
Parameters
Task Submission Parameters
Request Parameters
| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
| prompt | string | Yes | - | - | The positive prompt for the generation. |
| image | string | Yes | - | - | The first frame of the video (first_frame). |
| last_image | string | No | - | - | The last frame of the video (last_frame). |
| duration | integer | No | 5 | 3, 4, 5, 6, 7, 8, 9, 10 | The duration of the generated video in seconds. Only 5 or 10 is supported when last_image is not used. |
Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data.id | string | Unique identifier for the prediction (the task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
Result Request Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| id | string | Yes | - | Task ID |
Result Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data | object | The prediction data object containing all details |
| data.id | string | Unique identifier for the prediction (the task ID that was queried) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
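To make the response fields above easier to work with, a client might map the JSON onto a small typed object. This is a sketch under the assumption that the decoded response matches the table exactly; the `Prediction` class and `parse_prediction` function are hypothetical names, and missing optional fields are treated as empty.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Statuses after which polling can stop, per the data.status description above.
TERMINAL_STATUSES = {"completed", "failed"}


@dataclass
class Prediction:
    id: str
    status: str
    outputs: List[str] = field(default_factory=list)
    error: str = ""
    inference_ms: Optional[int] = None

    @property
    def done(self) -> bool:
        """True once the task has either completed or failed."""
        return self.status in TERMINAL_STATUSES


def parse_prediction(response: dict) -> Prediction:
    """Build a Prediction from a decoded result response."""
    data = response["data"]
    timings = data.get("timings") or {}
    return Prediction(
        id=data["id"],
        status=data["status"],
        outputs=list(data.get("outputs") or []),
        error=data.get("error") or "",
        inference_ms=timings.get("inference"),
    )
```

Checking `done` before reading `outputs` reflects the note in the table that outputs stay empty until the status is completed.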