Kwaivgi Kling Video O1 Std Text To Video


Kling Omni Video O1 (Standard) is Kuaishou’s first unified multi-modal video model with MVL (Multi-modal Visual Language) technology. Text-to-Video mode generates cinematic videos from text prompts with subject consistency, natural physics simulation, and precise semantic understanding. It is available through a ready-to-use REST API with no cold starts and per-duration pricing.

Features

Kling Omni Video O1 — Text-to-Video (Standard)

Kling Omni Video O1 is Kuaishou’s unified multi-modal video generation model, optimized for stable production use and cost efficiency.
The Text-to-Video mode transforms natural language prompts into high-quality videos with coherent motion, accurate semantic understanding, and consistent visual output.

Why Kling Video O1 (Standard)

Unified Creative Engine

The model supports multiple video generation and editing workflows within a single system:

Text-to-video generation
Image-to-video transformation
Reference-based video creation
Video editing and modification
Shot extension and scene continuation

The model interprets instructions through MVL, enabling understanding of:

Natural language descriptions
Visual context and references
Subject identity and appearance
Scene structure and motion dynamics

Subject Consistency

Maintains stable characters, objects, and scene attributes across frames, ensuring reliable and repeatable results suitable for production workflows.

Core Features

Cinematic-quality video generation with natural motion
Stable temporal consistency across the entire sequence
Accurate semantic understanding of text prompts
Support for multiple resolutions and output durations
Standard optimization for balanced quality, speed, and cost

How to Use

Write Your Prompt
Describe the scene, action, camera movement, and overall mood.
Example: “A young woman walking through a neon-lit Tokyo street at night, rain reflecting city lights, cinematic tracking shot”

Set Parameters
Choose the desired duration and aspect ratio.

Generate
Submit the request and receive a coherent video generated from text.
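
The first step can be sketched as a small helper that joins the recommended prompt elements (scene, action, camera movement, mood) into one string. This is only an illustration: `compose_prompt` and its parameter names are hypothetical and not part of the API, which accepts any free-form prompt text.

```python
def compose_prompt(scene: str, action: str = "", camera: str = "", mood: str = "") -> str:
    """Join the prompt elements into one comma-separated prompt string.

    Hypothetical helper: the API only sees the final string.
    """
    if not scene or not scene.strip():
        raise ValueError("scene is required")
    parts = [scene, action, camera, mood]
    # Skip elements that are empty or whitespace-only.
    return ", ".join(part.strip() for part in parts if part and part.strip())
```

For instance, `compose_prompt("a lighthouse on a cliff", camera="slow aerial orbit")` yields a prompt containing only the scene and the camera direction.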

Pricing

Duration	Price
5s	$0.42
10s	$0.84

Billed based on the selected output duration. Pricing is optimized for standard production workloads.
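
Both rows of the table work out to the same rate of $0.084 per second ($0.42 / 5 and $0.84 / 10). For budgeting a batch, a minimal sketch might look like the following. It assumes the prices listed above are still current, and `estimate_cost` is a hypothetical helper name.

```python
from decimal import Decimal

# Prices copied from the pricing table (USD per generated video).
PRICE_BY_DURATION = {5: Decimal("0.42"), 10: Decimal("0.84")}


def estimate_cost(duration: int, count: int = 1) -> Decimal:
    """Return the estimated USD cost of `count` videos of `duration` seconds."""
    if duration not in PRICE_BY_DURATION:
        raise ValueError(f"unsupported duration: {duration} (expected 5 or 10)")
    if count < 0:
        raise ValueError("count must be non-negative")
    return PRICE_BY_DURATION[duration] * count
```

Using `Decimal` instead of `float` keeps the totals exact when many clips are summed.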

Pro Tips

Use clear and descriptive prompts
Specify camera movement and framing for better motion quality
Include lighting, environment, and atmosphere details
Choose the Standard model for large-scale generation and cost-sensitive use cases

Kling O1 series models

kwaivgi/kling-video-o1-std — Video Edit — Edit videos with natural-language instructions for precise, context-aware changes like object removal, scene adjustments, and style refinement while preserving motion consistency.
kwaivgi/kling-video-o1-std — Reference to Video — Generate new videos guided by a reference video to match its style, identity, or motion patterns, ideal for consistent visual storytelling and content iteration.
kwaivgi/kling-video-o1-std — Image to Video — Animate a single image into a high-quality video clip with smooth motion and coherent scene continuity, perfect for marketing creatives and social content.

Authentication

For authentication details, please refer to the Authentication Guide.

API Endpoints

Submit Task & Query Result


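# Submit the task ("prompt" is required; "aspect_ratio" and "duration" are optional)
curl --location --request POST "https://api.wavespeed.ai/api/v3/kwaivgi/kling-video-o1-std/text-to-video" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
    "prompt": "A young woman walking through a neon-lit Tokyo street at night, rain reflecting city lights, cinematic tracking shot",
    "aspect_ratio": "16:9",
    "duration": 5
}'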

# Get the result (set requestId to the data.id value returned by the submit call)
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
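
The same two calls can be sketched in Python using only the standard library. The URLs and headers mirror the curl commands above. The function names (`build_submit_request`, `build_result_request`, `send`) and the 30-second timeout are illustrative assumptions, not part of any official SDK.

```python
import json
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"
MODEL_PATH = "kwaivgi/kling-video-o1-std/text-to-video"


def build_submit_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build the POST request that submits a text-to-video task."""
    return urllib.request.Request(
        url=f"{API_BASE}/{MODEL_PATH}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def build_result_request(api_key: str, request_id: str) -> urllib.request.Request:
    """Build the GET request that queries the result of a submitted task."""
    return urllib.request.Request(
        url=f"{API_BASE}/predictions/{request_id}/result",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Building the requests separately from sending them makes it possible to inspect the URL, headers, and body without network access. A typical call would read the key from the `WAVESPEED_API_KEY` environment variable and pass `build_submit_request(key, payload)` to `send`.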

Parameters

Task Submission Parameters

Request Parameters

Parameter	Type	Required	Default	Range	Description
prompt	string	Yes	-	-	The positive prompt for the generation.
aspect_ratio	string	No	16:9	16:9, 9:16, 1:1	The aspect ratio of the generated video.
duration	integer	No	5	5, 10	The duration of the generated media in seconds.
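
The constraints in the table can be checked on the client before a request is sent, which avoids spending a round trip on an invalid payload. The sketch below encodes the defaults and allowed values from the table. `build_payload` is a hypothetical helper, and the server remains the authority on what it accepts.

```python
# Allowed values and defaults taken from the request parameter table.
ALLOWED_ASPECT_RATIOS = ("16:9", "9:16", "1:1")
ALLOWED_DURATIONS = (5, 10)


def build_payload(prompt: str, aspect_ratio: str = "16:9", duration: int = 5) -> dict:
    """Validate the parameters and return the JSON-serializable request body."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt is required and must be a non-empty string")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {ALLOWED_ASPECT_RATIOS}")
    if duration not in ALLOWED_DURATIONS:
        raise ValueError(f"duration must be one of {ALLOWED_DURATIONS}")
    return {"prompt": prompt.strip(), "aspect_ratio": aspect_ratio, "duration": duration}
```

Calling `build_payload("a red kite over a beach")` returns a body with the default `16:9` aspect ratio and a 5-second duration.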

Response Parameters

Parameter	Type	Description
code	integer	HTTP status code (e.g., 200 for success)
message	string	Status message (e.g., “success”)
data.id	string	Unique identifier for the prediction (the task ID)
data.model	string	Model ID used for the prediction
data.outputs	array	Array of URLs to the generated content (empty when status is not `completed`)
data.urls	object	Object containing related API endpoints
data.urls.get	string	URL to retrieve the prediction result
data.has_nsfw_contents	array	Array of boolean values indicating NSFW detection for each output
data.status	string	Status of the task: `created`, `processing`, `completed`, or `failed`
data.created_at	string	ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”)
data.error	string	Error message (empty if no error occurred)
data.timings	object	Object containing timing details
data.timings.inference	integer	Inference time in milliseconds
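
A submission response mainly matters for two fields: `data.id`, which identifies the task, and `data.urls.get`, which points at the result endpoint. A minimal sketch of extracting them follows. `extract_task` is a hypothetical helper, and treating any `code` other than 200 as a failure is an assumption based on the example value in the table.

```python
def extract_task(submit_response: dict) -> tuple:
    """Return (task_id, result_url) from a decoded submission response."""
    # Assumption: 200 is the only success code, per the table's example.
    if submit_response.get("code") != 200:
        raise RuntimeError(f"submission failed: {submit_response.get('message')}")
    data = submit_response.get("data") or {}
    task_id = data.get("id")
    if not task_id:
        raise RuntimeError("submission response has no data.id")
    # data.urls.get may be absent; callers can fall back to building the URL from task_id.
    result_url = (data.get("urls") or {}).get("get")
    return task_id, result_url
```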

Result Request Parameters

Parameter	Type	Required	Default	Description
id	string	Yes	-	Task ID

Result Response Parameters

Parameter	Type	Description
code	integer	HTTP status code (e.g., 200 for success)
message	string	Status message (e.g., “success”)
data	object	The prediction data object containing all details
data.id	string	Unique identifier for the prediction (the task ID that was queried)
data.model	string	Model ID used for the prediction
data.outputs	array	Array of URLs to the generated content (empty when status is not `completed`)
data.urls	object	Object containing related API endpoints
data.urls.get	string	URL to retrieve the prediction result
data.status	string	Status of the task: `created`, `processing`, `completed`, or `failed`
data.created_at	string	ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”)
data.error	string	Error message (empty if no error occurred)
data.timings	object	Object containing timing details
data.timings.inference	integer	Inference time in milliseconds
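
Because `data.outputs` stays empty until `data.status` becomes `completed`, a client typically polls the result endpoint until the task reaches a terminal status. The loop below is a sketch of that pattern. The 2-second interval and 150-attempt cap are arbitrary assumptions, not documented limits, and `fetch_result` stands for any caller-supplied function that returns the decoded result response.

```python
import time
from typing import Callable


def wait_for_outputs(
    fetch_result: Callable[[], dict],
    interval_seconds: float = 2.0,
    max_attempts: int = 150,
    sleep: Callable[[float], None] = time.sleep,
) -> list:
    """Poll until the task completes, then return the list in data.outputs."""
    for _ in range(max_attempts):
        data = fetch_result().get("data") or {}
        status = data.get("status")
        if status == "completed":
            return data.get("outputs", [])
        if status == "failed":
            raise RuntimeError(data.get("error") or "task failed without an error message")
        # `created` and `processing` are non-terminal: wait, then query again.
        sleep(interval_seconds)
    raise TimeoutError(f"task not completed after {max_attempts} attempts")
```

Passing `sleep` as a parameter keeps the loop easy to exercise in tests without real delays.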
