
Kling Video O1 Std Image to Video


Kling Omni Video O1 Image-to-Video (Standard) turns static images into dynamic, high-quality videos while preserving subject identity and visual/temporal consistency. It adds natural motion, realistic physics, and smooth scene dynamics, and supports flexible clip durations when reference frames are provided. Built for stable production use and cost efficiency with a ready-to-use REST API, fast response, no cold starts, and predictable pricing.

image-to-video

$0.42 per run · ~23 runs per $10


Examples

The camera slowly pushes in as she sketches, the pencil making faint scratching sounds beneath the calls of distant gulls.

The woman walks down the lantern-lit alley, rain tapping gently on her umbrella as she approaches the café entrance. She steps inside, closing the umbrella beside her, and walks to the counter. After receiving the cup, she carries it to a window seat, sits down, and wraps her hands around the warm drink while watching the rain outside.

The hands slowly peel away the transparent wrapping with a gentle crinkling sound, then lift the lid to reveal a shiny new device nestled in foam.

The knife slices smoothly through the orange

A child floats through a dream, surrounded by glowing dandelions and talking animals, against a backdrop of ever-shifting starry skies and candy-colored clouds.

A semi-mechanical, semi-biological sea creature swims in the deep ocean, its body composed of glowing circuits and mechanical bones, surrounded by the ruins of a submerged futuristic city.


README

Kling Omni Video O1 - Text-to-Video (Standard)

Kling Omni Video O1 (Standard) is Kuaishou's unified multi-modal video generation model, optimized for cost efficiency and stable production use. The Text-to-Video mode transforms natural language prompts into high-quality videos with coherent motion, strong scene understanding, and cinematic results.

Key Capabilities

Intelligent Text-to-Video Generation

Generate videos directly from text descriptions:

  • Converts natural language into dynamic video scenes
  • Understands actions, environments, and visual styles
  • Produces smooth, temporally consistent motion

Scene and Motion Coherence

Advanced video reasoning ensures:

  • Logical object interactions and movement flow
  • Stable scene structure across frames
  • Consistent lighting, colors, and atmosphere

Multi-Modal Prompt Understanding

Use descriptive prompts to control:

  • Subject appearance and actions
  • Camera movement and framing
  • Mood, style, and scene dynamics

Core Features

  • Text-Driven Video Synthesis — From prompt to video in one step

  • Temporal Consistency — Stable visuals across the entire sequence

  • Cinematic Motion — Natural movement and camera dynamics

  • Standard Optimization — Balanced quality, speed, and cost

  • Adaptive Duration Control — Video length adapts based on input conditions

      • When last_image is provided, supports flexible durations from 3 to 10 seconds

      • Without last_image, generation is limited to 5s or 10s for optimal stability
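
The duration rule above can be checked on the client before a request is submitted. The following is a minimal sketch, not part of the WaveSpeedAI SDK: the function name validate_duration is hypothetical, and it assumes durations are whole seconds.

```python
from typing import Optional

# Rules taken from the feature list above; whole-second values are an assumption.
FIXED_DURATIONS = (5, 10)       # allowed when no last_image is given
FLEXIBLE_RANGE = range(3, 11)   # 3..10 inclusive when last_image is given


def validate_duration(duration: int, last_image: Optional[str] = None) -> int:
    """Return duration unchanged if it is allowed, otherwise raise ValueError."""
    if last_image:
        if duration not in FLEXIBLE_RANGE:
            raise ValueError("with last_image, duration must be 3 to 10 seconds")
    elif duration not in FIXED_DURATIONS:
        raise ValueError("without last_image, duration must be 5 or 10 seconds")
    return duration
```

Calling validate_duration(7) raises ValueError, while validate_duration(7, "https://example.com/end.jpg") passes, which mirrors the two cases listed above.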

How to Use

  1. Enter Your Text Prompt: Describe the scene, subject, and actions in natural language.

  2. Refine with Details (Optional): Add style, camera motion, or environment cues.

     Example: "A futuristic city at night, neon lights reflecting on wet streets, slow cinematic camera pan"

  3. Set Parameters: Choose video duration and whether to use start and end frames for generation.

  4. Generate: Receive a coherent, dynamic video generated entirely from text.

Pricing

duration      price
per second    $0.084
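
As a rough illustration of the per-second rate, the sketch below estimates the charge for a clip. It assumes billing is strictly linear in duration with no rounding or surcharges, which the page does not state explicitly; estimate_cost is a hypothetical helper name.

```python
from decimal import Decimal

PRICE_PER_SECOND = Decimal("0.084")  # USD, from the pricing table above


def estimate_cost(duration_seconds: int) -> Decimal:
    """Estimated charge in USD for one clip, assuming linear per-second billing."""
    return PRICE_PER_SECOND * duration_seconds


print(estimate_cost(5))   # 0.420
print(estimate_cost(10))  # 0.840
```

Under that assumption a 5-second clip comes to $0.42, which matches the per-run price shown in the playground.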

Pro Tips

  • Use clear, descriptive prompts for best results
  • Specify actions and camera movement for more dynamic videos
  • Combine environment and motion details for cinematic quality
  • Ideal for large-scale generation and cost-sensitive use cases


Accessibility: The AI models used on this site are provided by third parties.

Kling Video O1 Std Image To Video API — Quick start

Grab a WaveSpeedAI API key, then call POST https://api.wavespeed.ai/api/v3/kwaivgi/kling-video-o1-std/image-to-video with your input as JSON. The endpoint returns a prediction id (the {request_id} in the polling URL); poll the prediction endpoint until status flips to completed, then read the output URL from data.outputs[0]. HTTP, Node.js, and Python examples follow.

HTTP example
# Submit the prediction
curl -X POST "https://api.wavespeed.ai/api/v3/kwaivgi/kling-video-o1-std/image-to-video" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY" \
  -d '{
    "prompt": "A cinematic shot of a city at sunset, soft golden light",
    "image": "https://example.com/your-input.jpg",
    "duration": 5
}'

# Response includes a prediction id. Poll for the result:
curl -X GET "https://api.wavespeed.ai/api/v3/predictions/{request_id}/result" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY"

# When status is "completed", read the output from data.outputs[0].
Node.js example
// npm install wavespeed
const WaveSpeed = require('wavespeed');

const client = new WaveSpeed(); // reads WAVESPEED_API_KEY from env

// Top-level await is not available in CommonJS, so wrap the call in an async function.
async function main() {
  const result = await client.run("kwaivgi/kling-video-o1-std/image-to-video", {
    prompt: "A cinematic shot of a city at sunset, soft golden light",
    image: "https://example.com/your-input.jpg",
    duration: 5
  });

  console.log(result.outputs[0]); // → URL of the generated output
}

main().catch(console.error);
Python example
# pip install wavespeed
import wavespeed

output = wavespeed.run(
    "kwaivgi/kling-video-o1-std/image-to-video",
    {
        "prompt": "A cinematic shot of a city at sunset, soft golden light",
        "image": "https://example.com/your-input.jpg",
        "duration": 5,
    },
)

print(output["outputs"][0])  # → URL of the generated output
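
If you call the REST endpoint directly instead of using the SDK, the polling step described in the quick start could look roughly like the sketch below. It uses only the Python standard library. The response shape (a top-level "data" object with "status" and "outputs") follows the quick start text; the "failed" status name, the polling interval, and the timeout are assumptions, and fetch_result and wait_for_output are hypothetical helper names.

```python
import json
import os
import time
import urllib.request

RESULT_URL = "https://api.wavespeed.ai/api/v3/predictions/{request_id}/result"


def fetch_result(request_id: str) -> dict:
    """GET the prediction record and return the parsed JSON body."""
    req = urllib.request.Request(
        RESULT_URL.format(request_id=request_id),
        headers={"Authorization": f"Bearer {os.environ['WAVESPEED_API_KEY']}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def wait_for_output(request_id, fetch=fetch_result, interval=2.0, timeout=300.0):
    """Poll until status is "completed", then return data.outputs[0]."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        data = fetch(request_id)["data"]
        if data["status"] == "completed":
            return data["outputs"][0]
        if data["status"] == "failed":  # assumed name of the failure status
            raise RuntimeError(f"prediction {request_id} failed")
        time.sleep(interval)
    raise TimeoutError(f"prediction {request_id} not finished after {timeout}s")
```

The fetch parameter is injectable so the loop can be exercised without network access; in normal use, wait_for_output(request_id) with the id returned by the POST call is enough.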

Kling Video O1 Std Image To Video API — Frequently asked questions

What is the Kling Video O1 Std Image To Video API?

Kling Video O1 Std Image To Video is a Kuaishou model for video generation from images, exposed as a REST API on WaveSpeedAI. It turns static images into dynamic videos while preserving subject identity and visual/temporal consistency, and supports flexible clip durations when reference frames are provided. You can call it programmatically or try it from the playground above.

How do I call the Kling Video O1 Std Image To Video API?

POST your input parameters to the model's REST endpoint (shown in the API tab of this playground) with your WaveSpeedAI API key in the Authorization header. Submission returns a prediction ID; poll the prediction endpoint until status flips to "completed", then read the output URL from the result. The playground generates a ready-to-paste code sample in Python, JavaScript, or cURL for whatever inputs you've set. Full request/response shape is documented at https://wavespeed.ai/docs/docs-api/kwaivgi/kwaivgi-kling-video-o1-std-image-to-video.

How much does Kling Video O1 Std Image To Video cost per run?

Kling Video O1 Std Image To Video starts at $0.42 per run, which corresponds to a 5-second clip at the README rate of $0.084 per second. The final charge scales with the parameters you set in the form, chiefly clip duration, so a 10-second clip costs more than a 5-second one. The exact cost for your current input is shown live next to the Generate button before you submit, and the actual per-call charge is recorded on the prediction afterwards.

What inputs does Kling Video O1 Std Image To Video accept?

Key inputs: `prompt`, `image`, `duration`, `last_image`. The full JSON schema (types, defaults, allowed values) is rendered above the Generate button and mirrored in the API reference at https://wavespeed.ai/docs/docs-api/kwaivgi/kwaivgi-kling-video-o1-std-image-to-video.
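
As an illustration of how these four inputs fit together, the sketch below assembles a request body. It assumes `last_image` is an image URL like `image` and that it can simply be omitted when unused; the default duration of 5 and the helper name build_payload are also assumptions, so check the API reference for the authoritative schema.

```python
import json
from typing import Optional


def build_payload(prompt: str, image: str, duration: int = 5,
                  last_image: Optional[str] = None) -> dict:
    """Assemble the JSON body; last_image is only sent when provided."""
    payload = {"prompt": prompt, "image": image, "duration": duration}
    if last_image is not None:
        payload["last_image"] = last_image  # assumed to be a URL, like `image`
    return payload


print(json.dumps(build_payload(
    "The camera slowly pushes in as she sketches",
    "https://example.com/start.jpg",
    duration=7,
    last_image="https://example.com/end.jpg",
), indent=2))
```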

How long does Kling Video O1 Std Image To Video take to generate?

Average end-to-end generation time on WaveSpeedAI is around 51 seconds per request — measured across recent runs. Queue time scales with global demand; live status is visible in the prediction record.

Can I use Kling Video O1 Std Image To Video outputs commercially?

Commercial usage rights depend on the model's license, set by its provider (Kuaishou). The license summary appears on the model card above; see WaveSpeedAI's Terms of Service for platform-level conditions.