
text-to-video
Your request will cost $0.42 per run.
For $10 you can run this model approximately 23 times.
Kling Omni Video O1 is Kuaishou's unified multi-modal video generation model, optimized for stable production use and cost efficiency.
The Text-to-Video mode transforms natural language prompts into high-quality videos with coherent motion, accurate semantic understanding, and consistent visual output.
The model supports multiple video generation and editing workflows within a single system, including text-to-video, image-to-video, reference-to-video, and video editing.
The model interprets instructions through MVL, enabling understanding of:
The model maintains stable characters, objects, and scene attributes across frames, which keeps results reliable and repeatable for production workflows.
Write Your Prompt
Describe the scene, action, camera movement, and overall mood.
Example: "A young woman walking through a neon-lit Tokyo street at night, rain reflecting city lights, cinematic tracking shot"
Set Parameters
Choose the desired duration and aspect ratio.
Generate
Submit the request and receive a coherent video generated from text.
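The three steps above can be sketched as a small request builder. This is only an illustration: the field names (`prompt`, `duration`, `aspect_ratio`), the `build_request` helper, and the `16:9` default are assumptions, not the documented API schema, and the sketch stops short of sending anything over the network. The two allowed durations are taken from the pricing table on this page.

```python
import json

# Durations (in seconds) that appear in the pricing table on this page.
SUPPORTED_DURATIONS = (5, 10)


def build_request(prompt: str, duration: int = 5, aspect_ratio: str = "16:9") -> dict:
    """Assemble a text-to-video request body.

    The field names and the "16:9" default are hypothetical; check the
    provider's API reference for the real schema and accepted values.
    """
    prompt = prompt.strip()
    if not prompt:
        raise ValueError("prompt must not be empty")
    if duration not in SUPPORTED_DURATIONS:
        raise ValueError(f"duration must be one of {SUPPORTED_DURATIONS}")
    return {
        "prompt": prompt,              # step 1: scene, action, camera, mood
        "duration": duration,          # step 2: output length in seconds
        "aspect_ratio": aspect_ratio,  # step 2: passed through unvalidated
    }


payload = build_request(
    "A young woman walking through a neon-lit Tokyo street at night, "
    "rain reflecting city lights, cinematic tracking shot",
    duration=5,
    aspect_ratio="16:9",
)

# Step 3 would submit this body to the service; here it is only printed.
print(json.dumps(payload, indent=2))
```

Validating the duration before submission avoids paying for a request the service would reject. The aspect ratio is passed through as-is because this page does not list the accepted values.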
| duration | price |
|---|---|
| 5s | $0.42 |
| 10s | $0.84 |
Billing is based on the selected output duration. Pricing is optimized for standard production workloads.
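As a quick check of the arithmetic, the sketch below estimates per-run cost and how many whole runs a budget covers, using only the prices from the table above. The helper names `run_cost` and `runs_for_budget` are hypothetical and exist only for this example.

```python
from decimal import Decimal

# Price per run in USD, keyed by output duration in seconds (from the table above).
PRICE_PER_RUN = {5: Decimal("0.42"), 10: Decimal("0.84")}


def run_cost(duration: int) -> Decimal:
    """Return the price of a single run for the given duration."""
    if duration not in PRICE_PER_RUN:
        raise ValueError(f"no listed price for a {duration}s video")
    return PRICE_PER_RUN[duration]


def runs_for_budget(budget, duration: int = 5) -> int:
    """Return how many whole runs fit in the budget (partial runs are dropped)."""
    return int(Decimal(str(budget)) // run_cost(duration))


print(run_cost(10))             # 0.84
print(runs_for_budget(10, 5))   # 23
print(runs_for_budget(10, 10))  # 11
```

`Decimal` is used instead of floats so that currency amounts such as 0.42 are represented exactly. The result for a $10 budget at 5 seconds matches the "approximately 23 times" figure quoted at the top of the page.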
kwaivgi/kling-video-o1-std — Video Edit — Edit videos with natural-language instructions for precise, context-aware changes like object removal, scene adjustments, and style refinement while preserving motion consistency.
kwaivgi/kling-video-o1-std — Reference to Video — Generate new videos guided by a reference video to match its style, identity, or motion patterns, ideal for consistent visual storytelling and content iteration.
kwaivgi/kling-video-o1-std — Image to Video — Animate a single image into a high-quality video clip with smooth motion and coherent scene continuity, perfect for marketing creatives and social content.