
text-to-video
Each run costs $0.40.
For $10 you can run this model about 25 times.
Vidu Q1 Text-to-Video is a high-end video generation model built on Shengshu Technology’s Vidu Q-series architecture. It transforms natural-language prompts into cinematic 720p videos with exceptional realism, diverse motion, and consistent visual fidelity, and is optimized for creative professionals and production use.
High-Fidelity Generation: Produces visually rich, detailed videos with natural lighting, textures, and depth.
Motion Diversity: Captures a wide range of subject and camera motion, from subtle gestures to complex dynamic scenes.
Temporal Consistency: Keeps frames coherent and motion transitions smooth, without flicker or distortion.
Prompt-Driven Storytelling: Understands complex prompts and generates a coherent narrative flow that stays visually aligned with the text.
Cinematic Quality (720p): Produces high-quality output suitable for editing, marketing, and storytelling.
prompt: Describe your desired scene, action, or atmosphere.
movement_amplitude: Controls the motion intensity. Options:
  auto: Adaptive movement based on scene content.
  small: Subtle or static scenes.
  medium: Balanced motion.
  large: Dramatic or action-focused motion.
style: Choose general or anime.
duration: 5 seconds per generation.
seed: Optional; set a fixed number for reproducible results.
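
The parameters listed above can be gathered and checked before submitting a job. The following is a minimal sketch, not a documented client: the function name `build_vidu_q1_payload`, the dictionary layout, and the defaults (`auto`, `general`) are assumptions made for illustration. Only the parameter names and their allowed values come from the list above.

```python
from typing import Optional

# Allowed values taken from the parameter list above.
MOVEMENT_AMPLITUDES = {"auto", "small", "medium", "large"}
STYLES = {"general", "anime"}
DURATION_SECONDS = 5  # each generation is 5 seconds long


def build_vidu_q1_payload(
    prompt: str,
    movement_amplitude: str = "auto",  # assumed default
    style: str = "general",            # assumed default
    seed: Optional[int] = None,
) -> dict:
    """Validate the inputs and return a payload dict (hypothetical layout)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if movement_amplitude not in MOVEMENT_AMPLITUDES:
        raise ValueError(
            f"movement_amplitude must be one of {sorted(MOVEMENT_AMPLITUDES)}"
        )
    if style not in STYLES:
        raise ValueError(f"style must be one of {sorted(STYLES)}")

    payload = {
        "prompt": prompt,
        "movement_amplitude": movement_amplitude,
        "style": style,
        "duration": DURATION_SECONDS,
    }
    # seed is optional; include it only when reproducibility is wanted.
    if seed is not None:
        payload["seed"] = seed
    return payload


example = build_vidu_q1_payload(
    "A red fox running through fresh snow at sunrise",
    movement_amplitude="large",
    style="general",
    seed=42,
)
```

Fixing `seed` while keeping the other fields unchanged is the way to request reproducible results; omitting it leaves the choice to the service.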
| Resolution | Duration | Cost per Clip |
|---|---|---|
| 720p | 5s | $0.40 |
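
The flat per-clip price in the table makes budgeting simple arithmetic. The sketch below works in whole cents to avoid floating-point rounding; the helper names are hypothetical, and it assumes every clip is billed at the single listed rate with no discounts or extra fees.

```python
PRICE_PER_CLIP_CENTS = 40  # $0.40 per 720p, 5-second clip (from the table)
CLIP_SECONDS = 5


def clips_for_budget(budget_cents: int) -> int:
    """Number of whole clips a budget covers."""
    return budget_cents // PRICE_PER_CLIP_CENTS


def cost_cents_for_footage(total_seconds: int) -> int:
    """Cost of enough 5-second clips to cover total_seconds of footage."""
    clips_needed = -(-total_seconds // CLIP_SECONDS)  # ceiling division
    return clips_needed * PRICE_PER_CLIP_CENTS


print(clips_for_budget(1000))        # $10 budget -> 25 clips
print(cost_cents_for_footage(60))    # 60 s of footage -> 12 clips -> 480 cents
```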