Special Models
• vidu/one-click-v2/mv
Transforms images and audio into dynamic music videos with camera movement and subtitle support. Designed for fast creation of professional video content in a single workflow.
• vidu/drama
A specialized model designed for short-form drama and narrative content. It is optimized for emotionally expressive scenes, story-driven pacing, character-focused performance, and dramatic visual continuity, making it ideal for mini-series, episodic content, and serialized storytelling.
• vidu/ad
A specialized commercial video generation model built for advertising and promotional content. It is designed for product showcases, brand storytelling, marketing campaigns, and conversion-focused creatives, with an emphasis on polished visuals, product clarity, and production-ready commercial output.
Image-to-Video Models
• vidu/q3/image-to-video
The newest-generation image-to-video model in the Vidu lineup, delivering best-in-class motion quality, strong structural fidelity, and cinematic realism. Ideal for complex scenes, expressive motion, and fine-grained detail preservation.
• vidu/q2-pro/image-to-video-fast
A professional-grade image-to-video model optimized for speed. It combines Q2 Pro's sharp details, stable identity preservation, and polished motion with significantly lower latency for high-volume production workflows.
• vidu/image-to-video-q2-pro
A premium image-to-video model offering sharper visual detail, more stable character identity, and refined cinematic motion. Well suited for polished production assets, hero shots, and client-facing deliverables.
• vidu/image-to-video-q2-turbo
A high-speed image-to-video model built for complex scenes and multi-character compositions. It delivers smooth, coherent motion and strong structure preservation while supporting rapid preview and iteration.
• vidu/image-to-video-q1
A high-fidelity image-to-video model with enhanced texture detail and strong portrait performance. It maintains lighting and identity consistency while generating cinematic motion and expressive character behavior.
• vidu/image-to-video-2.0
Transforms a single image into a smooth, coherent video while preserving composition, structure, and layout. Offers strong temporal stability and natural camera movement for professional editing and post-production pipelines.
• vidu/image-to-video
A lightweight and efficient baseline image-to-video model for rapid ideation, early-stage drafts, and social media content. It balances speed, clean motion, and structural consistency.
Text-to-Video Models
• vidu/q3/text-to-video
The most advanced text-to-video model in the Vidu family, delivering superior prompt adherence, richer scene composition, and more natural multi-character interactions for premium storytelling and commercial production.
• vidu/text-to-video-q2
A flagship text-to-video model with stronger temporal coherence, richer scene detail, and more precise camera and motion control. Designed for complex narratives, branded content, and high-end creative use cases.
• vidu/text-to-video-q1
A high-fidelity text-to-video model with richer color, sharper details, and stronger narrative continuity. Ideal for cinematic storytelling, visual branding, and polished marketing content.
• vidu/text-to-video-2.0
Generates videos directly from prompts with reliable prompt adherence, coherent multi-object scenes, and controllable camera movement. A strong choice for conceptual, narrative, and creative video generation.
• vidu/text-to-video
A streamlined baseline text-to-video model optimized for efficiency and turnaround speed. Suitable for ads, explainers, and fast iteration on text-driven concepts.
Reference-to-Video Models
• vidu/reference-to-video-q2
A reference-guided video generation model that supports multiple objects or characters within a single scene, enabling more complex compositions and richer interactions.
• vidu/reference-to-video-q1
An upgraded reference-based generator with sharper details and more faithful style and identity transfer. It reduces drift and artifacts, especially in close-ups and longer shots.
• vidu/reference-to-video-2.0
Creates videos from a reference image while preserving character likeness, visual style, wardrobe consistency, and overall scene coherence across frames.
Start-End Frame Video Models
• vidu/q2-pro/start-end-to-video-fast
A professional-grade start-end interpolation model optimized for speed. It combines reinforced temporal coherence with fast generation for efficient production workflows.
• vidu/start-end-to-video-q2-pro
A high-end model focused on precise motion control and reinforced temporal coherence. It generates stable intermediate frames while closely following user-defined start and end constraints.
• vidu/start-end-to-video-q2-turbo
A fast start-end video model built for rapid iteration and preview. It preserves subject integrity and core visual coherence while reducing generation latency.
• vidu/start-end-to-video-q1
A start-end video model with improved motion smoothness and narrative continuity, producing more natural transitions between poses, camera positions, and scene states.
• vidu/start-end-to-video-2.0
Synthesizes smooth motion between user-defined start and end frames while respecting scene geometry and composition. Ideal for transitions, reveals, and structured motion design.
• vidu/start-end-to-video
A compact baseline model for simple start-end interpolation and quick previews. Suitable for basic transitions, animatics, and fast storyboard development.
Image Models
• vidu/text-to-image-q2
A high-resolution cinematic text-to-image model for generating polished hero images, thumbnails, posters, and key visuals directly from prompts.
• vidu/reference-to-image-q2
A reference-guided image generation model that supports up to seven input images plus a text prompt to create new high-resolution visuals while preserving subject identity and composition.
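
The identifiers above follow two spellings: newer entries put the generation first (vidu/q3/image-to-video), while older ones append it to the task name (vidu/image-to-video-q2-pro). As a rough sketch, the catalog could be kept as a lookup table keyed by category, with a small parser that reads the generation tag from either spelling. The model identifiers are copied from this list; the names CATALOG, category_of, and parse_generation are hypothetical helpers introduced for illustration and are not part of any Vidu SDK or API.

```python
import re
from typing import Optional, Tuple

# Model identifiers copied from the catalog, grouped by its section headings.
CATALOG = {
    "special": [
        "vidu/one-click-v2/mv",
        "vidu/drama",
        "vidu/ad",
    ],
    "image-to-video": [
        "vidu/q3/image-to-video",
        "vidu/q2-pro/image-to-video-fast",
        "vidu/image-to-video-q2-pro",
        "vidu/image-to-video-q2-turbo",
        "vidu/image-to-video-q1",
        "vidu/image-to-video-2.0",
        "vidu/image-to-video",
    ],
    "text-to-video": [
        "vidu/q3/text-to-video",
        "vidu/text-to-video-q2",
        "vidu/text-to-video-q1",
        "vidu/text-to-video-2.0",
        "vidu/text-to-video",
    ],
    "reference-to-video": [
        "vidu/reference-to-video-q2",
        "vidu/reference-to-video-q1",
        "vidu/reference-to-video-2.0",
    ],
    "start-end-to-video": [
        "vidu/q2-pro/start-end-to-video-fast",
        "vidu/start-end-to-video-q2-pro",
        "vidu/start-end-to-video-q2-turbo",
        "vidu/start-end-to-video-q1",
        "vidu/start-end-to-video-2.0",
        "vidu/start-end-to-video",
    ],
    "image": [
        "vidu/text-to-image-q2",
        "vidu/reference-to-image-q2",
    ],
}

# Assumption: a generation tag is "q<digits>" or "<digits>.<digits>", bounded by
# "/" or "-", optionally followed by a "-pro" or "-turbo" variant suffix.
_GENERATION = re.compile(r"(?:^|[/-])(q\d+|\d+\.\d+)(?:-(pro|turbo))?(?=$|[/-])")


def category_of(model_id: str) -> str:
    """Return the catalog section a model identifier is listed under."""
    for category, model_ids in CATALOG.items():
        if model_id in model_ids:
            return category
    raise KeyError(f"unknown model id: {model_id}")


def parse_generation(model_id: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (generation, variant) guessed from the identifier's spelling.

    Both values are None for identifiers that carry no generation tag,
    such as the baseline models.
    """
    match = _GENERATION.search(model_id)
    if match is None:
        return (None, None)
    return (match.group(1), match.group(2))
```

For example, parse_generation("vidu/q2-pro/image-to-video-fast") and parse_generation("vidu/image-to-video-q2-pro") both return ("q2", "pro"), while parse_generation("vidu/image-to-video") returns (None, None). The parser only reads the spelling of an identifier; it does not say anything about which parameters a given model accepts.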