Vidu is Shengshu Technology's video generation suite, spanning the Q3, Q2, Q1, and 2.0 model series alongside baseline and special-purpose models. Built on open-source diffusion backbones and trained on large-scale, high-quality datasets, Vidu covers image-to-video, text-to-video, reference-to-video, and start-end frame generation, as well as still-image generation. Its models offer precise control, consistent visual quality, and robust temporal stability, making Vidu suitable for professional, production-grade workflows.
Image-to-Video Models
• vidu/q3/image-to-video The newest-generation image-to-video model, with the strongest motion quality, structural fidelity, and cinematic realism in the lineup. Handles complex scenes well and preserves fine-grained detail.
• vidu/q2-pro/image-to-video-fast Professional-grade image-to-video generation at turbo speed. Combines Q2-Pro's sharp detail and stable identity with significantly reduced latency for high-volume production pipelines.
• vidu/image-to-video-q2-pro A professional-grade image-to-video model offering sharper detail, more stable character identity, and refined cinematic motion. Suited for polished production assets, hero shots, and client-facing deliverables.
• vidu/image-to-video-q2-turbo A high-speed image-to-video model for complex scenes and multi-character shots. Delivers smooth, coherent motion and solid structure preservation while enabling near real-time preview and refinement.
• vidu/image-to-video-q1 A premium image-to-video model with enhanced texture detail and superior portrait handling. Maintains lighting and identity consistency while generating cinematic motion and expressive character performance.
• vidu/image-to-video-2.0 Transforms a single image into a smooth, coherent video while preserving structure, composition, and layout. Provides strong temporal stability and natural camera motion for professional post-production and editing pipelines.
• vidu/image-to-video A lightweight, fast I2V model for rapid drafts, ideation, and social media content. Balances speed and structural preservation, producing clean clips with minimal artifacts.
Text-to-Video Models
• vidu/q3/text-to-video The most advanced text-to-video model in the Vidu lineup. Delivers superior prompt adherence, richer scene composition, and more natural multi-character interactions for high-end creative and commercial storytelling.
• vidu/text-to-video-q2 A flagship text-to-video model with stronger temporal coherence, richer scene detail, and more precise camera and motion control. Designed for complex, multi-character narratives and high-end commercial storytelling.
• vidu/text-to-video-q1 A high-fidelity T2V model offering richer color, sharper detail, and stronger narrative continuity. Ideal for cinematic storytelling, branding, and visually polished marketing assets.
• vidu/text-to-video-2.0 Generates videos directly from text prompts with reliable prompt adherence, coherent multi-object scenes, and controllable camera motion. Well suited for high-quality conceptual and narrative video generation.
• vidu/text-to-video A baseline T2V option optimized for efficiency and turnaround speed. Designed for ads, explainers, and straightforward text-driven concepts where fast iteration is key.
Reference-to-Video Models
• vidu/reference-to-video-q2 Supports multiple distinct objects or characters interacting within a single video, enabling complex, reference-guided scene compositions.
• vidu/reference-to-video-q1 An upgraded reference-based generator with sharper details and more faithful style and identity transfer. Reduces drift and artifacts, especially in close-ups and longer shots.
• vidu/reference-to-video-2.0 Creates videos guided by a reference image, ensuring accurate character likeness, stable style control, and consistent wardrobe and appearance across frames.
Start-End Frame Video Models
• vidu/q2-pro/start-end-to-video-fast Professional-grade start-end interpolation at turbo speed. Combines Q2-Pro's reinforced temporal coherence with drastically reduced generation time for rapid production workflows.
• vidu/start-end-to-video-q2-pro A professional-grade model focused on reinforced temporal coherence and precise motion control. Generates stable intermediate frames while closely aligning with user-specified start and end constraints.
• vidu/start-end-to-video-q2-turbo A high-speed variant optimized for rapid iteration and preview. Preserves core coherence and subject integrity while significantly reducing generation latency.
• vidu/start-end-to-video-q1 Enhances narrative continuity and motion smoothness, producing more natural easing between poses, camera positions, and scene states.
• vidu/start-end-to-video-2.0 Synthesizes smooth motion between user-defined start and end frames while respecting overall scene geometry and layout. Ideal for transitions, reveals, and structured motion design.
• vidu/start-end-to-video A compact baseline model for simple start–end interpolation and quick previews. Suitable for basic transitions, animatics, and fast storyboard development.
Image Models
• vidu/text-to-image-q2 High-resolution cinematic text-to-image model for generating polished hero shots, thumbnails, and key visuals directly from prompts.
• vidu/reference-to-image-q2 A reference-guided image generator that takes up to seven input images plus a prompt and creates new, high-resolution shots that preserve subject identity and composition.
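The seven-image limit for vidu/reference-to-image-q2 can be checked locally before a job is submitted. The following is a minimal sketch in Python. The function name, the payload keys (model, prompt, images), and the requirement of at least one image are assumptions made for illustration; they are not taken from any documented Vidu API.

```python
MAX_REFERENCE_IMAGES = 7  # upper limit stated in the model description


def build_reference_request(prompt, image_paths):
    """Build a hypothetical request payload for vidu/reference-to-image-q2.

    The payload shape is illustrative only; adapt the keys to the real API.
    """
    if not prompt or not prompt.strip():
        raise ValueError("a non-empty prompt is required")

    count = len(image_paths)
    # Assumption: at least one reference image is needed; the source only
    # states the upper bound of seven.
    if not 1 <= count <= MAX_REFERENCE_IMAGES:
        raise ValueError(
            f"expected 1 to {MAX_REFERENCE_IMAGES} reference images, got {count}"
        )

    return {
        "model": "vidu/reference-to-image-q2",
        "prompt": prompt.strip(),
        "images": list(image_paths),
    }
```

Failing early on an oversized image list avoids sending a request that the model cannot use.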
Special Models
• vidu/one-click-v2/mv Vidu One-Click V2 MV transforms images and audio into videos with camera movements and subtitle support, producing dynamic shots and text overlays in one click.
• vidu/template/halloween A themed template model that applies pre-designed Halloween templates to generate stylized seasonal videos with minimal effort.
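For scripting against this catalog, the model identifiers can be kept in a small lookup table grouped by task. The sketch below is one possible arrangement in Python. The identifiers are copied from the lists above, while the group keys and the helper names models_for and task_of are hypothetical and chosen only for illustration.

```python
# Model IDs copied from the catalog; the group keys are illustrative names.
VIDU_MODELS = {
    "image-to-video": [
        "vidu/q3/image-to-video",
        "vidu/q2-pro/image-to-video-fast",
        "vidu/image-to-video-q2-pro",
        "vidu/image-to-video-q2-turbo",
        "vidu/image-to-video-q1",
        "vidu/image-to-video-2.0",
        "vidu/image-to-video",
    ],
    "text-to-video": [
        "vidu/q3/text-to-video",
        "vidu/text-to-video-q2",
        "vidu/text-to-video-q1",
        "vidu/text-to-video-2.0",
        "vidu/text-to-video",
    ],
    "reference-to-video": [
        "vidu/reference-to-video-q2",
        "vidu/reference-to-video-q1",
        "vidu/reference-to-video-2.0",
    ],
    "start-end-to-video": [
        "vidu/q2-pro/start-end-to-video-fast",
        "vidu/start-end-to-video-q2-pro",
        "vidu/start-end-to-video-q2-turbo",
        "vidu/start-end-to-video-q1",
        "vidu/start-end-to-video-2.0",
        "vidu/start-end-to-video",
    ],
    "image": [
        "vidu/text-to-image-q2",
        "vidu/reference-to-image-q2",
    ],
    "special": [
        "vidu/one-click-v2/mv",
        "vidu/template/halloween",
    ],
}


def models_for(task):
    """Return the model IDs listed for a task, in catalog order."""
    if task not in VIDU_MODELS:
        raise ValueError(
            f"unknown task {task!r}; expected one of {sorted(VIDU_MODELS)}"
        )
    return list(VIDU_MODELS[task])


def task_of(model_id):
    """Return the task group that a model ID belongs to."""
    for task, ids in VIDU_MODELS.items():
        if model_id in ids:
            return task
    raise ValueError(f"unknown model id {model_id!r}")
```

Keeping the IDs as exact strings matters here because the catalog uses two naming patterns (for example vidu/q3/image-to-video and vidu/image-to-video-q2-pro), so deriving one from the other by string manipulation is unreliable.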
