Vidu AI – Vidu Q3 & Q-Series Video Generator, Vidu AD & Drama Agent models are online!

Special Models

• vidu/one-click-v2/mv

Transforms images and audio into dynamic music videos with camera movement and subtitle support. Designed for fast creation of professional video content in a single workflow.

• vidu/drama

A specialized model designed for short-form drama and narrative content. It is optimized for emotionally expressive scenes, story-driven pacing, character-focused performance, and dramatic visual continuity, making it ideal for mini-series, episodic content, and serialized storytelling.

• vidu/ad

A specialized commercial video generation model built for advertising and promotional content. It is designed for product showcases, brand storytelling, marketing campaigns, and conversion-focused creatives, with an emphasis on polished visuals, product clarity, and production-ready commercial output.

Image-to-Video Models

• vidu/q3/image-to-video

The newest-generation image-to-video model in the Vidu lineup, delivering best-in-class motion quality, strong structural fidelity, and cinematic realism. Ideal for complex scenes, expressive motion, and fine-grained detail preservation.

• vidu/q2-pro/image-to-video-fast

A professional-grade image-to-video model optimized for speed. It combines Q2 Pro's sharp details, stable identity preservation, and polished motion with significantly lower latency for high-volume production workflows.

• vidu/image-to-video-q2-pro

A premium image-to-video model offering sharper visual detail, more stable character identity, and refined cinematic motion. Well suited for polished production assets, hero shots, and client-facing deliverables.

• vidu/image-to-video-q2-turbo

A high-speed image-to-video model built for complex scenes and multi-character compositions. It delivers smooth, coherent motion and strong structure preservation while supporting rapid preview and iteration.

• vidu/image-to-video-q1

A high-fidelity image-to-video model with enhanced texture detail and strong portrait performance. It maintains lighting and identity consistency while generating cinematic motion and expressive character behavior.

• vidu/image-to-video-2.0

Transforms a single image into a smooth, coherent video while preserving composition, structure, and layout. Offers strong temporal stability and natural camera movement for professional editing and post-production pipelines.

• vidu/image-to-video

A lightweight and efficient baseline image-to-video (I2V) model for rapid ideation, early-stage drafts, and social media content. It balances speed, clean motion, and structural consistency.

Text-to-Video Models

• vidu/q3/text-to-video

The most advanced text-to-video model in the Vidu family, delivering superior prompt adherence, richer scene composition, and more natural multi-character interactions for premium storytelling and commercial production.

• vidu/text-to-video-q2

A flagship text-to-video model with stronger temporal coherence, richer scene detail, and more precise camera and motion control. Designed for complex narratives, branded content, and high-end creative use cases.

• vidu/text-to-video-q1

A high-fidelity text-to-video model with richer color, sharper details, and stronger narrative continuity. Ideal for cinematic storytelling, visual branding, and polished marketing content.

• vidu/text-to-video-2.0

Generates videos directly from prompts with reliable prompt adherence, coherent multi-object scenes, and controllable camera movement. A strong choice for conceptual, narrative, and creative video generation.

• vidu/text-to-video

A streamlined baseline text-to-video model optimized for efficiency and turnaround speed. Suitable for ads, explainers, and fast iteration on text-driven concepts.

Reference-to-Video Models

• vidu/reference-to-video-q2

A reference-guided video generation model that supports multiple objects or characters within a single scene, enabling more complex compositions and richer interactions.

• vidu/reference-to-video-q1

An upgraded reference-based generator with sharper details and more faithful style and identity transfer. It reduces drift and artifacts, especially in close-ups and longer shots.

• vidu/reference-to-video-2.0

Creates videos from a reference image while preserving character likeness, visual style, wardrobe consistency, and overall scene coherence across frames.

Start-End Frame Video Models

• vidu/q2-pro/start-end-to-video-fast

A professional-grade start-end interpolation model optimized for speed. It combines reinforced temporal coherence with fast generation for efficient production workflows.

• vidu/start-end-to-video-q2-pro

A high-end model focused on precise motion control and reinforced temporal coherence. It generates stable intermediate frames while closely following user-defined start and end constraints.

• vidu/start-end-to-video-q2-turbo

A fast start-end video model built for rapid iteration and preview. It preserves subject integrity and core visual coherence while reducing generation latency.

• vidu/start-end-to-video-q1

A start-end frame model with improved motion smoothness and narrative continuity. It produces more natural transitions between poses, camera positions, and scene states.

• vidu/start-end-to-video-2.0

Synthesizes smooth motion between user-defined start and end frames while respecting scene geometry and composition. Ideal for transitions, reveals, and structured motion design.

• vidu/start-end-to-video

A compact baseline model for simple start-end interpolation and quick previews. Suitable for basic transitions, animatics, and fast storyboard development.

Image Models

• vidu/text-to-image-q2

A high-resolution cinematic text-to-image model for generating polished hero images, thumbnails, posters, and key visuals directly from prompts.

• vidu/reference-to-image-q2

A reference-guided image generation model that supports up to seven input images plus a text prompt to create new high-resolution visuals while preserving subject identity and composition.
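
The grouping above can be captured as a small lookup table for picking a model ID by category. This is only an illustrative sketch: the category keys (`special`, `image`, and so on) and the helper names `models_for` and `category_of` are assumptions made for this example, not part of any Vidu or WaveSpeedAI API. The model IDs are copied from the lists above in the same order.

```python
from typing import Dict, List

# Model IDs transcribed from the category headings above, in page order.
# The dictionary keys are hypothetical labels chosen for this sketch.
CATALOG: Dict[str, List[str]] = {
    "special": [
        "vidu/one-click-v2/mv",
        "vidu/drama",
        "vidu/ad",
    ],
    "image-to-video": [
        "vidu/q3/image-to-video",
        "vidu/q2-pro/image-to-video-fast",
        "vidu/image-to-video-q2-pro",
        "vidu/image-to-video-q2-turbo",
        "vidu/image-to-video-q1",
        "vidu/image-to-video-2.0",
        "vidu/image-to-video",
    ],
    "text-to-video": [
        "vidu/q3/text-to-video",
        "vidu/text-to-video-q2",
        "vidu/text-to-video-q1",
        "vidu/text-to-video-2.0",
        "vidu/text-to-video",
    ],
    "reference-to-video": [
        "vidu/reference-to-video-q2",
        "vidu/reference-to-video-q1",
        "vidu/reference-to-video-2.0",
    ],
    "start-end-to-video": [
        "vidu/q2-pro/start-end-to-video-fast",
        "vidu/start-end-to-video-q2-pro",
        "vidu/start-end-to-video-q2-turbo",
        "vidu/start-end-to-video-q1",
        "vidu/start-end-to-video-2.0",
        "vidu/start-end-to-video",
    ],
    "image": [
        "vidu/text-to-image-q2",
        "vidu/reference-to-image-q2",
    ],
}


def models_for(category: str) -> List[str]:
    """Return a copy of the model IDs listed under one category."""
    if category not in CATALOG:
        raise ValueError(
            f"unknown category {category!r}; expected one of {sorted(CATALOG)}"
        )
    return list(CATALOG[category])


def category_of(model_id: str) -> str:
    """Return the category a model ID is listed under."""
    for category, model_ids in CATALOG.items():
        if model_id in model_ids:
            return category
    raise ValueError(f"model ID not in catalog: {model_id!r}")


if __name__ == "__main__":
    print(models_for("reference-to-video"))
    print(category_of("vidu/text-to-image-q2"))
```

Keeping the IDs in a plain dictionary means the table can be updated by hand when the lineup changes, without touching the lookup helpers.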

Vidu Models

All models

vidu/q3-ad

vidu/q3/drama-clip

vidu/q3/drama

vidu/q3/image-to-video

vidu/q3/text-to-video

vidu/q3-turbo/image-to-video

vidu/q3-pro/image-to-video

vidu/q3/image-to-video-pro

vidu/q3/reference-to-video

vidu/q3/start-end-to-video

vidu/q3-turbo/start-end-to-video

vidu/q3-pro/start-end-to-video

vidu/q3-pro/text-to-video

vidu/image-to-video-2.0

vidu/reference-to-video-2.0

vidu/start-end-to-video-2.0

vidu/image-to-video

vidu/text-to-video

vidu/start-end-to-video

vidu/image-to-video-q2-pro

vidu/image-to-video-q2-turbo

vidu/text-to-video-2.0

vidu/one-click-v2/mv

vidu/q2-pro/image-to-video-fast

vidu/q2-pro/start-end-to-video-fast

vidu/reference-to-image-q2

vidu/text-to-image-q2

vidu/text-to-video-q2

vidu/template/halloween

vidu/reference-to-video-q2

vidu/q2-turbo/extend-video

vidu/q2-pro/extend-video

vidu/start-end-to-video-q2-turbo

vidu/start-end-to-video-q2-pro

vidu/text-to-video-q1

vidu/image-to-video-q1

vidu/start-end-to-video-q1

vidu/reference-to-video-q1
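
The IDs in this list place the version tag in two positions: as a path segment (`vidu/q3/image-to-video`) or as a suffix (`vidu/image-to-video-q2-pro`, `vidu/image-to-video-2.0`). A minimal sketch of extracting that tag so the list can be grouped is shown below. The pattern is inferred only from the IDs shown on this page and is not a documented naming rule; the function names and the `"unversioned"` label are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Optional

# Matches a "q<digits>" or "<digits>.<digits>" token that is delimited by
# "/" or "-" (or the string boundaries). Inferred from the IDs listed above.
_VERSION_TAG = re.compile(r"(?:^|[/-])(q\d+|\d+\.\d+)(?=$|[/-])")


def version_tag(model_id: str) -> Optional[str]:
    """Return the version tag found in a model ID, or None if there is none."""
    match = _VERSION_TAG.search(model_id)
    return match.group(1) if match else None


def group_by_version(model_ids: Iterable[str]) -> Dict[str, List[str]]:
    """Group model IDs by version tag; IDs without one go under 'unversioned'."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for model_id in model_ids:
        groups[version_tag(model_id) or "unversioned"].append(model_id)
    return dict(groups)


if __name__ == "__main__":
    sample = [
        "vidu/q3/image-to-video",
        "vidu/q3-turbo/image-to-video",
        "vidu/image-to-video-q2-pro",
        "vidu/image-to-video-2.0",
        "vidu/image-to-video",
        "vidu/one-click-v2/mv",
    ]
    for tag, ids in group_by_version(sample).items():
        print(tag, ids)
```

IDs such as `vidu/one-click-v2/mv` and `vidu/template/halloween` carry no `q`-style or dotted tag, so this sketch files them under `"unversioned"` rather than guessing.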

Vidu Models API — pricing & performance

Why run Vidu Models on WaveSpeedAI

• Transparent pricing

• Optimized for low latency

• 99.9% uptime
