Vidu Q2 Reference To Video

vidu/reference-to-video-q2

Vidu Q2 is an Image-to-Video and Reference-to-Video model that emphasizes subtle facial expressions and smooth push-pull camera moves for natural motion. It is available through a ready-to-use REST inference API with strong performance, no cold starts, and affordable per-run pricing.

Your request will cost $0.25 per run.

For $10 you can run this model approximately 40 times.

README

Vidu Q2 — Reference-to-Video Model

Vidu Q2 is Shengshu Technology’s new-generation reference-to-video model designed to transform one or multiple input images into expressive, cinematic videos. It excels at producing subtle facial motion, natural body dynamics, and camera-aware storytelling with a strong sense of realism.

🎬 What It Does

Vidu Q2 synthesizes short videos from one or several reference images guided by a text prompt. It’s ideal for turning still portraits or concept images into smooth motion clips — suitable for both creative storytelling and professional visual production.
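
A minimal sketch of what a request might look like over the hosting platform's REST API. This is an assumption-heavy example: the base URL, route, and authentication header are placeholders, and only the model ID and parameter names come from this page. Check the platform's API reference for the real values.

```python
# Minimal sketch of a request to vidu/reference-to-video-q2 over the platform's
# REST API. The base URL, route, and auth header below are placeholders -- only
# the model ID and parameter names come from this page.
import requests

API_BASE = "https://api.example.com/v1"   # placeholder host
API_KEY = "YOUR_API_KEY"                  # placeholder credential

payload = {
    "model": "vidu/reference-to-video-q2",
    "prompt": "Slow push-in on a portrait as she looks up and smiles softly",
    "images": ["https://example.com/reference.png"],  # 1-7 reference images
    "aspect_ratio": "16:9",
    "resolution": "720p",
    "duration": 5,                        # seconds
    "movement_amplitude": "auto",
}

resp = requests.post(
    f"{API_BASE}/video/generations",      # placeholder route
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    timeout=60,
)
resp.raise_for_status()
print(resp.json())  # typically a job ID or the finished video URL
```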

✨ Key Features

  • Smooth motion realism: Subtle micro-expressions, eye movements, and breathing motions are reproduced authentically.
  • Cinematic camera dynamics: Built-in control of push/pull, pan, tilt, and zoom effects for scene depth and emotional tone.
  • Multiple-image reference support: Upload up to 7 reference images to guide pose, lighting, or perspective transitions.
  • Flexible composition: Choose from aspect ratios (16:9, 9:16, 4:3, 3:4, 1:1) for any platform.
  • Motion amplitude control: Select auto / small / medium / large to define the strength and style of movement.
  • High-fidelity output: Consistent lighting, identity preservation, and accurate reference adherence even across complex motions.

🧩 Designed For

  • Filmmakers & Storytellers: Bring still characters or concept art to life with controlled, cinematic motion.
  • Advertising Creators: Generate short motion ads with precise control over composition and intensity.
  • Artists & Illustrators: Animate hand-drawn or AI-generated portraits into dynamic living forms.
  • Game & Animation Studios: Prototype visual narratives quickly using character or environment references.

⚙️ Parameters

Parameter            Description
prompt               Describe the scene, action, or mood.
images               Upload up to 7 reference images.
aspect_ratio         Choose between 16:9, 9:16, 4:3, 3:4, 1:1.
resolution           360p / 540p / 720p / 1080p.
movement_amplitude   auto / small / medium / large (defines motion intensity).
duration             Up to 10 seconds.
seed                 Optional, for reproducible results.
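
For illustration, here is a request body that exercises every parameter above. The field names follow the table; the values (prompt, URLs, seed) are only examples.

```python
# Illustrative request body exercising every parameter listed above.
# Field names follow the parameter table; the values are only examples.
payload = {
    "prompt": "Slow push-in on the subject as she smiles and glances left",
    "images": [                         # up to 7 reference images
        "https://example.com/ref-front.png",
        "https://example.com/ref-profile.png",
    ],
    "aspect_ratio": "9:16",             # 16:9, 9:16, 4:3, 3:4, or 1:1
    "resolution": "1080p",              # 360p, 540p, 720p, or 1080p
    "movement_amplitude": "small",      # auto, small, medium, or large
    "duration": 8,                      # seconds, up to 10
    "seed": 42,                         # optional, for reproducible results
}
```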

💰 Pricing

Price per run (USD) by resolution and duration:

Duration   540p     720p     1080p
1s         $0.075   $0.125   $0.375
2s         $0.10    $0.15    $0.425
3s         $0.125   $0.175   $0.475
4s         $0.15    $0.20    $0.525
5s         $0.175   $0.225   $0.575
6s         $0.20    $0.25    $0.625
7s         $0.225   $0.275   $0.675
8s         $0.25    $0.30    $0.725
9s         $0.35    $0.40    $0.825
10s        $0.45    $0.50    $0.925
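
The table translates directly into a small cost-estimate helper. The `PRICING` dictionary and `estimate_cost` function below are illustrative, not part of the API; 360p pricing is not listed above and is therefore omitted.

```python
# Quick cost estimator built from the pricing table above (USD per run).
PRICING = {
    "540p":  {1: 0.075, 2: 0.10, 3: 0.125, 4: 0.15, 5: 0.175,
              6: 0.20,  7: 0.225, 8: 0.25, 9: 0.35, 10: 0.45},
    "720p":  {1: 0.125, 2: 0.15, 3: 0.175, 4: 0.20, 5: 0.225,
              6: 0.25,  7: 0.275, 8: 0.30, 9: 0.40, 10: 0.50},
    "1080p": {1: 0.375, 2: 0.425, 3: 0.475, 4: 0.525, 5: 0.575,
              6: 0.625, 7: 0.675, 8: 0.725, 9: 0.825, 10: 0.925},
}

def estimate_cost(resolution: str, duration_s: int, runs: int = 1) -> float:
    """Return the total USD cost for `runs` generations at the given settings."""
    return round(PRICING[resolution][duration_s] * runs, 2)  # rounded to cents

# Example: ten 5-second clips at 720p cost $2.25 in total.
print(estimate_cost("720p", 5, runs=10))
```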

🧠 Tips for Best Results

  • Use consistent lighting and angles among reference images for smoother transitions.
  • Write prompts that define camera motion, emotion, or scene tone clearly.
  • “auto” movement amplitude works best for portrait-style animation; use “medium” or “large” for full-body or action scenes.
  • For cinematic looks, pair 16:9 with 1080p and descriptive atmosphere prompts (e.g., “soft sunlight flickering through leaves”).
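
As a sketch, the last two tips combined into one example configuration; parameter names come from the table above, and the values are simply chosen to match the tips.

```python
# Example settings following the tips above: cinematic 16:9 at 1080p with a
# descriptive atmosphere prompt and "auto" amplitude for a portrait subject.
cinematic_settings = {
    "prompt": (
        "Close-up portrait, soft sunlight flickering through leaves, "
        "gentle push-in as she slowly looks up and smiles"
    ),
    "aspect_ratio": "16:9",
    "resolution": "1080p",
    "movement_amplitude": "auto",   # use "medium" or "large" for full-body or action scenes
    "duration": 10,
}
```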

📎 Note

  • If you provide images by URL instead of uploading them locally, make sure the URLs are publicly accessible. Successfully loaded images will display as thumbnails in the interface.
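
A quick, optional pre-flight check along these lines can catch unreachable URLs before you spend a run. It assumes the image hosts respond to HEAD requests; fall back to a GET if they do not.

```python
# Simple pre-flight check that reference image URLs are publicly reachable
# before submitting a job (a HEAD request should return HTTP 200).
import requests

def is_publicly_accessible(url: str) -> bool:
    try:
        resp = requests.head(url, allow_redirects=True, timeout=10)
        return resp.status_code == 200
    except requests.RequestException:
        return False

image_urls = ["https://example.com/ref-front.png"]  # example URLs
unreachable = [u for u in image_urls if not is_publicly_accessible(u)]
if unreachable:
    print("These image URLs are not publicly accessible:", unreachable)
```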