image-to-video

vidu/start-end-to-video-q2-pro

Vidu Q2 Pro Start-End to Video generates smooth transition videos between specified start and end images.

Your request will cost $0.15 per run.

For $10 you can run this model approximately 66 times.

README

Vidu Q2 Pro

Generate a coherent shot from just two images: a start frame and an end frame. Vidu Q2 Pro infers natural, object-aware motion between them, making it perfect for scene transitions, bridging shots, and visual storytelling. No local setup required.

Why it looks great

  • Bi-frame guidance: uses both start and end frames to anchor identity, layout, and lighting for the whole clip.
  • Temporal continuity: minimizes flicker and “popping,” maintaining subject integrity across frames.
  • Object- & human-aware motion: preserves faces, hands, and fine details while animating clothes, hair, and props.
  • Layout-smart interpolation: respects foreground/background depth, occlusions, and parallax.
  • Camera-path estimation: simulates subtle pans, dolly moves, and push-ins without warping.
  • Natural look: balances crisp detail with cinematic smoothness—no plastic over-processing.

Use Cases

  • Storyboarding & concept animation: bring static boards to life between key beats.
  • Scene interpolation in long-form content: seamless bridges between shots.
  • Instructional visual sequences: demonstrate change-over-time (before → after) with smooth motion.
  • Film previsualization: explore transitions, blocking, and camera moves early.

How to Use

  1. Upload your start frame and end frame.
  2. Write your prompt.
  3. Pick a duration (e.g., 5–8 s).
  4. Set the resolution (720p or 1080p).
  5. (Optional) Choose whether to add background music (BGM).
  6. (Optional) Set movement_amplitude to control how much motion appears in the generated video.
  7. Submit the job. Set a seed if you want to reproduce the result later.
  8. Preview the result and download the final video.
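
For scripted use, the steps above can be sketched as a small payload builder. This is only an illustration: `movement_amplitude` and `seed` are named on this page, but every other field name (`start_image`, `end_image`, `prompt`, `duration`, `resolution`, `bgm`), the `build_request` helper, and the 5 to 8 second bound are assumptions, not the documented API. No endpoint is called; the function only validates inputs and returns a dictionary.

```python
from typing import Any, Dict, Optional

# Assumption: the two resolutions listed in step 4 are the only valid ones.
ALLOWED_RESOLUTIONS = {"720p", "1080p"}
# Assumption: the example range from step 3 (5 to 8 seconds) is treated as the bound.
MIN_DURATION_S, MAX_DURATION_S = 5, 8


def build_request(
    start_image: str,
    end_image: str,
    prompt: str,
    duration: int = 5,
    resolution: str = "720p",
    bgm: bool = False,
    movement_amplitude: Optional[str] = None,
    seed: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble a hypothetical request body for a start-end-to-video job."""
    if not start_image or not end_image:
        raise ValueError("both a start frame and an end frame are required")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if not MIN_DURATION_S <= duration <= MAX_DURATION_S:
        raise ValueError(f"duration must be {MIN_DURATION_S}-{MAX_DURATION_S} s")

    payload: Dict[str, Any] = {
        "start_image": start_image,  # hypothetical field name
        "end_image": end_image,      # hypothetical field name
        "prompt": prompt,
        "duration": duration,
        "resolution": resolution,
        "bgm": bgm,
    }
    # Optional settings are only sent when the caller sets them.
    if movement_amplitude is not None:
        payload["movement_amplitude"] = movement_amplitude
    if seed is not None:
        payload["seed"] = seed
    return payload


if __name__ == "__main__":
    print(build_request("start.png", "end.png", "slow push-in on the subject",
                        duration=8, resolution="1080p", seed=42))
```

Leaving optional fields out of the payload unless they are set keeps the service defaults in effect, and passing the same seed is what makes a later rerun comparable.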

Price

Duration  Resolution  Total ($)  Implied $/s  Equivalent per 5s ($)
5s        720p        0.15       0.03000      0.1500
5s        1080p       0.35       0.07000      0.3500
8s        720p        0.25       0.03125      0.1563
8s        1080p       0.50       0.06250      0.3125
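
The derived columns follow directly from the totals: the implied rate is the total divided by the duration, and the 5 s equivalent is that rate times five. The sketch below reproduces this arithmetic and also estimates how many whole runs fit in a budget. The `PRICES` dictionary and the helper names are illustrative, not part of any API; the prices are copied from the table above.

```python
from decimal import Decimal, ROUND_HALF_UP

# Totals in USD, keyed by (duration in seconds, resolution), copied from the price table.
PRICES = {
    (5, "720p"): Decimal("0.15"),
    (5, "1080p"): Decimal("0.35"),
    (8, "720p"): Decimal("0.25"),
    (8, "1080p"): Decimal("0.50"),
}


def implied_per_second(duration: int, resolution: str) -> Decimal:
    """Total price divided by clip length in seconds."""
    return PRICES[(duration, resolution)] / Decimal(duration)


def equivalent_per_5s(duration: int, resolution: str) -> Decimal:
    """Per-second rate scaled to a 5 s clip, rounded half-up to 4 decimals."""
    value = implied_per_second(duration, resolution) * 5
    return value.quantize(Decimal("0.0001"), rounding=ROUND_HALF_UP)


def runs_for_budget(budget_usd: str, duration: int, resolution: str) -> int:
    """Number of whole runs that a budget covers at the listed price."""
    return int(Decimal(budget_usd) // PRICES[(duration, resolution)])


if __name__ == "__main__":
    for (duration, resolution), total in PRICES.items():
        print(duration, resolution, total,
              implied_per_second(duration, resolution),
              equivalent_per_5s(duration, resolution))
    print(runs_for_budget("10", 5, "720p"))  # whole 5 s 720p runs for $10
```

`Decimal` is used instead of floats so that prices such as 0.15 divide exactly. Half-up rounding matches the table's 0.1563 for the 8 s 720p row.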

Accelerated Inference

Accelerated inference on this model uses optimization technology from WaveSpeedAI to reduce computational overhead and latency, so clips generate quickly without compromising quality. The system is tuned for large-scale workloads while keeping real-time use cases responsive and reliable. For implementation details, see our engineering blog post.

Notes

  • Actual processing time depends on resolution, duration, motion settings, and current queue.
  • For highly dynamic changes (big pose or layout jumps), consider using a shorter duration or adding intermediate key frames to guide the motion.
  • Ensure you have rights to any images you upload; outputs inherit input content constraints.