Vidu Image To Video Q2 Pro
Vidu Q2 Pro Image to Video generates smooth transition videos between specified images.
Features
Vidu Q2 Pro
Generate a coherent shot from just two images: a start frame and an end frame. Vidu Q2 Pro infers natural, object-aware motion between them, making it perfect for scene transitions, bridging shots, and visual storytelling. No local setup required.
Why it looks great
- Bi-frame guidance: uses both start and end frames to anchor identity, layout, and lighting for the whole clip.
- Temporal continuity: minimizes flicker and “popping,” maintaining subject integrity across frames.
- Object- & human-aware motion: preserves faces, hands, and fine details while animating clothes, hair, and props.
- Layout-smart interpolation: respects foreground/background depth, occlusions, and parallax.
- Camera-path estimation: simulates subtle pans, dolly moves, and push-ins without warping.
- Natural look: balances crisp detail with cinematic smoothness—no plastic over-processing.
Use Cases
- Storyboarding & concept animation: bring static boards to life between key beats.
- Scene interpolation in long-form content: seamless bridges between shots.
- Instructional visual sequences: demonstrate change-over-time (before → after) with smooth motion.
- Film previsualization: explore transitions, blocking, and camera moves early.
How to Use
- Upload your start frame and end frame.
- Write your prompt.
- Pick a duration (1–8 s).
- Set the resolution (540p, 720p, or 1080p).
- (Optional) Choose whether to add background music (bgm).
- (Optional) Set movement_amplitude to control how much motion appears in the generated video.
- Submit the job. Set a fixed seed if you want to reproduce the result later.
- Preview the result and download the final video.
Price Example
| Duration (s) | 540p (USD) | 720p (USD) | 1080p (USD) |
|---|---|---|---|
| 1 | 0.04 | 0.075 | 0.275 |
| 2 | 0.05 | 0.125 | 0.35 |
| 3 | 0.075 | 0.175 | 0.425 |
| 4 | 0.1 | 0.225 | 0.5 |
| 5 | 0.125 | 0.275 | 0.575 |
| 6 | 0.15 | 0.325 | 0.65 |
| 7 | 0.175 | 0.375 | 0.725 |
| 8 | 0.2 | 0.425 | 0.8 |
Prices vary with duration and resolution. The amount actually deducted from your account is authoritative.
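To budget a batch of clips, the price table above can be encoded as a lookup. The following is a minimal sketch: the values are copied from the table, while the `estimate_price` helper and its name are hypothetical and not part of the WaveSpeedAI API. The amount actually deducted remains authoritative.

```python
from decimal import Decimal

# Listed prices in USD, copied from the table above.
# Each list is indexed by duration: position 0 is 1 s, position 7 is 8 s.
PRICES_USD = {
    "540p": ["0.04", "0.05", "0.075", "0.1", "0.125", "0.15", "0.175", "0.2"],
    "720p": ["0.075", "0.125", "0.175", "0.225", "0.275", "0.325", "0.375", "0.425"],
    "1080p": ["0.275", "0.35", "0.425", "0.5", "0.575", "0.65", "0.725", "0.8"],
}


def estimate_price(duration: int, resolution: str) -> Decimal:
    """Return the listed price for one clip (hypothetical helper)."""
    if resolution not in PRICES_USD:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if isinstance(duration, bool) or not isinstance(duration, int):
        raise ValueError("duration must be an integer number of seconds")
    if not 1 <= duration <= 8:
        raise ValueError("duration must be between 1 and 8 seconds")
    # Decimal avoids float rounding noise when summing many clips.
    return Decimal(PRICES_USD[resolution][duration - 1])
```

For example, `estimate_price(5, "720p") * 10` gives the listed cost of ten 5-second 720p clips.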
Accelerated Inference
Our accelerated inference approach leverages advanced optimization technology from WaveSpeedAI. It reduces computational overhead and latency, enabling rapid generation without compromising quality. The system is tuned for large-scale workloads while keeping real-time use cases snappy and reliable. For implementation details, see our engineering blog post.
Notes
- Actual processing time depends on resolution, duration, motion settings, and current queue.
- For highly dynamic changes (big pose/layout jumps), consider shorter durations or add intermediate key frames to guide motion.
- Ensure you have rights to any images you upload; outputs inherit input content constraints.
Authentication
For authentication details, please refer to the Authentication Guide.
API Endpoints
Submit Task & Query Result
# Submit the task (prompt and image are required; replace the placeholders)
curl --location --request POST "https://api.wavespeed.ai/api/v3/vidu/image-to-video-q2-pro" \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
  --data-raw '{
    "prompt": "<your prompt>",
    "image": "<your start image>",
    "duration": 5,
    "resolution": "720p",
    "bgm": true,
    "movement_amplitude": "auto",
    "seed": -1
  }'
# Get the result
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
  --header "Authorization: Bearer ${WAVESPEED_API_KEY}"
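The same submit-then-query flow can be sketched in Python using only the standard library. The two URLs and the headers come from the curl commands above. Everything else is an assumption made for illustration: the helper names, the 2-second polling interval, the poll limit, and the idea that `data.id` from the submit response is the value to use as `requestId`.

```python
import json
import time
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"


def build_submit_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build the POST request that submits a task."""
    return urllib.request.Request(
        f"{API_BASE}/vidu/image-to-video-q2-pro",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def build_result_request(api_key: str, request_id: str) -> urllib.request.Request:
    """Build the GET request that queries a task's result."""
    return urllib.request.Request(
        f"{API_BASE}/predictions/{request_id}/result",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def send(request: urllib.request.Request) -> dict:
    """Send a request and decode the JSON body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_result(api_key, request_id, fetch=send,
                    interval=2.0, max_polls=150, sleep=time.sleep):
    """Poll until the task is completed or failed; return the `data` object.

    `interval` and `max_polls` are arbitrary illustrative defaults.
    """
    for _ in range(max_polls):
        data = fetch(build_result_request(api_key, request_id))["data"]
        if data["status"] in ("completed", "failed"):
            return data
        sleep(interval)
    raise TimeoutError(f"task {request_id} did not finish after {max_polls} polls")
```

A typical call sequence would be `task = send(build_submit_request(key, payload))` followed by `wait_for_result(key, task["data"]["id"])`. The `fetch` and `sleep` parameters exist so the loop can be exercised without network access.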
Parameters
Task Submission Parameters
Request Parameters
| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
| prompt | string | Yes | - | - | The positive prompt for the generation. |
| image | string | Yes | - | - | The start image for generating the output. |
| duration | integer | No | 5 | 1, 2, 3, 4, 5, 6, 7, 8 | The duration of the generated media in seconds. |
| resolution | string | No | 720p | 540p, 720p, 1080p | Video resolution. |
| bgm | boolean | No | true | - | Whether to add background music to the generated video. |
| movement_amplitude | string | No | auto | auto, small, medium, large | The movement amplitude of objects in the frame. |
| seed | integer | No | -1 | -1 ~ 2147483647 | The random seed to use for the generation. -1 means a random seed will be used. |
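Validating a request locally before submitting can catch out-of-range values early. The sketch below mirrors the defaults and ranges in the table above. The `build_payload` function itself is a hypothetical helper, and the server may enforce rules beyond those listed here.

```python
ALLOWED_RESOLUTIONS = ("540p", "720p", "1080p")
ALLOWED_AMPLITUDES = ("auto", "small", "medium", "large")
MAX_SEED = 2147483647


def _is_int(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def build_payload(prompt, image, duration=5, resolution="720p",
                  bgm=True, movement_amplitude="auto", seed=-1) -> dict:
    """Return a request body, raising ValueError on out-of-range input."""
    if not isinstance(prompt, str) or not prompt:
        raise ValueError("prompt is required")
    if not isinstance(image, str) or not image:
        raise ValueError("image is required")
    if not _is_int(duration) or not 1 <= duration <= 8:
        raise ValueError("duration must be an integer from 1 to 8")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {ALLOWED_RESOLUTIONS}")
    if not isinstance(bgm, bool):
        raise ValueError("bgm must be a boolean")
    if movement_amplitude not in ALLOWED_AMPLITUDES:
        raise ValueError(f"movement_amplitude must be one of {ALLOWED_AMPLITUDES}")
    if not _is_int(seed) or not -1 <= seed <= MAX_SEED:
        raise ValueError(f"seed must be an integer from -1 to {MAX_SEED}")
    return {
        "prompt": prompt,
        "image": image,
        "duration": duration,
        "resolution": resolution,
        "bgm": bgm,
        "movement_amplitude": movement_amplitude,
        "seed": seed,
    }
```

Passing a fixed non-negative `seed` instead of the default `-1` is what makes a run reproducible.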
Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data.id | string | Unique identifier for the prediction (task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
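A response with the fields above can be reduced to the generated URLs plus their NSFW flags. This is a minimal sketch: the field names follow the table, while `PredictionFailed` and `extract_outputs` are hypothetical names, and pairing each URL with the flag at the same index is an assumption based on the "for each output" wording.

```python
class PredictionFailed(RuntimeError):
    """Raised when data.status is "failed" (hypothetical exception type)."""


def extract_outputs(response: dict) -> list:
    """Return (url, nsfw_flag) pairs; empty unless the task has completed."""
    data = response["data"]
    status = data["status"]
    if status == "failed":
        raise PredictionFailed(data.get("error") or "unknown error")
    if status != "completed":
        # Still created or processing: outputs are empty at this point.
        return []
    outputs = data.get("outputs") or []
    flags = list(data.get("has_nsfw_contents") or [])
    # Pad with None so every URL gets a pair even if a flag is missing.
    flags += [None] * (len(outputs) - len(flags))
    return list(zip(outputs, flags))
```

Callers can then keep only the URLs whose flag is `False`, or treat a `None` flag as unknown.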