Vidu Reference To Video Q2
Playground
Try it on WavespeedAI! Vidu Q2 is a new image-to-video (and reference-to-video) model that emphasizes subtle facial expressions and smooth push–pull camera moves.
Features
Vidu Q2 — Reference-to-Video Model
Vidu Q2 is Shengshu Technology’s new-generation reference-to-video model designed to transform one or multiple input images into expressive, cinematic videos. It excels at producing subtle facial motion, natural body dynamics, and camera-aware storytelling with a strong sense of realism.
🎬 What It Does
Vidu Q2 synthesizes short videos from one or several reference images guided by a text prompt. It’s ideal for turning still portraits or concept images into smooth motion clips — suitable for both creative storytelling and professional visual production.
✨ Key Features
- Smooth motion realism: Subtle micro-expressions, eye movements, and breathing motions are reproduced authentically.
- Cinematic camera dynamics: Built-in control of push/pull, pan, tilt, and zoom effects for scene depth and emotional tone.
- Multiple-image reference support: Upload up to 7 reference images to guide pose, lighting, or perspective transitions.
- Flexible composition: Choose from aspect ratios (16:9, 9:16, 4:3, 3:4, 1:1) for any platform.
- Motion amplitude control: Select auto / small / medium / large to define the strength and style of movement.
- High-fidelity output: Consistent lighting, identity preservation, and accurate reference adherence even across complex motions.
🧩 Designed For
- Filmmakers & Storytellers: Bring still characters or concept art to life with controlled, cinematic motion.
- Advertising Creators: Generate short motion ads with precise control over composition and intensity.
- Artists & Illustrators: Animate hand-drawn or AI-generated portraits into dynamic living forms.
- Game & Animation Studios: Prototype visual narratives quickly using character or environment references.
⚙️ Parameters
Parameter | Description |
---|---|
prompt | Describe the scene, action, or mood. |
images | Upload up to 7 reference images. |
aspect_ratio | Choose between 16:9, 9:16, 4:3, 3:4, 1:1. |
resolution | 360p / 540p / 720p / 1080p. |
movement_amplitude | auto / small / medium / large (defines motion intensity). |
duration | Up to 8 seconds. |
seed | Optional, for reproducible results. |
💰 Pricing
Resolution | Price per second |
---|---|
360p | $0.003 / s |
540p | $0.006 / s |
720p | $0.013 / s |
1080p | $0.030 / s |
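Since billing is per second, the cost of a clip is simply the resolution's rate multiplied by the duration. A quick sketch using the rates from the table above (a 5-second clip at 720p):

```shell
# Cost = price per second x duration, using the 720p rate from the pricing table.
duration=5     # seconds (1-8 supported)
rate=0.013     # 720p price per second, in USD
awk -v d="$duration" -v r="$rate" 'BEGIN { printf "$%.3f\n", d * r }'
# prints $0.065
```

The same arithmetic applies to any row of the table; an 8-second 1080p clip, for example, costs 8 × $0.030 = $0.24.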
🧠 Tips for Best Results
- Use consistent lighting and angles among reference images for smoother transitions.
- Write prompts that define camera motion, emotion, or scene tone clearly.
- “auto” movement amplitude works best for portrait-style animation; use “medium” or “large” for full-body or action scenes.
- For cinematic looks, pair 16:9 with 1080p and descriptive atmosphere prompts (e.g., “soft sunlight flickering through leaves”).
📎 Note
- If you provide image URLs instead of uploading files locally, make sure the URLs are publicly accessible. Successfully loaded images will display as thumbnails in the interface.
Authentication
For authentication details, please refer to the Authentication Guide.
API Endpoints
Submit Task & Query Result
# Submit the task
curl --location --request POST "https://api.wavespeed.ai/api/v3/vidu/reference-to-video-q2" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
    "prompt": "A portrait slowly turns toward the camera in soft sunlight",
    "images": ["https://example.com/reference.jpg"],
    "aspect_ratio": "16:9",
    "resolution": "720p",
    "duration": 5,
    "movement_amplitude": "auto",
    "seed": 0
}'
# Get the result
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
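Since generation is asynchronous, the usual flow is to submit the task, read `data.id` from the response, and poll the result endpoint until `data.status` reaches `completed` or `failed`. A minimal sketch of that loop is below; `fetch_result` is a hypothetical stand-in for the `curl` GET above (here it returns a canned response so the control flow is runnable offline), and in real use you would replace its body with the actual curl call.

```shell
# Hypothetical stand-in for the curl GET to /predictions/${requestId}/result.
# Swap the echo for the real curl call in production.
fetch_result() {
  echo '{"code":200,"data":{"status":"completed","outputs":["https://example.com/video.mp4"]}}'
}

# Poll until the task leaves the created/processing states.
while :; do
  response=$(fetch_result)
  # Crude status extraction for the sketch; prefer jq in real code.
  status=$(echo "$response" | sed -n 's/.*"status":"\([a-z]*\)".*/\1/p')
  case "$status" in
    completed) echo "done: $response"; break ;;
    failed)    echo "task failed" >&2; exit 1 ;;
    *)         sleep 1 ;;
  esac
done
```

In a real deployment you would also add a retry cap or timeout so a stuck task does not poll forever.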
Parameters
Task Submission Parameters
Request Parameters
Parameter | Type | Required | Default | Range | Description |
---|---|---|---|---|---|
prompt | string | Yes | - | - | The positive prompt for the generation. |
images | array | Yes | [] | - | Reference images for video generation. Requirements: 1. Accept 1-7 images; 2. Images can be URLs or Base64 encoded |
aspect_ratio | string | No | 16:9 | 16:9, 9:16, 4:3, 3:4, 1:1 | The aspect ratio of the generated media. |
resolution | string | No | 720p | 360p, 540p, 720p, 1080p | The resolution of the generated media. |
duration | integer | No | 5 | 1, 2, 3, 4, 5, 6, 7, 8 | The duration of the generated media in seconds. |
movement_amplitude | string | No | auto | auto, small, medium, large | The movement amplitude of objects in the frame. |
seed | integer | No | - | -1 ~ 2147483647 | The random seed to use for the generation. |
Response Parameters
Parameter | Type | Description |
---|---|---|
code | integer | HTTP status code (e.g., 200 for success) |
message | string | Status message (e.g., “success”) |
data.id | string | Unique identifier for the prediction (task ID) |
data.model | string | Model ID used for the prediction |
data.outputs | array | Array of URLs to the generated content (empty when status is not `completed`) |
data.urls | object | Object containing related API endpoints |
data.urls.get | string | URL to retrieve the prediction result |
data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
data.status | string | Status of the task: `created`, `processing`, `completed`, or `failed` |
data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
data.error | string | Error message (empty if no error occurred) |
data.timings | object | Object containing timing details |
data.timings.inference | integer | Inference time in milliseconds |
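Putting the fields above together, a completed task's response body looks roughly like this (all values are illustrative, not real output):

```json
{
  "code": 200,
  "message": "success",
  "data": {
    "id": "req_example_123",
    "model": "vidu/reference-to-video-q2",
    "outputs": ["https://example.com/generated/video.mp4"],
    "urls": {
      "get": "https://api.wavespeed.ai/api/v3/predictions/req_example_123/result"
    },
    "has_nsfw_contents": [false],
    "status": "completed",
    "created_at": "2023-04-01T12:34:56.789Z",
    "error": "",
    "timings": {
      "inference": 12345
    }
  }
}
```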