Vidu Reference To Video 2.0

Vidu Reference-to-Video 2.0 turns reference images into videos that preserve characters, objects, and environments through Multi-Entity Consistency. The model is served through a ready-to-use REST inference API with no cold starts.

Features

Vidu Reference-to-Video 2.0 — vidu/reference-to-video-2.0

Vidu Reference-to-Video 2.0 generates a short video from a text prompt, using up to three reference images to guide subject identity, style, and scene consistency. Upload one to three reference images, describe the action and camera intent in the prompt, and the model synthesizes a coherent clip that follows your references. Adjust movement intensity with movement_amplitude, and fix seed for repeatable results.

Key capabilities

  • Prompt-driven video generation guided by reference images
  • Supports multiple reference images to keep identity/style consistent
  • Movement amplitude control: auto / small / medium / large
  • Seed control for reproducible generations
  • Good for “merge two references into one scene” style storytelling

Use cases

  • Character + scene blending (e.g., a person from one reference enters a room from another)
  • Style-consistent short clips based on an artwork reference
  • Multi-reference continuity across a mini story sequence
  • Product storytelling using a reference setup and a subject reference
  • Quick concept videos for ads, trailers, and social

Pricing

| Duration | Price per video |
| --- | --- |
| 5s | $0.20 |
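
For budgeting a batch of clips, the flat per-video price above can be turned into a quick estimate. This is a minimal sketch: the function name `estimate_cost` is hypothetical, and it assumes every clip is billed at the listed 5s rate with no other charges.

```python
from decimal import Decimal

# Price for one 5s video, taken from the pricing table above.
PRICE_PER_VIDEO_USD = Decimal("0.20")


def estimate_cost(num_videos: int) -> Decimal:
    """Return the estimated cost in USD for num_videos five-second clips."""
    if num_videos < 0:
        raise ValueError("num_videos must be non-negative")
    # Decimal avoids binary floating-point rounding in currency arithmetic.
    return PRICE_PER_VIDEO_USD * num_videos


print(estimate_cost(25))
```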

Inputs

  • images (required): one to three reference images (add one item per image)
  • prompt (required): action + scene + camera direction

Parameters

  • aspect_ratio: output aspect ratio (16:9, 9:16, or 1:1; default 16:9)
  • movement_amplitude: motion intensity (auto, small, medium, large; default auto)
  • seed: random seed, an integer from -1 to 2147483647 (set a fixed number for reproducible results)
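
The inputs and parameters above can be checked on the client before a request is sent. The following is a minimal sketch, not an official client: `build_payload` is a hypothetical helper, and the accepted values (1 to 3 images, the three aspect ratios, the four amplitude levels, the seed range) are taken from the request parameters table in this document.

```python
ASPECT_RATIOS = {"16:9", "9:16", "1:1"}
MOVEMENT_AMPLITUDES = {"auto", "small", "medium", "large"}
MAX_SEED = 2147483647


def build_payload(images, prompt, aspect_ratio="16:9",
                  movement_amplitude="auto", seed=None):
    """Validate the inputs and return a JSON-serializable request body."""
    if not 1 <= len(images) <= 3:
        raise ValueError("images must contain 1 to 3 items")
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ASPECT_RATIOS)}")
    if movement_amplitude not in MOVEMENT_AMPLITUDES:
        raise ValueError(
            f"movement_amplitude must be one of {sorted(MOVEMENT_AMPLITUDES)}"
        )
    payload = {
        "images": list(images),
        "prompt": prompt,
        "aspect_ratio": aspect_ratio,
        "movement_amplitude": movement_amplitude,
    }
    # seed is optional; it is only sent when the caller sets it.
    if seed is not None:
        if not -1 <= seed <= MAX_SEED:
            raise ValueError(f"seed must be between -1 and {MAX_SEED}")
        payload["seed"] = seed
    return payload
```

Fixing `seed` (for example `seed=42`) while keeping the other fields unchanged is how the same request would be repeated for reproducible results.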

Prompting guide (multi-reference)

When you provide multiple references, explicitly assign what each reference is used for:

Template: Use reference image 1 for the room and lighting. Use reference image 2 for the character’s appearance and clothing. The character steps out of the painting into the room, walks to the table, and places the coffee cup down. Smooth motion, consistent style, fixed camera, no flicker.

Example prompts

  • Use reference 1 as the room scene and table setup. Use reference 2 for the girl’s identity and painting style. The girl steps out of the painting into the room, walks to the table, and gently places the coffee cup down. Warm morning light, cinematic, smooth transition.
  • Combine both references into one coherent scene. The character crosses the room, interacts with the cup, subtle cloth movement, soft shadows, realistic contact with the table surface.
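
The template above follows a fixed pattern: one sentence per reference image stating its role, then the action, then style notes. A small helper can assemble prompts in that shape. This is only a sketch; `build_multi_reference_prompt` is a hypothetical name, and the wording it produces is one reasonable phrasing, not a format the model requires.

```python
def build_multi_reference_prompt(roles, action, style_notes=""):
    """Compose a prompt that assigns a role to each reference image.

    roles[i] describes what reference image i + 1 is used for.
    """
    if not roles:
        raise ValueError("at least one reference role is required")
    # One explicit sentence per reference, numbered from 1.
    parts = [
        f"Use reference image {index} for {role}."
        for index, role in enumerate(roles, start=1)
    ]
    parts.append(action.strip())
    if style_notes.strip():
        parts.append(style_notes.strip())
    return " ".join(parts)


prompt = build_multi_reference_prompt(
    roles=["the room and lighting", "the character's appearance and clothing"],
    action="The character walks to the table and places the coffee cup down.",
    style_notes="Smooth motion, consistent style, fixed camera, no flicker.",
)
print(prompt)
```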

Authentication

For authentication details, please refer to the Authentication Guide.

API Endpoints

Submit Task & Query Result


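# Submit the task (images and prompt are required; replace the example values with your own)
curl --location --request POST "https://api.wavespeed.ai/api/v3/vidu/reference-to-video-2.0" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
    "images": [
        "https://example.com/reference-room.png",
        "https://example.com/reference-character.png"
    ],
    "prompt": "Use reference image 1 for the room and lighting. Use reference image 2 for the appearance and clothing of the character. The character walks to the table and places the coffee cup down.",
    "aspect_ratio": "16:9",
    "movement_amplitude": "auto",
    "seed": 0
}'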

# Get the result
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
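
The two curl calls above could be wrapped in Python roughly as follows. This is a minimal sketch using only the standard library. The function names, the polling interval, and the attempt limit are illustrative assumptions rather than part of the WaveSpeedAI API; the endpoint URLs and the `data.status` values come from this document.

```python
import json
import time
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"
MODEL_PATH = "vidu/reference-to-video-2.0"


def build_submit_request(payload, api_key):
    """Build the POST request that submits a generation task."""
    return urllib.request.Request(
        f"{API_BASE}/{MODEL_PATH}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def build_result_request(request_id, api_key):
    """Build the GET request that queries a task result."""
    return urllib.request.Request(
        f"{API_BASE}/predictions/{request_id}/result",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def send(request):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))


def wait_for_result(request_id, api_key, fetch=send,
                    interval=2.0, max_attempts=150):
    """Poll until the task is completed or failed, then return its data object.

    interval and max_attempts are assumed values, not documented limits.
    """
    for _ in range(max_attempts):
        data = fetch(build_result_request(request_id, api_key))["data"]
        if data["status"] in ("completed", "failed"):
            return data
        time.sleep(interval)
    raise TimeoutError(f"task {request_id} did not finish in time")
```

A typical flow would read the key from the `WAVESPEED_API_KEY` environment variable, call `send(build_submit_request(payload, api_key))`, take `data.id` from the response, and pass it to `wait_for_result`. The `fetch` argument exists so the polling loop can be exercised without network access.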

Parameters

Task Submission Parameters

Request Parameters

| Parameter | Type | Required | Default | Range | Description |
| --- | --- | --- | --- | --- | --- |
| images | array | Yes | [] | 1 ~ 3 items | Reference images the model uses to generate a video with consistent subjects. Accepts 1 to 3 images, provided as URLs or Base64-encoded data. Supported formats: PNG, JPEG, JPG, WebP. Each image must be at least 128*128 pixels, have an aspect ratio no more extreme than 1:4 or 4:1, and be no larger than 50MB. For Base64 input, the decoded data must be under 50MB and the string must include an appropriate content type. |
| prompt | string | Yes | - | | The positive prompt for the generation. |
| aspect_ratio | string | No | 16:9 | 16:9, 9:16, 1:1 | The aspect ratio of the generated media. |
| movement_amplitude | string | No | auto | auto, small, medium, large | The movement amplitude of objects in the frame. |
| seed | integer | No | - | -1 ~ 2147483647 | The random seed to use for the generation. |
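
The image constraints in the table above can be checked locally before upload. The sketch below makes several assumptions that should be verified against the service: `check_image` and `to_data_uri` are hypothetical helpers, 50MB is taken as 50 * 1024 * 1024 bytes, the aspect-ratio rule is read as "strictly between 1:4 and 4:1", and the "content type string" for Base64 input is assumed to be a standard `data:` URI prefix.

```python
import base64

ALLOWED_MIME_TYPES = {"image/png", "image/jpeg", "image/webp"}
MAX_BYTES = 50 * 1024 * 1024  # assumption: 1MB = 1024 * 1024 bytes
MIN_SIDE = 128


def check_image(width, height, size_bytes, mime_type):
    """Return a list of constraint violations (empty if the image passes)."""
    problems = []
    if mime_type not in ALLOWED_MIME_TYPES:
        problems.append(f"unsupported format: {mime_type}")
    if width < MIN_SIDE or height < MIN_SIDE:
        problems.append(f"image must be at least {MIN_SIDE}x{MIN_SIDE} pixels")
    # Integer form of 1/4 < width/height < 4; avoids division by zero.
    if not (4 * width > height and 4 * height > width):
        problems.append("aspect ratio must be between 1:4 and 4:1")
    if size_bytes > MAX_BYTES:
        problems.append("image must be no larger than 50MB")
    return problems


def to_data_uri(raw, mime_type):
    """Encode raw image bytes as a Base64 data URI (assumed input format)."""
    encoded = base64.b64encode(raw).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

The caller supplies width, height, and byte size from whatever image library is already in use; the sketch deliberately does not decode image files itself.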

Response Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data.id | string | Unique identifier for the prediction (the task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
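
After submission, the two fields a client usually needs are `data.id` and `data.urls.get`. The sketch below shows one way to pull them out. `parse_submission` is a hypothetical helper, and the sample response is invented for illustration from the field names in the table above; it is not a captured API response.

```python
def parse_submission(response):
    """Return (task_id, result_url) from a task submission response."""
    if response.get("code") != 200:
        raise RuntimeError(
            f"submission failed: {response.get('code')} {response.get('message')}"
        )
    data = response["data"]
    return data["id"], data["urls"]["get"]


# Hypothetical response body, shaped after the response parameters table.
sample_response = {
    "code": 200,
    "message": "success",
    "data": {
        "id": "example-task-id",
        "model": "vidu/reference-to-video-2.0",
        "outputs": [],
        "urls": {
            "get": "https://api.wavespeed.ai/api/v3/predictions/example-task-id/result"
        },
        "status": "created",
        "error": "",
    },
}

task_id, result_url = parse_submission(sample_response)
print(task_id, result_url)
```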

Result Request Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| id | string | Yes | - | Task ID |

Result Response Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data | object | The prediction data object containing all details |
| data.id | string | Unique identifier for the prediction (the ID that was queried) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
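
A result response can be in one of four states, and only `completed` carries output URLs. The following sketch shows one way to branch on `data.status`. `extract_outputs` is a hypothetical helper, and it assumes `data.outputs` is a list of URL strings, as the description in the table states.

```python
def extract_outputs(response):
    """Return output URLs if the task completed, or None if still running.

    Raises RuntimeError when the task has failed.
    """
    data = response["data"]
    status = data["status"]
    if status == "completed":
        return list(data["outputs"])
    if status == "failed":
        raise RuntimeError(data.get("error") or "task failed without a message")
    # "created" and "processing" mean the result is not ready yet.
    return None
```

Returning `None` for unfinished tasks lets a caller keep polling, while a failed task surfaces the `data.error` message as an exception.
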
© 2025 WaveSpeedAI. All rights reserved.