
WAN 2.6 Reference-to-Video
WAN 2.6 Reference-to-Video is Alibaba's WanXiang 2.6 model for turning example videos and a text prompt into new shots. Provide up to two reference clips; the model learns their style, motion, and framing, then generates a new 5–10 s video at up to 1080p.
Output format: MP4 video at the selected size and duration.
- `prompt` (required): Text description of the new scene: characters, actions, environment, camera motion, mood, style, and so on.
- `videos` (required): 1–2 reference clips (URLs or uploads). These guide style, camera work, pacing, and motion structure.
- `negative_prompt`: Things to avoid, e.g. watermark, text, distortion, extra limbs.
- `audio` (optional): External audio track for advanced pipelines where timing should loosely follow a given soundtrack. For most use cases, leave this empty.
- `size`: One of the resolution presets listed in the pricing table below.
- `duration`: Video length: 5 s or 10 s.
- `shot_type`: Controls whether the output is a single continuous shot or a multi-shot sequence.
- `enable_prompt_expansion`: If enabled, Alibaba's prompt optimizer expands short prompts into a richer internal script before generation.
- `seed`: Random seed. Set -1 for a new random result each time, or fix a specific integer for reproducible layout and motion.
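Assembled into a request body, the parameters above might look like the following sketch. The JSON field names simply mirror the parameter list; the exact schema, the `shot_type` values, and the placeholder URLs are assumptions, not documented behavior.

```python
import json

# Hypothetical request payload; field names follow the parameter list above.
payload = {
    "prompt": "A lone astronaut walks through a neon-lit market, "
              "slow dolly-in, moody cinematic lighting",
    "videos": [
        "https://example.com/reference-a.mp4",  # placeholder URLs, 1-2 clips
        "https://example.com/reference-b.mp4",
    ],
    "negative_prompt": "watermark, text, distortion, extra limbs",
    "size": "1280x720",            # one of the presets in the pricing table
    "duration": 5,                 # 5 or 10 seconds
    "shot_type": "single",         # assumed value; check the hosting platform
    "enable_prompt_expansion": True,
    "seed": -1,                    # -1 = new random result each run
}

body = json.dumps(payload)
```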
| Resolution | Sizes (W×H) | 5 s | 10 s |
|---|---|---|---|
| 720p | 1280×720 / 720×1280 | $1.00 | $1.50 |
| 1080p | 1920×1080 / 1080×1920 | $1.50 | $2.25 |
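The pricing table translates directly into a small lookup, useful for estimating batch costs before running the model (prices are copied from the table above):

```python
# Per-clip price in USD, keyed by (resolution, duration in seconds),
# copied from the pricing table above.
PRICES = {
    ("720p", 5): 1.00,
    ("720p", 10): 1.50,
    ("1080p", 5): 1.50,
    ("1080p", 10): 2.25,
}

def estimate_cost(resolution: str, duration: int, runs: int = 1) -> float:
    """Estimated total cost for `runs` generations at the given preset."""
    return PRICES[(resolution, duration)] * runs
```

For example, ten 10 s clips at 1080p come to `estimate_cost("1080p", 10, 10)`, i.e. $22.50.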
1. Prepare 1–2 reference videos.
2. Write your prompt.
3. (Optional) Add a `negative_prompt`.
4. Choose `size` and `duration`.
5. Configure shot type and prompt expansion.
6. Set `seed` (optional).
7. Run the model and download the generated clip.
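The steps above can be sketched as a single HTTP request. The endpoint URL, auth scheme, and response shape are placeholders, since the hosting platform's actual API is not documented on this page; only the request construction is shown.

```python
import json
import urllib.request

# Placeholder endpoint and API key -- the real URL and auth scheme
# depend on the platform hosting the model.
ENDPOINT = "https://api.example.com/wan-2.6/reference-to-video"
API_KEY = "YOUR_API_KEY"

payload = {
    "prompt": "Follow reference camera speed and angles, but change "
              "the character outfit to futuristic armor",
    "videos": ["https://example.com/reference.mp4"],  # placeholder clip
    "size": "1920x1080",
    "duration": 10,
}

req = urllib.request.Request(
    ENDPOINT,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",
    },
    method="POST",
)

# Sending the request (not executed here); the response is assumed
# to contain a URL to the generated MP4:
# with urllib.request.urlopen(req) as resp:
#     result = json.load(resp)
```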
- Keep reference content and prompt aligned: if the references show a city night scene, avoid asking for a sunny beach.
- Use two references when you want to mix attributes from different clips, e.g. the visual style of one and the camera motion of the other.
- State where the model should follow the reference closely, e.g.: "Follow reference camera speed and angles, but change the character outfit to futuristic armor."
- For portrait/vertical social content, select 480×832, 720×1280, or 1080×1920; for YouTube-style landscape, use the corresponding wide resolutions.
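The orientation advice above can be captured in a small helper that maps orientation and resolution tier to the size presets mentioned on this page (the 480×832 portrait preset comes from the tip above; the pricing table itself lists only 720p and 1080p):

```python
# Size presets mentioned on this page, keyed by (orientation, tier).
SIZES = {
    ("portrait", "480p"): "480x832",
    ("portrait", "720p"): "720x1280",
    ("portrait", "1080p"): "1080x1920",
    ("landscape", "720p"): "1280x720",
    ("landscape", "1080p"): "1920x1080",
}

def pick_size(orientation: str, tier: str) -> str:
    """Return the WxH preset for the given orientation and tier."""
    return SIZES[(orientation, tier)]
```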
- `vidu/reference-to-video-q2`: Vidu's Q2 reference-to-video model for turning style and motion from example clips into new shots; ideal for anime-style edits, trailers, and storyboards.
- `google/veo3.1/reference-to-video`: Google's Veo 3.1 reference-conditioned video generator, designed for high-fidelity cinematic motion that closely follows your reference footage.
- `kwaivgi/kling-video-o1/reference-to-video`: Kwaivgi's Kling Video O1 reference-to-video model, great for copying camera language and pacing from a sample clip while changing characters or scenes.
- `bytedance/seedance-v1-lite/reference-to-video`: ByteDance's SeeDance v1 Lite, a lightweight reference-to-video model for fast, style-consistent generations based on short example videos.