Alibaba Wan 2.7 Reference To Video
Alibaba Wan 2.7 Reference-to-Video turns character, prop, or scene references from images or videos into new video shots with preserved identity, style, and layout plus smooth, coherent motion. It is available through a ready-to-use REST inference API with no cold starts and affordable pricing.
Features
Wan 2.7 Reference-to-Video
Wan 2.7 Reference-to-Video generates new video scenes guided by reference videos and an optional reference image, maintaining consistent characters, styles, and visual identity. Upload one or more reference videos, describe the scene you want, and the model produces a coherent, character-consistent video that brings your references into a new context.
- Need to generate from a single image? Try Wan 2.7 Image-to-Video
Why Choose This?
- Multi-video reference support: Upload multiple reference videos to combine characters or visual elements from different sources into a single new scene.
- Character-consistent generation: The model preserves the identity, appearance, and style of characters from your reference videos throughout the generated clip.
- Optional reference image: Provide an additional still image to further guide the visual composition or introduce a new element.
- Negative prompt support: Specify what you don’t want in the output for more precise scene control.
- Prompt expansion: Set enable_prompt_expansion to let the model automatically enrich and optimize your prompt before generation.
- Resolution options: Generate at 720p or 1080p to match your delivery requirements.
Parameters
| Parameter | Required | Description |
|---|---|---|
| videos | Yes | One or more reference videos. Click Add Item to include additional videos. |
| prompt | Yes | Text description of the desired scene and action. Reference characters as “Video 1”, “Video 2” etc. |
| image | No | Optional reference image to supplement the video references. |
| negative_prompt | No | Elements to exclude from the generated video. |
| resolution | No | Output resolution: 720p (default) or 1080p. |
| aspect_ratio | No | Output aspect ratio. Default: 16:9. |
| duration | No | Clip length in seconds. Default: 5. |
| enable_prompt_expansion | No | Enable automatic prompt optimization before generation. Default: off. |
| seed | No | Random seed for reproducible results. Use -1 for a random seed. |
How to Use
- Upload your reference videos — provide one or more source videos via URL or drag-and-drop. Click Add Item to add more.
- Write your prompt — describe the new scene, referencing characters by position (e.g., “The characters in Video 1 and Video 2 are sitting in front of the TV and playing video games together.”).
- Upload reference image (optional) — provide a still image to supplement the visual references.
- Add negative prompt (optional) — specify elements you want to exclude from the output.
- Select resolution — 720p for standard output, 1080p for higher-quality results.
- Select aspect ratio — choose the format that fits your target platform.
- Set duration — choose your desired clip length in seconds.
- Enable prompt expansion (optional) — let the model automatically enrich your prompt before generation.
- Set seed (optional) — fix the seed to reproduce a specific result in future runs.
- Submit — generate, preview, and download your video.
Pricing
| Duration | 720p | 1080p |
|---|---|---|
| 5s | $1.00 | $1.60 |
| 10s | $1.50 | $2.40 |
| 15s | $2.00 | $3.20 |
Billing Rules
- 720p: base rate + fixed reference processing cost
- 1080p: 1.6× the 720p cost
- Pricing includes a fixed overhead for reference video processing in addition to the selected duration
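The three published price points are consistent with a simple linear reading: a fixed $0.50 overhead plus $0.10 per second at 720p, with 1080p billed at 1.6× that amount. The sketch below encodes that reading in integer cents. The formula is inferred from the table rather than stated on this page, and `estimate_price_cents` is a hypothetical helper name, so treat results for durations outside the table as an estimate only.

```python
# Constants inferred from the pricing table above; not an official formula.
FIXED_OVERHEAD_CENTS = 50   # assumed fixed reference-processing cost
PER_SECOND_CENTS = 10       # assumed 720p rate per second of output


def estimate_price_cents(duration_s: int, resolution: str = "720p") -> int:
    """Estimate the price of one generation in US cents."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    base = FIXED_OVERHEAD_CENTS + PER_SECOND_CENTS * duration_s
    if resolution == "720p":
        return base
    if resolution == "1080p":
        # 1080p is 1.6x the 720p cost; integer math, rounded to the nearest cent.
        return (base * 16 + 5) // 10
    raise ValueError(f"unsupported resolution: {resolution!r}")


for seconds in (5, 10, 15):
    print(seconds, estimate_price_cents(seconds, "720p"), estimate_price_cents(seconds, "1080p"))
```

For the three durations in the table, the helper reproduces the listed prices (for example, 100 and 160 cents for a 5-second clip).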
Best Use Cases
- Character-Driven Storytelling — Place characters from multiple reference videos into entirely new scenarios.
- Fan Content & IP Crossovers — Combine characters from different sources into a single coherent scene.
- Marketing & Brand Video — Generate new scenes featuring consistent brand characters or spokespeople from reference footage.
- Creative Concepting — Rapidly prototype multi-character scenes for pitching and storyboarding.
- Social Media Content — Create novel, character-consistent short-form video from existing footage.
Pro Tips
- Use “Video 1”, “Video 2” etc. in your prompt to refer to specific reference videos in order.
- The more distinct and clear each reference video is, the better the character consistency in the output.
- Use negative_prompt to prevent unintended blending of visual styles between reference videos.
- Enable prompt expansion for shorter or less detailed prompts to get richer output automatically.
- Start with 720p to test your scene composition before committing to a 1080p final render.
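Because the prompt refers to reference videos by position, a mismatch between the "Video N" labels and the number of uploaded videos is easy to miss. The following is a minimal client-side check, not part of the API; the function names are hypothetical and the "Video N" pattern is assumed to be written exactly as in the tips above.

```python
import re


def referenced_video_indices(prompt: str) -> set[int]:
    """Return the 1-based indices mentioned as 'Video N' in the prompt."""
    return {int(n) for n in re.findall(r"\bVideo\s+(\d+)\b", prompt, flags=re.IGNORECASE)}


def check_prompt_references(prompt: str, videos: list[str]) -> list[str]:
    """Return a list of problems; an empty list means prompt and videos line up."""
    problems = []
    used = referenced_video_indices(prompt)
    for index in sorted(used):
        if index < 1 or index > len(videos):
            problems.append(
                f"Prompt mentions Video {index}, but only {len(videos)} video(s) were supplied."
            )
    for index in range(1, len(videos) + 1):
        if index not in used:
            problems.append(f"Video {index} is supplied but never mentioned in the prompt.")
    return problems


prompt = (
    "The characters in Video 1 and Video 2 are sitting in front of the TV "
    "and playing video games together."
)
# Placeholder URLs for illustration only.
print(check_prompt_references(prompt, ["https://example.com/a.mp4", "https://example.com/b.mp4"]))
```

Note that the phrase "video games" in the sample prompt is not counted as a reference, since the pattern requires a number after "Video".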
Notes
- Both videos and prompt are required fields; all other parameters are optional.
- Ensure video and image URLs are publicly accessible if using links rather than direct uploads.
- Please ensure your content complies with Alibaba’s usage policies.
Related Models
- Wan 2.7 Image-to-Video — Animate a single reference image into a cinematic video clip.
- Wan 2.7 Text-to-Video — Generate video from text prompts without reference footage.
Authentication
For authentication details, please refer to the Authentication Guide.
API Endpoints
Submit Task & Query Result
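# Submit the task (the video URLs below are placeholders; replace them with your own)
curl --location --request POST "https://api.wavespeed.ai/api/v3/alibaba/wan-2.7/reference-to-video" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
  "prompt": "The characters in Video 1 and Video 2 are sitting in front of the TV and playing video games together.",
  "videos": ["https://example.com/reference-1.mp4", "https://example.com/reference-2.mp4"],
  "resolution": "720p",
  "aspect_ratio": "16:9",
  "duration": 5,
  "enable_prompt_expansion": false,
  "seed": -1
}'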
# Get the result
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
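The same two calls can be sketched in Python with only the standard library. This is an illustrative equivalent of the curl commands above, not an official client: the helper names are hypothetical, the video URLs are placeholders, and the payload includes the required `prompt` and `videos` fields.

```python
import json
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"
MODEL_PATH = "alibaba/wan-2.7/reference-to-video"


def build_submit_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build the POST request that submits a generation task."""
    return urllib.request.Request(
        f"{API_BASE}/{MODEL_PATH}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def build_result_request(api_key: str, request_id: str) -> urllib.request.Request:
    """Build the GET request that fetches a task's result."""
    return urllib.request.Request(
        f"{API_BASE}/predictions/{request_id}/result",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def send(request: urllib.request.Request) -> dict:
    """Send a request and decode the JSON body (performs a real network call)."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholder inputs for illustration only.
example_payload = {
    "prompt": "The characters in Video 1 and Video 2 are playing video games together.",
    "videos": ["https://example.com/reference-1.mp4", "https://example.com/reference-2.mp4"],
    "resolution": "720p",
    "aspect_ratio": "16:9",
    "duration": 5,
    "enable_prompt_expansion": False,
    "seed": -1,
}
```

A typical flow would be `task = send(build_submit_request(api_key, example_payload))` followed by `send(build_result_request(api_key, task["data"]["id"]))`, with `api_key` read from the `WAVESPEED_API_KEY` environment variable.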
Parameters
Task Submission Parameters
Request Parameters
| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
| prompt | string | Yes | - | - | The positive prompt for the generation. |
| image | string | No | - | - | URL to a single reference image. |
| videos | array | Yes | - | 1 ~ 5 items | Array of URLs to reference videos (max 4). |
| negative_prompt | string | No | - | - | The negative prompt for the generation. |
| resolution | string | No | 720p | 720p, 1080p | The resolution of the generated video. |
| aspect_ratio | string | No | 16:9 | 16:9, 9:16, 1:1, 4:3, 3:4 | The aspect ratio of the generated video. |
| duration | integer | No | 5 | 2 ~ 10 | The duration of the generated media in seconds (2-10s). |
| enable_prompt_expansion | boolean | No | false | - | If set to true, the prompt optimizer will be enabled. |
| seed | integer | No | -1 | -1 ~ 2147483647 | The random seed to use for the generation. -1 means a random seed will be used. |
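The types and ranges in this table can be checked on the client before a task is submitted, which avoids paying a round trip for an obviously invalid request. The sketch below is a hypothetical helper, not part of the API. It follows the ranges listed in the table, and because the `videos` row gives two different upper bounds, it only enforces the lower bound of one item.

```python
ALLOWED_RESOLUTIONS = {"720p", "1080p"}
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1", "4:3", "3:4"}


def _is_int(value) -> bool:
    """True for real integers; bool is excluded even though it subclasses int."""
    return isinstance(value, int) and not isinstance(value, bool)


def validate_payload(payload: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the payload looks valid."""
    errors = []

    prompt = payload.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        errors.append("prompt is required and must be a non-empty string")

    videos = payload.get("videos")
    if not isinstance(videos, list) or len(videos) < 1:
        errors.append("videos is required and must contain at least one URL")
    elif not all(isinstance(v, str) for v in videos):
        errors.append("videos must be an array of URL strings")

    # Optional fields fall back to the defaults listed in the table.
    if payload.get("resolution", "720p") not in ALLOWED_RESOLUTIONS:
        errors.append("resolution must be 720p or 1080p")
    if payload.get("aspect_ratio", "16:9") not in ALLOWED_ASPECT_RATIOS:
        errors.append("aspect_ratio must be one of 16:9, 9:16, 1:1, 4:3, 3:4")

    duration = payload.get("duration", 5)
    if not _is_int(duration) or not 2 <= duration <= 10:
        errors.append("duration must be an integer from 2 to 10")

    seed = payload.get("seed", -1)
    if not _is_int(seed) or not -1 <= seed <= 2147483647:
        errors.append("seed must be an integer from -1 to 2147483647")

    if not isinstance(payload.get("enable_prompt_expansion", False), bool):
        errors.append("enable_prompt_expansion must be a boolean")

    return errors
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue at once.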
Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data.id | string | Unique identifier for the prediction (the task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
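As a rough illustration of how these fields fit together, the sketch below pulls the task ID and the result URL out of a submit response. It assumes that a `code` of 200 is the only success value, which the table suggests but does not state outright. The function name is hypothetical and every value in the sample response is made up.

```python
def parse_submit_response(body: dict) -> tuple[str, str]:
    """Return (task_id, result_url) from a submit response, or raise on failure."""
    if body.get("code") != 200:
        raise RuntimeError(f"submit failed: {body.get('code')} {body.get('message')}")
    data = body["data"]
    if data.get("status") == "failed":
        raise RuntimeError(f"task failed: {data.get('error')}")
    return data["id"], data["urls"]["get"]


# Illustrative response; every value here is invented for the example.
sample = {
    "code": 200,
    "message": "success",
    "data": {
        "id": "example-task-id",
        "model": "alibaba/wan-2.7/reference-to-video",
        "outputs": [],
        "urls": {"get": "https://api.wavespeed.ai/api/v3/predictions/example-task-id/result"},
        "has_nsfw_contents": [],
        "status": "created",
        "created_at": "2023-04-01T12:34:56.789Z",
        "error": "",
        "timings": {"inference": 0},
    },
}

task_id, result_url = parse_submit_response(sample)
print(task_id, result_url)
```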
Result Request Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| id | string | Yes | - | Task ID |
Result Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data | object | The prediction data object containing all details |
| data.id | string | Unique identifier for the prediction (the task ID that was requested) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
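Since `data.status` moves through created and processing before reaching completed or failed, a client usually polls the result endpoint until a terminal status appears. The loop below is a minimal sketch of that pattern. The polling interval and timeout are arbitrary assumptions, as this page does not recommend values, and `fetch_result` is a hypothetical callable that returns the decoded result response.

```python
import time
from typing import Callable


def wait_for_outputs(
    fetch_result: Callable[[], dict],
    interval_s: float = 2.0,    # assumed polling interval
    timeout_s: float = 600.0,   # assumed overall timeout
    sleep: Callable[[float], None] = time.sleep,
) -> list:
    """Poll until the task completes and return data.outputs."""
    deadline = time.monotonic() + timeout_s
    while True:
        data = fetch_result()["data"]
        status = data["status"]
        if status == "completed":
            return data["outputs"]
        if status == "failed":
            raise RuntimeError(data.get("error") or "task failed")
        # created / processing: keep waiting until the deadline passes.
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task still {status!r} after {timeout_s} seconds")
        sleep(interval_s)
```

Passing `sleep` as a parameter keeps the loop easy to test with canned responses and no real waiting.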