Wan 2.6 Reference to Video Flash | Fast Image-to-Video API

Wan 2.6 Reference-to-Video Flash

Wan 2.6 Reference-to-Video Flash is a fast, reference-driven video generation model. Upload up to 5 reference images and describe the scene; the model generates high-quality video that preserves character identity and appearance, with optional audio generation and multi-shot support.

Why Choose This?

Multi-reference input: Upload up to 5 reference images for precise character and scene guidance.
Identity preservation: Maintains character appearance and identity across generated video frames.
Audio generation: Optional synchronized audio for complete video output.
Shot type control: Choose between a single continuous shot or a multi-shot composition.
Multiple resolutions: Support for 720p and 1080p in both landscape and portrait orientations.
Prompt Enhancer: Built-in tool to automatically improve your video descriptions.

Parameters

Parameter	Required	Description
reference_urls	Yes	Reference images (1 to 5 image URLs)
prompt	Yes	Text description of the video scene and motion
audio	No	Custom audio track (URL or upload)
negative_prompt	No	Elements to exclude from generation
size	No	Output size: 1280*720, 720*1280, 1920*1080, 1080*1920
duration	No	Video length: 5 or 10 seconds (default: 5)
shot_type	No	Shot composition: single, multi (default: multi)
enable_audio	No	Generate synchronized audio (default: enabled)
enable_prompt_expansion	No	Enable prompt optimizer (default: disabled)
seed	No	Random seed for reproducibility (-1 for random)
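
The constraints in the table above can be checked on the client before a request is sent. The following is a minimal sketch, not part of any official SDK: `build_payload` and the `ALLOWED_*` constants are hypothetical names chosen for illustration, and the defaults simply mirror the table.

```python
ALLOWED_SIZES = {"1280*720", "720*1280", "1920*1080", "1080*1920"}
ALLOWED_DURATIONS = {5, 10}
ALLOWED_SHOT_TYPES = {"single", "multi"}


def build_payload(reference_urls, prompt, *, size=None, duration=5,
                  shot_type="multi", enable_audio=True,
                  enable_prompt_expansion=False, seed=-1,
                  negative_prompt=None, audio=None):
    """Check inputs against the parameter table and return a request body."""
    if not 1 <= len(reference_urls) <= 5:
        raise ValueError("reference_urls must contain 1 to 5 image URLs")
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    if size is not None and size not in ALLOWED_SIZES:
        raise ValueError(f"size must be one of {sorted(ALLOWED_SIZES)}")
    if duration not in ALLOWED_DURATIONS:
        raise ValueError("duration must be 5 or 10 seconds")
    if shot_type not in ALLOWED_SHOT_TYPES:
        raise ValueError("shot_type must be 'single' or 'multi'")

    payload = {
        "reference_urls": list(reference_urls),
        "prompt": prompt,
        "duration": duration,
        "shot_type": shot_type,
        "enable_audio": enable_audio,
        "enable_prompt_expansion": enable_prompt_expansion,
        "seed": seed,
    }
    # Optional fields are sent only when provided. The table lists no default
    # for size, so it is left to the server when omitted.
    for key, value in (("size", size),
                       ("negative_prompt", negative_prompt),
                       ("audio", audio)):
        if value is not None:
            payload[key] = value
    return payload
```

Calling `build_payload(["https://example.com/ref.png"], "A city at sunset", size="1280*720")` returns a dict that can be serialized as the JSON request body; an empty reference list or a duration other than 5 or 10 raises `ValueError` before any network call is made.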

How to Use

Upload reference images — add 1-5 character or scene references.
Write your prompt — describe the scene, motion, and camera work.
Upload audio (optional) — provide a custom audio track.
Set size — choose resolution and orientation.
Set duration — 5 or 10 seconds.
Choose shot type — single for one continuous shot, multi for varied compositions.
Configure audio — enable/disable audio generation.
Run — submit and download your video.

Pricing

Pricing depends on resolution, duration, and audio settings.

Size	Duration	Audio Off	Audio On
720p	5s	$0.25	$0.50
720p	10s	$0.375	$0.75
1080p	5s	$0.40	$0.80
1080p	10s	$0.60	$1.20

Billing Rules

Resolution multiplier: 720p (1280*720 / 720*1280) = 1×, 1080p (1920*1080 / 1080*1920) = 1.6×
Audio multiplier: disabled = 1×, enabled = 2×
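
As a rough sketch, the pricing table can be reproduced from a base price and the two multipliers. The function name `estimate_price` is hypothetical, and the 1.5× factor for 10-second videos is inferred from the table rows rather than stated as a billing rule, so treat the result as an estimate only.

```python
BASE_PRICE_USD = 0.25  # 720p, 5 s, audio off (first row of the pricing table)

# Assumption: inferred from the table ($0.25 -> $0.375), not a documented rule.
DURATION_FACTOR = {5: 1.0, 10: 1.5}

# Resolution multiplier from the billing rules: 720p = 1x, 1080p = 1.6x.
RESOLUTION_FACTOR = {
    "1280*720": 1.0,
    "720*1280": 1.0,
    "1920*1080": 1.6,
    "1080*1920": 1.6,
}


def estimate_price(size, duration, enable_audio):
    """Return the estimated price in USD for one generation."""
    audio_factor = 2.0 if enable_audio else 1.0  # audio multiplier: off = 1x, on = 2x
    price = (BASE_PRICE_USD
             * DURATION_FACTOR[duration]
             * RESOLUTION_FACTOR[size]
             * audio_factor)
    return round(price, 3)
```

For example, `estimate_price("1920*1080", 10, True)` gives 1.2, matching the last row of the table. The price shown in the playground before submission remains the authoritative figure.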

Best Use Cases

Character Animation — Generate videos that preserve character identity from reference photos.
Social Media Content — Create engaging videos featuring consistent characters.
Storytelling — Produce narrative scenes with identity-consistent characters.
Marketing & Ads — Generate promotional videos featuring specific people or characters.
Multi-shot Production — Create videos with varied camera angles and compositions.

Pro Tips

Use multiple reference images from different angles for better identity preservation.
Use "multi" shot type for more dynamic, cinematic compositions.
Disable enable_audio for faster processing when audio is not needed.
Add negative prompts to avoid common issues (e.g., "blurry, distorted").
Enable prompt expansion for automatic prompt optimization.
Use 720p for drafts and testing, 1080p for final production.

Notes

Both reference_urls and prompt are required fields.
Maximum 5 reference images per generation.
Duration options are 5 or 10 seconds only.
Ensure uploaded image and audio URLs are publicly accessible.
Seed value -1 generates a random seed each time.
If your result has no sound, add an instruction to the prompt such as "Add background sound".

More Models to Try

vidu/reference-to-video-q2 - Vidu's Q2 reference-to-video model.
google/veo3.1/reference-to-video - Google Veo 3.1 reference-conditioned video generator.
kwaivgi/kling-video-o1/reference-to-video - Kwaivgi's Kling Video O1 reference-to-video model.

Wan 2.6 Reference To Video Flash API — Quick start

Grab a WaveSpeedAI API key, then call POST https://api.wavespeed.ai/api/v3/alibaba/wan-2.6/reference-to-video-flash with your input as JSON. The endpoint returns a prediction id; poll the prediction endpoint until status flips to completed, then read the output URL from data.outputs[0]. Examples for Wan 2.6 Reference To Video Flash below.

HTTP example

# Submit the prediction
curl -X POST "https://api.wavespeed.ai/api/v3/alibaba/wan-2.6/reference-to-video-flash" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY" \
  -d '{
    "reference_urls": ["https://example.com/your-reference.png"],
    "prompt": "A cinematic shot of a city at sunset, soft golden light",
    "audio": "https://example.com/your-audio.mp3",
    "negative_prompt": "blurry, low quality, distorted",
    "size": "1280*720",
    "duration": 5,
    "shot_type": "single",
    "enable_audio": true,
    "enable_prompt_expansion": false,
    "seed": -1
  }'

# Response includes a prediction id. Poll for the result:
curl -X GET "https://api.wavespeed.ai/api/v3/predictions/{request_id}/result" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY"

# When status is "completed", read the output from data.outputs[0].

Node.js example

// npm install wavespeed
const WaveSpeed = require('wavespeed');

const client = new WaveSpeed(); // reads WAVESPEED_API_KEY from env

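// CommonJS modules do not support top-level await, so wrap the call in an async function.
async function main() {
  const result = await client.run("alibaba/wan-2.6/reference-to-video-flash", {
    "reference_urls": ["https://example.com/your-reference.png"], // required, 1 to 5 images
    "prompt": "A cinematic shot of a city at sunset, soft golden light",
    "audio": "https://example.com/your-audio.mp3",
    "negative_prompt": "blurry, low quality, distorted",
    "size": "1280*720",
    "duration": 5,
    "shot_type": "single",
    "enable_audio": true,
    "enable_prompt_expansion": false,
    "seed": -1
  });

  console.log(result.outputs[0]); // → URL of the generated output
}

main().catch(console.error);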

Python example

# pip install wavespeed
import wavespeed

output = wavespeed.run(
    "alibaba/wan-2.6/reference-to-video-flash",
    {
        "reference_urls": ["https://example.com/your-reference.png"],  # required, 1 to 5 images
        "prompt": "A cinematic shot of a city at sunset, soft golden light",
        "audio": "https://example.com/your-audio.mp3",
        "negative_prompt": "blurry, low quality, distorted",
        "size": "1280*720",
        "duration": 5,
        "shot_type": "single",
        "enable_audio": True,
        "enable_prompt_expansion": False,
        "seed": -1,
    },
)

print(output["outputs"][0])  # → URL of the generated output

Wan 2.6 Reference To Video Flash API — Frequently asked questions

What is the Wan 2.6 Reference To Video Flash API?

Wan 2.6 Reference To Video Flash is an Alibaba model for video generation from images, exposed as a REST API on WaveSpeedAI. It turns character, prop, or scene references from images or videos into new video shots with preserved identity, style, and layout, plus smooth, coherent motion. The Flash version generates faster. The REST inference API is ready to use, with no cold starts and affordable pricing. You can call it programmatically or try it from the playground above.

How do I call the Wan 2.6 Reference To Video Flash API?

POST your input parameters to the model's REST endpoint (shown in the API tab of this playground) with your WaveSpeedAI API key in the Authorization header. Submission returns a prediction ID; poll the prediction endpoint until status flips to "completed", then read the output URL from the result. The playground generates a ready-to-paste code sample in Python, JavaScript, or cURL for whatever inputs you've set. Full request/response shape is documented at https://wavespeed.ai/docs/docs-api/alibaba/alibaba-wan-2.6-reference-to-video-flash.
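
The submit-then-poll flow described above can be sketched with only the Python standard library. This is a hedged illustration, not the official client: the helper names `http_json` and `generate_video` are hypothetical, and the locations `data.id` and `data.status` as well as the `"failed"` status value are assumptions about the response shape (only `data.outputs[0]` and the `"completed"` status are stated on this page). Check the API reference before relying on them.

```python
import json
import time
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"
MODEL_PATH = "alibaba/wan-2.6/reference-to-video-flash"


def http_json(method, url, api_key, body=None):
    """Send a JSON request with a Bearer token and return the decoded response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(url, data=data, method=method, headers={
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    })
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def generate_video(payload, api_key, call=http_json,
                   poll_interval=2.0, timeout=600.0, sleep=time.sleep):
    """Submit a prediction, poll until it completes, and return the output URL."""
    submitted = call("POST", f"{API_BASE}/{MODEL_PATH}", api_key, payload)
    prediction_id = submitted["data"]["id"]  # assumed location of the prediction id

    result_url = f"{API_BASE}/predictions/{prediction_id}/result"
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = call("GET", result_url, api_key)
        status = result["data"]["status"]  # assumed location of the status field
        if status == "completed":
            return result["data"]["outputs"][0]
        if status == "failed":  # assumed name of the failure status
            raise RuntimeError(f"prediction {prediction_id} failed")
        sleep(poll_interval)
    raise TimeoutError(f"prediction {prediction_id} did not complete in {timeout} s")
```

A call such as `generate_video(payload, os.environ["WAVESPEED_API_KEY"])` (after `import os`) would return the video URL once the prediction completes. The `call` and `sleep` parameters exist so the polling logic can be exercised with a stub instead of the live endpoint, and the timeout guards against polling forever if a prediction stalls.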

How much does Wan 2.6 Reference To Video Flash cost per run?

Wan 2.6 Reference To Video Flash starts at $0.13 per run. That figure is the base price; the final charge scales with the parameters you set in the form (for this model, output size, duration, and audio settings), so a larger or longer output costs more than a minimal one. The exact cost for your current input is shown live next to the Generate button before you submit, and the actual per-call charge is recorded on the prediction afterwards.

What inputs does Wan 2.6 Reference To Video Flash accept?

Key inputs: `reference_urls`, `prompt`, `audio`, `duration`, `size`, `seed`, `negative_prompt`. The full JSON schema (types, defaults, allowed values) is rendered above the Generate button and mirrored in the API reference at https://wavespeed.ai/docs/docs-api/alibaba/alibaba-wan-2.6-reference-to-video-flash.

How long does Wan 2.6 Reference To Video Flash take to generate?

Average end-to-end generation time on WaveSpeedAI is around 126 seconds per request — measured across recent runs. Queue time scales with global demand; live status is visible in the prediction record.

Can I use Wan 2.6 Reference To Video Flash outputs commercially?

Commercial usage rights depend on the model's license, set by its provider (Alibaba). The license summary appears on the model card above; see WaveSpeedAI's Terms of Service for platform-level conditions.
