Alibaba Wan 2.6 Image To Video
Playground
Try it on WavespeedAI! Alibaba WAN 2.6 converts text or images into videos (720p/1080p) with synced audio, faster and more affordable than Google Veo 3. Ready-to-use REST inference API, strong performance, no cold starts, affordable pricing.
Features
Alibaba / WAN 2.6 — Image-to-Video (wan2.6-i2v)
WAN 2.6 Image-to-Video is Alibaba’s latest WanXiang 2.6 image-to-video model. Give it a single image plus a prompt and it generates a 5–15s cinematic clip, with support for multi-shot storytelling and up to 1080p resolution.
🚀 Highlights
- Multi-shot narrative support – When prompt expansion + multi-shot are enabled, WAN 2.6 can automatically split your idea into several shots and keep key details consistent across them.
- Longer clips – Generate videos up to 15 seconds, giving more room for story arcs, transitions, and character actions.
- Flexible resolutions – Two quality tiers, 720p and 1080p, matching Alibaba’s official 2.6 spec.
- Image-driven look – Uses your input frame as the visual anchor, then animates it according to your prompt.
- Prompt-aware framing – The model balances your reference image and text description to keep identities, outfits, and overall scene coherent.
🧩 Parameters
- image* – Required. The keyframe or base image to animate (URL or upload).
- audio (optional) – Reserved field; can be used for advanced workflows that align motion with an external audio track. For normal use you can leave this empty.
- prompt* – Describe the motion, story beats, camera moves, and style.
- negative_prompt – Things to avoid (e.g. “watermark, text, distortion, extra limbs”).
- resolution – One of:
  - 720p
  - 1080p
- duration – One of 5s, 10s, 15s.
- shot_type –
  - single → single-shot clip.
  - multi → when prompt expansion is on, the model can break your prompt into multiple shots for a richer narrative.
- enable_prompt_expansion – If enabled, WAN 2.6 will expand shorter prompts into a more detailed internal script before generating.
- seed – Fix for reproducible results; set to -1 for random, or any integer to lock the layout and motion pattern.
Output: an MP4 video at the chosen resolution tier.
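Taken together, a request body that exercises most of these parameters might look like the sketch below; the image URL and prompt are placeholders to replace with your own.

{
  "image": "https://example.com/keyframe.jpg",
  "prompt": "Shot 1: wide city skyline at night; Shot 2: medium shot of the hero on the rooftop; Shot 3: close-up as they smile. Cinematic, soft lighting, light rain.",
  "negative_prompt": "watermark, text, distortion, extra limbs",
  "resolution": "1080p",
  "duration": 15,
  "shot_type": "multi",
  "enable_prompt_expansion": true,
  "seed": 42
}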
💰 Pricing
| Resolution | 5 s | 10 s | 15 s |
|---|---|---|---|
| 720p | $0.50 | $1.00 | $1.50 |
| 1080p | $0.75 | $1.50 | $2.25 |
- 720p → $0.10 / s
- 1080p → $0.15 / s
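For example, a 10-second 1080p clip is billed at 10 × $0.15 = $1.50, matching the table above.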
✅ How to Use
- Upload your image under image (a clear subject and good lighting work best).
- Write a prompt describing:
  - what moves (character, camera, environment),
  - overall mood and style (e.g., “cinematic, soft lighting, shallow depth of field”).
- (Optional) Turn on enable_prompt_expansion if your prompt is short and you want the model to elaborate it.
- (Optional) Set shot_type to multi to let WAN 2.6 build a multi-shot sequence instead of a single continuous shot.
- Choose resolution (720p / 1080p) and duration (5 / 10 / 15 s).
- Set seed if you want repeatable results, otherwise leave it at -1 for variation.
- Click Run and download your clip once it finishes (the same flow via the REST API is sketched below).
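If you prefer the REST API to the Playground, the same flow can be scripted. A minimal sketch, assuming the jq CLI is installed; the image URL and prompt are placeholders:

# Submit the task and capture the task ID from data.id
requestId=$(curl -s --location --request POST "https://api.wavespeed.ai/api/v3/alibaba/wan-2.6/image-to-video" \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
  --data-raw '{
    "image": "https://example.com/keyframe.jpg",
    "prompt": "Camera slowly dollies in, character turns to look at the city, neon lights flicker, light rain, cinematic grade",
    "resolution": "1080p",
    "duration": 10,
    "shot_type": "single",
    "enable_prompt_expansion": true,
    "seed": 12345
  }' | jq -r '.data.id')
echo "Task submitted: ${requestId}"
# Then poll https://api.wavespeed.ai/api/v3/predictions/${requestId}/result until
# data.status is "completed" and download the MP4 URL from data.outputs[0]
# (a polling sketch is included in the API Endpoints section below).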
💡 Prompt Tips
- Start with the image content, then add motion: “Camera slowly dollies in, character turns to look at the city, neon lights flicker, light rain, cinematic grade.”
- For multi-shot stories, hint at structure: “Shot 1: wide city skyline at night; Shot 2: medium shot of the hero on the rooftop; Shot 3: close-up as they smile.”
- Keep negative prompts short and focused; don’t overload them with long prose.
More Models to Try
- kwaivgi/kling-video-o1/image-to-video – High-quality AI image-to-video generator from Kwaivgi, ideal for cinematic character shots, smooth camera motion, and social-ready short clips.
- alibaba/wan-2.5/image-to-video – Alibaba’s WAN 2.5 image-to-video model, designed for fast, coherent animation of still images into ads, product demos, and story-style videos.
- openai/sora-2/image-to-video – OpenAI Sora 2, a cutting-edge AI video generator that turns images into long, detailed, physics-aware scenes for filmic concepts and high-end content.
- google/veo3.1/image-to-video – Google Veo 3.1 image-to-video, optimized for crisp, cinematic motion and clean compositions, perfect for marketing visuals, trailers, and creative storytelling.
Authentication
For authentication details, please refer to the Authentication Guide.
API Endpoints
Submit Task & Query Result
# Submit the task (the image URL and prompt below are placeholders; both fields are required)
curl --location --request POST "https://api.wavespeed.ai/api/v3/alibaba/wan-2.6/image-to-video" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
"image": "https://example.com/keyframe.jpg",
"prompt": "Camera slowly dollies in, character turns to look at the city, neon lights flicker, light rain, cinematic grade",
"resolution": "720p",
"duration": 5,
"shot_type": "single",
"enable_prompt_expansion": false,
"seed": -1
}'
# Get the result
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
Parameters
Task Submission Parameters
Request Parameters
| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
| image | string | Yes | - | - | The image for generating the output. |
| audio | string | No | - | - | Audio URL to guide generation (optional). |
| prompt | string | Yes | - | - | The positive prompt for the generation. |
| negative_prompt | string | No | - | - | The negative prompt for the generation. |
| resolution | string | No | 720p | 720p, 1080p | The resolution of the generated media. |
| duration | integer | No | 5 | 5, 10, 15 | The duration of the generated media in seconds. |
| shot_type | string | No | single | single, multi | The type of shots to generate. |
| enable_prompt_expansion | boolean | No | false | - | If set to true, the prompt optimizer will be enabled. |
| seed | integer | No | -1 | -1 ~ 2147483647 | The random seed to use for the generation. -1 means a random seed will be used. |
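Only image and prompt are required; a minimal request body can rely on the defaults above (720p, 5 seconds, single shot, random seed). The values below are placeholders:

{
  "image": "https://example.com/keyframe.jpg",
  "prompt": "The character turns toward the camera and smiles, soft cinematic lighting"
}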
Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data.id | string | Unique identifier for the prediction (task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
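An illustrative submission response with placeholder values, showing how these fields fit together (outputs is empty because the task has just been created):

{
  "code": 200,
  "message": "success",
  "data": {
    "id": "a1b2c3d4e5f6",
    "model": "alibaba/wan-2.6/image-to-video",
    "outputs": [],
    "urls": {
      "get": "https://api.wavespeed.ai/api/v3/predictions/a1b2c3d4e5f6/result"
    },
    "has_nsfw_contents": [],
    "status": "created",
    "created_at": "2023-04-01T12:34:56.789Z",
    "error": "",
    "timings": {}
  }
}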
Result Request Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| id | string | Yes | - | Task ID |
Result Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data | object | The prediction data object containing all details |
| data.id | string | Unique identifier for the prediction (the task ID passed in the request) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
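For contrast, an illustrative result response for a completed task, again with placeholder values; outputs now holds the generated MP4 URL and timings.inference reports the inference time in milliseconds:

{
  "code": 200,
  "message": "success",
  "data": {
    "id": "a1b2c3d4e5f6",
    "model": "alibaba/wan-2.6/image-to-video",
    "outputs": [
      "https://example.com/generated/a1b2c3d4e5f6.mp4"
    ],
    "urls": {
      "get": "https://api.wavespeed.ai/api/v3/predictions/a1b2c3d4e5f6/result"
    },
    "status": "completed",
    "created_at": "2023-04-01T12:34:56.789Z",
    "error": "",
    "timings": {
      "inference": 42000
    }
  }
}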