Alibaba · video · From $0.030/run

Wan 2.6 API

Alibaba WAN 2.6 — text-to-video and image-to-video with synced audio at 720p/1080p, plus reference-to-video, video-extend, image-edit, and text-to-image in the same family. Flash and Spicy tiers for speed and scalable content generation.

Text-to-video and image-to-video at 720p/1080p with synced audio. Image-to-video Flash for speed-optimized generation; Spicy for unlimited scalable content. Reference-to-video preserves identity from character, prop, or scene references. Video-extend supports preserved or generated synchronized audio.

About the Wan 2.6 API

What Wan 2.6 does, how it fits in the Alibaba model lineup, and why teams reach for it.

Wan 2.6 is a video generation model from Alibaba, available through the WaveSpeedAI REST API.

The Wan 2.6 family on WaveSpeedAI ships 10 REST endpoints covering Text-To-Image, Image-To-Video, Video-Extend, Image-To-Image, and Text-To-Video workflows. Each variant carries its own pricing, parameters, and example outputs. Pick the one that matches your input modality and production constraints, or call several from the same API key to compose multi-step pipelines.

Run Wan 2.6 through the same API key, billing account, and rate-limit envelope you use for the other 1,000+ AI models on WaveSpeedAI. No separate vendor setup, no per-provider SDKs, no per-vendor rate-limit envelopes — one integration covers everything from text-to-image and text-to-video through audio synthesis, 3D generation, upscaling, and editing.

All Wan 2.6 API endpoints

10 Wan 2.6 endpoints available now on WaveSpeedAI — pick the variant that matches your workflow.

Wan 2.6 Text To Image

WAN 2.6 Text-to-Image generates high-quality images from natural-language prompts with strong prompt adherence and clean composition. It supports multiple aspect ratios and size control, seed-based reproducibility, and flexible styles (photorealistic to illustrative) for ads, product shots, and social visuals. Built for stable production use with a ready-to-use REST API, no cold starts, and predictable pricing.

text-to-image · from $0.030

Wan 2.6 Image To Video Pro

WAN 2.6 Image-to-Video Pro converts images into premium-quality videos with superior motion dynamics, enhanced visual fidelity, and professional cinematic output. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

image-to-video · from $0.60

Wan 2.6 Video Extend

WAN 2.6 Video-Extend turns short clips into longer videos with preserved or generated synchronized audio for continuity. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

video-extend · from $0.25

Wan 2.6 Reference To Video

WAN 2.6 Reference-to-Video turns character, prop, or scene references—single or multi-view—into new video shots with preserved identity, style, and layout plus smooth, coherent motion. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

image-to-video · from $0.50

Wan 2.6 Image Edit

WAN 2.6 Image-Edit turns prompts into precise photo edits—adjusting color and lighting, restyling aesthetics, replacing backgrounds, removing objects, and refining details while preserving subject identity. Built for stable, repeatable image-to-image pipelines. Ready-to-use REST API, best performance, no cold starts, affordable pricing.

image-to-image · from $0.035

Wan 2.6 Text To Video

WAN 2.6 Text-to-Video turns plain prompts into coherent, cinematic clips with crisp detail, stable motion, and strong instruction-following—great for ads, explainers, and social posts. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

text-to-video · from $0.50

Wan 2.6 Image To Video Flash

WAN 2.6 Flash converts images into videos (720p/1080p) with optional audio, optimized for speed and cost. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

image-to-video · from $0.13

Wan 2.6 Reference To Video Flash

WAN 2.6 Reference-to-Video Flash turns character, prop, or scene references from images or videos into new video shots with preserved identity, style, and layout plus smooth, coherent motion. Flash version with faster generation speed. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

image-to-video · from $0.13

Wan 2.6 Image To Video Spicy

WAN 2.6 Spicy converts images into unlimited high-quality videos with smooth animations optimized for scalable content generation. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

image-to-video · from $0.50

Wan 2.6 Image To Video

WAN 2.6 converts text or images into videos (720p/1080p) with synced audio, faster and more affordable than Google Veo3. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

image-to-video · from $0.50

How to use the Wan 2.6 API

Four steps from signup to a finished generation. Full Python, Node.js, and cURL examples are in the API section below.

1. Get an API key
Sign up for a WaveSpeedAI account and copy your API key from the dashboard. New accounts come with free starter credits, enough to run the playground a few dozen times before billing kicks in.
2. Submit a prediction
POST your input as JSON to https://api.wavespeed.ai/api/v3/alibaba/wan-2.6/text-to-video. The endpoint returns a prediction id immediately; generations are asynchronous, so you don't hold an open connection during inference.
3. Poll for completion
Send a GET request to https://api.wavespeed.ai/api/v3/predictions/{request_id}/result, where {request_id} is the prediction id returned in step 2. Start by polling about every 2 seconds, then increase the interval toward 5-10 seconds for long-running tasks to reduce unnecessary requests. Stop on completed, failed, cancelled, or timeout.
4. Read the output URL
Once status is "completed", read the URL from data.outputs[0]. The URL points to your generated media on the WaveSpeedAI CDN: an image, video, audio, or 3D file, depending on the Wan 2.6 variant you called.
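
The polling cadence described in step 3 can be sketched as a small delay schedule. This is a minimal sketch, not WaveSpeedAI client code: the names poll_delays and is_terminal, the 1.5x growth factor, and the 10-second cap are assumptions chosen to fit the "start around 2 seconds, increase toward 5-10 seconds" guidance. The terminal status names are the ones listed in step 3.

```python
from itertools import islice

# Terminal statuses listed in step 3; any other status means "keep polling".
TERMINAL_STATUSES = {"completed", "failed", "cancelled", "timeout"}


def poll_delays(start=2.0, factor=1.5, cap=10.0):
    """Yield sleep intervals in seconds, growing from `start` up to `cap`.

    The growth factor and cap are illustrative assumptions, not documented API values.
    """
    delay = start
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def is_terminal(status):
    """Return True when polling should stop for this status string."""
    return status in TERMINAL_STATUSES


# Show the first six intervals of the schedule.
first_six = list(islice(poll_delays(), 6))
print(first_six)
```

In a polling loop, create the generator once with delays = poll_delays(), call time.sleep(next(delays)) between GET requests, and stop as soon as is_terminal(status) returns True.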

What you can build with Wan 2.6

Common workflows developers and creators use the Wan 2.6 API for.

Text-to-video with synced audio

alibaba/wan-2.6/text-to-video turns plain prompts into coherent cinematic clips with crisp detail, stable motion, and synced audio. The catalog positions it for ads, explainers, and social posts.

text-to-video · audio · cinematic

Image-to-video at 720p/1080p

alibaba/wan-2.6/image-to-video converts images into videos at 720p/1080p with synced audio — faster and more affordable than Google Veo3 per the catalog positioning.

image-to-video · 1080p · audio

Reference-to-video for identity

alibaba/wan-2.6/reference-to-video turns character, prop, or scene references (single or multi-view) into new video shots with preserved identity, style, and layout plus smooth coherent motion.

reference · identity · multi-view

Video-extend with audio continuity

alibaba/wan-2.6/video-extend turns short clips into longer videos with preserved or generated synchronized audio for continuity — useful for building extended sequences from shorter source clips.

video-extend · audio · long-form

Image-edit in the same family

alibaba/wan-2.6/image-edit performs precise photo edits — adjusting color and lighting, restyling aesthetics, replacing backgrounds, removing objects — while preserving subject identity.

image-edit · restyle · pipeline

Flash and Spicy tiers

image-to-video-flash optimizes for speed and cost; image-to-video-spicy generates unlimited high-quality videos optimized for scalable content generation — pick Flash for iteration, Spicy for high-volume i2v pipelines.

flash · spicy · tiers
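
The tier choice can be written down as a lookup from pipeline stage to endpoint. This is a sketch under assumptions: the stage labels ("iterate", "deliver", "bulk") and the submit_url helper are hypothetical, and the URL pattern is assumed to match the text-to-video submit URL shown in this guide for every variant. The endpoint paths themselves come from the pricing table.

```python
# Base URL taken from the text-to-video submit URL in this guide.
API_BASE = "https://api.wavespeed.ai/api/v3"

# Stage names are hypothetical labels; endpoint paths come from the pricing table.
ENDPOINT_BY_STAGE = {
    "iterate": "alibaba/wan-2.6/image-to-video-flash",  # speed and cost
    "deliver": "alibaba/wan-2.6/image-to-video",        # standard tier
    "bulk": "alibaba/wan-2.6/image-to-video-spicy",     # high-volume i2v
}


def submit_url(stage):
    """Return the assumed submit URL for a pipeline stage."""
    try:
        path = ENDPOINT_BY_STAGE[stage]
    except KeyError:
        known = ", ".join(sorted(ENDPOINT_BY_STAGE))
        raise ValueError(f"Unknown stage {stage!r}; expected one of: {known}") from None
    return f"{API_BASE}/{path}"


print(submit_url("iterate"))
```

Keeping the mapping in one place means a pipeline can switch from Flash drafts to a standard-tier render by changing a single argument.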

Tips for prompting Wan 2.6

Practical advice for getting better outputs from Wan 2.6 — drawn from the patterns that work across video models in production pipelines.

Synced audio — describe sound in the prompt

Wan 2.6 ships synced audio on text-to-video and image-to-video. Describe ambient sound, dialogue mood, or music genre alongside the visual prompt.
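
One way to keep sound cues from getting lost is to assemble the prompt from separate visual and audio parts. The helper below is only a sketch: build_prompt, its argument names, and the "Ambient sound:" style labels are illustrative choices, and the "prompt" field name in the payload is an assumption to check against the endpoint's API docs.

```python
import json


def build_prompt(visual, ambient=None, dialogue_mood=None, music=None):
    """Join a visual description with optional audio cues into one prompt string."""
    sentences = [visual]
    if ambient:
        sentences.append(f"Ambient sound: {ambient}")
    if dialogue_mood:
        sentences.append(f"Dialogue mood: {dialogue_mood}")
    if music:
        sentences.append(f"Music: {music}")
    # Normalize each part so it ends with exactly one period.
    return " ".join(s.strip().rstrip(".") + "." for s in sentences)


prompt = build_prompt(
    "A lighthouse at dusk, waves breaking on rocks",
    ambient="wind and distant gulls",
    music="slow ambient piano",
)
# "prompt" as the JSON field name is an assumption, not confirmed by this page.
payload = {"prompt": prompt}
print(json.dumps(payload, indent=2))
```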

Flash tier for iteration, Standard for delivery

image-to-video-flash optimizes for speed and cost; use standard image-to-video and text-to-video for delivery-grade output with synced audio.

Reference-to-video for identity

Use reference-to-video when character, prop, or scene identity must carry from reference images to the generated clip — single or multi-view refs supported.

Video-extend with audio continuity

video-extend preserves or generates synchronized audio across the extension — useful when building longer clips from short source material.

Spicy for high-volume i2v

image-to-video-spicy is positioned for unlimited scalable content generation; pick it for throughput-friendly stylized i2v pipelines.

Wan 2.6 API pricing

Pricing is per-output. The final charge scales with the parameters you set in each variant's playground (resolution, duration, output count, references).

Endpoint	Type	Starting price
alibaba/wan-2.6/text-to-image	text-to-image	$0.030
alibaba/wan-2.6/image-to-video-pro	image-to-video	$0.60
alibaba/wan-2.6/video-extend	video-extend	$0.25
alibaba/wan-2.6/reference-to-video	image-to-video	$0.50
alibaba/wan-2.6/image-edit	image-to-image	$0.035
alibaba/wan-2.6/text-to-video	text-to-video	$0.50
alibaba/wan-2.6/image-to-video-flash	image-to-video	$0.13
alibaba/wan-2.6/reference-to-video-flash	image-to-video	$0.13
alibaba/wan-2.6/image-to-video-spicy	image-to-video	$0.50
alibaba/wan-2.6/image-to-video	image-to-video	$0.50
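
Because the table lists starting prices, it gives a lower bound for a batch rather than an exact bill. The sketch below multiplies planned run counts by those starting prices; minimum_batch_cost and the example plan are hypothetical, and the real charge will be higher whenever resolution, duration, output count, or references push a run above the starting price.

```python
from decimal import Decimal

# Starting prices in USD, copied from the pricing table above.
STARTING_PRICE_USD = {
    "alibaba/wan-2.6/text-to-image": Decimal("0.030"),
    "alibaba/wan-2.6/image-to-video-pro": Decimal("0.60"),
    "alibaba/wan-2.6/video-extend": Decimal("0.25"),
    "alibaba/wan-2.6/reference-to-video": Decimal("0.50"),
    "alibaba/wan-2.6/image-edit": Decimal("0.035"),
    "alibaba/wan-2.6/text-to-video": Decimal("0.50"),
    "alibaba/wan-2.6/image-to-video-flash": Decimal("0.13"),
    "alibaba/wan-2.6/reference-to-video-flash": Decimal("0.13"),
    "alibaba/wan-2.6/image-to-video-spicy": Decimal("0.50"),
    "alibaba/wan-2.6/image-to-video": Decimal("0.50"),
}


def minimum_batch_cost(runs):
    """Return the lowest possible cost in USD for a mapping of endpoint -> run count."""
    total = Decimal("0")
    for endpoint, count in runs.items():
        if count < 0:
            raise ValueError(f"Run count for {endpoint} must not be negative")
        total += STARTING_PRICE_USD[endpoint] * count
    return total


# Hypothetical plan: 20 stills, 20 Flash drafts, 5 extensions.
plan = {
    "alibaba/wan-2.6/text-to-image": 20,
    "alibaba/wan-2.6/image-to-video-flash": 20,
    "alibaba/wan-2.6/video-extend": 5,
}
floor = minimum_batch_cost(plan)
print(f"At least ${floor:.2f}")
```

Decimal is used instead of float so that prices such as 0.035 add up without binary rounding drift.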

Call the Wan 2.6 API

Sign up for an API key at wavespeed.ai/accesskey, then submit a prediction via REST. The playground generates ready-to-paste samples for any combination of inputs.

HTTP example

# 1. Submit the prediction.
SUBMIT_RESPONSE=$(curl --silent --show-error --fail-with-body \
  -X POST "https://api.wavespeed.ai/api/v3/alibaba/wan-2.6/text-to-video" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY" \
  -d '{}')

TASK=$(printf '%s' "$SUBMIT_RESPONSE" | jq 'if has("data") then .data else . end')
PREDICTION_ID=$(printf '%s' "$TASK" | jq -r '.id')
if [ -z "$PREDICTION_ID" ] || [ "$PREDICTION_ID" = "null" ]; then
  printf 'Submission response did not contain a prediction id\n' >&2
  exit 1
fi
RESULT_URL=$(printf '%s' "$TASK" | jq -r '.urls.get // empty')
if [ -z "$RESULT_URL" ]; then
  RESULT_URL="https://api.wavespeed.ai/api/v3/predictions/$PREDICTION_ID/result"
fi

# 2. Poll until the prediction finishes.
while true; do
  RESPONSE=$(curl --silent --show-error --fail-with-body "$RESULT_URL" \
    -H "Authorization: Bearer $WAVESPEED_API_KEY")
  RESULT=$(printf '%s' "$RESPONSE" | jq 'if has("data") then .data else . end')
  STATUS=$(printf '%s' "$RESULT" | jq -r '.status')
  case "$STATUS" in
    completed) printf '%s\n' "$RESULT" | jq '.outputs'; break ;;
    failed|cancelled|timeout) printf '%s\n' "$RESULT" | jq . >&2; exit 1 ;;
    created|processing) sleep 2 ;;
    *) printf 'Unexpected status: %s\n' "$STATUS" >&2; exit 1 ;;
  esac
done

Node.js example

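// Requires Node.js 18+ (global fetch) and an ES module context (for example a .mjs file) for top-level await.
const submitUrl = "https://api.wavespeed.ai/api/v3/alibaba/wan-2.6/text-to-video";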
const apiKey = process.env.WAVESPEED_API_KEY;
if (!apiKey) throw new Error('Set WAVESPEED_API_KEY');

async function requestJson(url, options = {}) {
  const response = await fetch(url, options);
  if (!response.ok) throw new Error(await response.text());
  return response.json();
}

// 1. Submit the prediction.
const body = await requestJson(submitUrl, {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${apiKey}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({}),
});
const task = body.data ?? body;
const resultUrl = task.urls?.get ||
  `https://api.wavespeed.ai/api/v3/predictions/${task.id}/result`;

// 2. Poll until the prediction finishes.
while (true) {
  const resultBody = await requestJson(resultUrl, {
    headers: { "Authorization": `Bearer ${apiKey}` },
  });
  const result = resultBody.data ?? resultBody;
  if (result.status === "completed") {
    console.log(result.outputs);
    break;
  }
  if (["failed", "cancelled", "timeout"].includes(result.status)) throw new Error(JSON.stringify(result));
  if (!["created", "processing"].includes(result.status)) throw new Error("Unexpected status: " + result.status);
  await new Promise(resolve => setTimeout(resolve, 2000));
}

Python example

import json
import os
import time
from urllib.request import Request, urlopen

api_key = os.environ["WAVESPEED_API_KEY"]
headers = {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}
payload = {}  # Add the input fields for the endpoint you are calling (see its API docs).

def request_json(url, data=None):
    request = Request(url, data=data, headers=headers, method="POST" if data else "GET")
    with urlopen(request) as response:
        return json.load(response)

# 1. Submit the prediction.
body = request_json("https://api.wavespeed.ai/api/v3/alibaba/wan-2.6/text-to-video", json.dumps(payload).encode())
task = body.get("data", body)
result_url = task.get("urls", {}).get("get") or f"https://api.wavespeed.ai/api/v3/predictions/{task['id']}/result"

# 2. Poll until the prediction finishes.
while True:
    result_body = request_json(result_url)
    result = result_body.get("data", result_body)
    status = result.get("status")
    if status == "completed":
        print(result.get("outputs", []))
        break
    if status in {"failed", "cancelled", "timeout"}:
        raise RuntimeError(result)
    if status not in {"created", "processing"}:
        raise RuntimeError(f"Unexpected status: {status}")
    time.sleep(2)
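
The examples above print the whole outputs list. To pick out the single URL described in the how-to steps (data.outputs[0]), a small helper can unwrap the body the same way the examples do. This is a sketch: first_output_url is a hypothetical name, and the sample bodies are made up for illustration with a placeholder URL rather than real API output.

```python
def first_output_url(body):
    """Return the first output URL from a completed result body, or None if it has no outputs.

    Accepts both a {"data": {...}} wrapper and a bare result object,
    mirroring the unwrapping in the examples above.
    """
    result = body.get("data", body)
    status = result.get("status")
    if status != "completed":
        raise ValueError(f"Result is not completed (status: {status!r})")
    outputs = result.get("outputs") or []
    return outputs[0] if outputs else None


# Made-up sample bodies; the URL is a placeholder, not real generated media.
wrapped = {"data": {"status": "completed", "outputs": ["https://example.com/clip.mp4"]}}
bare = {"status": "completed", "outputs": []}

print(first_output_url(wrapped))
print(first_output_url(bare))
```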

Wan 2.6 vs alternatives

When to pick Wan 2.6 over similar models on WaveSpeedAI.

Wan 2.6 vs Wan 2.7

Wan 2.7 is the newer architecture with image-edit-pro, text-to-image-pro with thinking mode, and first/last frame control on image-to-video. Wan 2.6 is positioned as faster and more affordable with synced audio and the Flash/Spicy tier story.

Wan 2.6 vs Wan 2.2

Wan 2.2 (WaveSpeedAI variants) ships specialized endpoints — Animate (120s), Speech-to-Video (10-min), Fun-Control (Apache 2.0), LoRA trainers. Wan 2.6 is the official Alibaba family with synced audio and cross-modal image-edit/text-to-image in one namespace.

Wan 2.6 vs Seedance 2.0

Seedance 2.0 ships native audio across every variant and the Turbo tier. Wan 2.6 adds reference-to-video, video-extend, image-edit, and text-to-image in the same family — broader cross-modal toolkit at a different price point.

Wan 2.6 API — Frequently asked questions

Pricing, license, integration — common questions about running Wan 2.6 on WaveSpeedAI.

What is the Wan 2.6 API?

Wan 2.6 is an Alibaba video generation model exposed as a REST API on WaveSpeedAI. The family covers text-to-video and image-to-video with synced audio at 720p/1080p, plus reference-to-video, video-extend, image-edit, and text-to-image, with Flash and Spicy tiers for speed and scalable content generation. You can call it programmatically or try it from the playground.

How do I call the Wan 2.6 API?

Sign up for a WaveSpeedAI account, copy your API key from /accesskey, then POST to https://api.wavespeed.ai/api/v3/alibaba/wan-2.6/text-to-video with your input as JSON. The endpoint returns a prediction id. Poll the result endpoint starting around every 2 seconds, increase the interval for long-running tasks, and stop on any terminal status. Production-oriented Python / Node.js / cURL examples are above.

How much does the Wan 2.6 API cost?

Wan 2.6 starts at $0.030 per run. The exact cost scales with the parameters you set (resolution, duration, output count, references). The live cost preview next to the Generate button in the playground shows the exact price for your current input.

Which Wan 2.6 variants are available?

WaveSpeedAI hosts 10 live Wan 2.6 endpoints: alibaba/wan-2.6/text-to-image, alibaba/wan-2.6/image-to-video-pro, alibaba/wan-2.6/video-extend, alibaba/wan-2.6/reference-to-video, alibaba/wan-2.6/image-edit, alibaba/wan-2.6/text-to-video, alibaba/wan-2.6/image-to-video-flash, alibaba/wan-2.6/reference-to-video-flash, alibaba/wan-2.6/image-to-video-spicy, and alibaba/wan-2.6/image-to-video. Each variant has its own playground page and pricing.

Can I use Wan 2.6 outputs commercially?

Commercial usage rights follow the Alibaba model license. Most Alibaba models permit commercial output use; see each model's playground page for the specific license summary, and WaveSpeedAI's Terms of Service for platform-level conditions.

Why use Wan 2.6 on WaveSpeedAI instead of going direct?

One API key and one billing account cover Wan 2.6 and 1,000+ other AI models from other providers. No per-vendor SDK setup, no separate rate-limit envelopes, no rewrite-per-vendor integration code. Pricing is typically at parity with or below Alibaba's direct API.

About Alibaba

The team behind Wan 2.6 and the broader Alibaba model lineup on WaveSpeedAI.

Alibaba's Tongyi Lab produces the Wan family of video models and the Qwen family of LLMs. Wan is notable for being released with open weights, broad variant coverage (text-to-video, image-to-video, reference-to-video, video-edit, video-extend, image-edit, text-to-image), and consistent strength on motion stability and prompt adherence across multilingual prompts.

Related model APIs on WaveSpeedAI

Other AI APIs from Alibaba and the rest of the video model lineup — one API key, one billing account.

Wan 2.7 API

Alibaba

Alibaba WAN 2.7 — coherent cinematic video with crisp detail, stable motion, and strong instruction-following. Separate endpoints for text-to-video, image-to-video, reference-to-video, video-edit, video-extend, plus image-edit and text-to-image variants in the same family.

Happy Horse 1.0 API

Alibaba

Alibaba Happy Horse 1.0 — cinematic 720p / 1080p video with smooth camera movement, expressive motion, and strong prompt fidelity. Includes reference-to-video for consistent character/style identity across generations.

Wan 2.2 API

Alibaba

Alibaba's Wan 2.2 — open-weight video toolkit deployed on WaveSpeedAI with 35+ first-party variants: Animate (120s character animation), Video Edit, Speech-to-Video (10-min audio-driven), Fun-Control (Apache 2.0 licensed), plus image-to-video and text-to-video at multiple model sizes (5B, A14B) and resolutions (480p / 720p).

Qwen Image API

Alibaba

Alibaba Qwen-Image — 20B MMDiT next-gen text-to-image and editing toolkit with bilingual Chinese/English support, multi-image editing, LoRA customization, layered compositing, and a 96-pose camera-angle system.

Seedance 2.5 API

ByteDance

ByteDance Seedance 2.5 API access is coming to WaveSpeedAI with planned 30-second single-shot video generation, support for up to 50 reference files, and more controllable video generation and editing. Track Seedance 2.5 release status here and test the current Seedance 2.0 API family today.

Seedance 2.0 Mini API

ByteDance

ByteDance Seedance 2.0 Mini — the faster, lower-cost tier of Seedance 2.0. Same cinematic multi-shot storytelling, AI camera control, and character consistency with native audio, at 50% of the standard price.

Start building with Wan 2.6 on WaveSpeedAI

Free starter credits on signup. One API key across 1,000+ AI models from Alibaba and every other provider.
