Alibaba Wan 2.6 Text To Video
Try it on WavespeedAI! Alibaba WAN 2.6 Text-to-Video turns plain prompts into coherent, cinematic clips with crisp detail, stable motion, and strong instruction following, making it a good fit for ads, explainers, and social posts. It ships as a ready-to-use REST inference API with no cold starts and affordable pricing.
Features
Alibaba / WAN 2.6 — Text-to-Video (wan2.6-t2v)
WAN 2.6 Text-to-Video is Alibaba’s WanXiang 2.6 model that turns a pure text prompt (optionally with audio) into a 5–15s cinematic clip. It supports multi-shot storytelling, vertical or landscape formats, and resolutions up to 1080p, making it a strong fit for ads, trailers, and social content.
🚀 Highlights
- Prompt-only video generation – No reference image required: describe the scene and WAN 2.6 builds the entire sequence.
- Multi-shot narratives – With prompt expansion enabled and shot_type set to multi, the model can split your idea into several shots while preserving key characters and style.
- 5–15 second clips – Enough room for intros, reveals, and full micro-stories.
- Flexible sizes – Horizontal and vertical presets across **720p / 1080p** tiers.
- Prompt-aware consistency – Keeps identities, outfits, and scene semantics coherent across the whole clip.
🧩 Parameters
- prompt* – Main description of the video: scene, characters, motion, camera moves, style.
- negative_prompt – Things to avoid (e.g. watermark, text, distortion, extra limbs).
- audio (optional) – URL or file of an audio track; reserved for advanced workflows where you want to align motion with existing sound.
- size – Resolution presets:
  - 720p tier: 1280×720 (landscape) or 720×1280 (vertical)
  - 1080p tier: 1920×1080 (landscape) or 1080×1920 (vertical)
- duration – One of 5, 10, or 15 seconds.
- shot_type –
  - single → one continuous shot.
  - multi → combined with enable_prompt_expansion, lets the model create a multi-shot sequence.
- enable_prompt_expansion – If enabled, WAN 2.6 first expands your prompt into an internal, more detailed script before generating.
- seed – Random seed; set to -1 for different results each time, or use a fixed integer for reproducible motion/layout.
Output: an MP4 video at the chosen resolution and orientation.
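As a sketch, the parameters above map onto a JSON request body like the following. Field names and defaults come from the parameter table later on this page; the `build_request` helper itself is a hypothetical convenience, not part of the API.

```python
import json

# Documented defaults: size 1280*720, duration 5, shot_type "single",
# enable_prompt_expansion false, seed -1 (random).
DEFAULTS = {
    "size": "1280*720",
    "duration": 5,
    "shot_type": "single",
    "enable_prompt_expansion": False,
    "seed": -1,
}

def build_request(prompt, **overrides):
    """Merge a required prompt with the documented defaults (hypothetical helper)."""
    body = dict(DEFAULTS)
    body.update(overrides)
    body["prompt"] = prompt
    return body

payload = build_request(
    "A lone biker rides through neon fog, cinematic tracking shot",
    duration=10,
    size="1080*1920",
)
print(json.dumps(payload, indent=2))
```

Anything you do not override keeps its documented default, so a minimal call only needs the prompt.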
💰 Pricing
Pricing depends on duration and resolution tier:
| Resolution | 5 s | 10 s | 15 s |
|---|---|---|---|
| 720p | $0.50 | $1.00 | $1.50 |
| 1080p | $0.75 | $1.50 | $2.25 |
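The table can be read as a simple lookup keyed by resolution tier and duration; the sketch below encodes exactly the prices shown above (the function and tier mapping are illustrative helpers, not an official SDK).

```python
# Price table from the pricing section (USD per clip).
PRICES = {
    ("720p", 5): 0.50, ("720p", 10): 1.00, ("720p", 15): 1.50,
    ("1080p", 5): 0.75, ("1080p", 10): 1.50, ("1080p", 15): 2.25,
}

# Map each size preset to its resolution tier.
TIER = {
    "1280*720": "720p", "720*1280": "720p",
    "1920*1080": "1080p", "1080*1920": "1080p",
}

def price(size: str, duration: int) -> float:
    """Look up the per-clip price for a size preset and duration."""
    return PRICES[(TIER[size], duration)]

print(price("1080*1920", 10))  # → 1.5
```

Note that pricing is linear within a tier: $0.10 per second at 720p and $0.15 per second at 1080p.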
✅ How to Use
- Write your prompt – Describe what happens, who appears, how the camera moves, and the visual style.
- (Optional) Add a negative_prompt to suppress artifacts or unwanted elements.
- (Optional) Provide an audio track if your workflow requires it.
- Choose a size (one of the 720p / 1080p presets, landscape or vertical).
- Set duration to 5 / 10 / 15 seconds.
- Set shot_type to multi and turn on enable_prompt_expansion if you want richer, multi-shot storytelling.
- Set a seed (or leave -1 for variation) and click Run to generate your clip.
💡 Prompt Tips
- Start with a clear setting + subject + action: “Cyberpunk city street at night, rain on the ground, a lone biker rides through neon fog, cinematic camera tracking shot.”
- For multi-shot stories, hint at structure: “Shot 1: wide city skyline at dawn; Shot 2: hero walks across rooftop; Shot 3: close-up as they put on helmet.”
- Keep negative prompts short and focused (e.g. blurry, watermark, extra limbs) instead of full sentences.
- Match size to platform: vertical (720×1280 / 1080×1920) for Shorts/Reels/TikTok, landscape for YouTube and web.
More Models to Try
- kwaivgi/kling-video-o1/text-to-video – Kwaivgi’s cinematic text-to-video model, great for character-driven scenes, smooth camera moves, and short-form storytelling.
- alibaba/wan-2.5/text-to-video – Alibaba’s WAN 2.5 prompt-to-video engine, focused on fast, coherent ads, explainers, and product demos.
- google/veo3.1/text-to-video – Google Veo 3.1 text-to-video, tuned for crisp compositions, filmic motion, and marketing-ready visuals.
- openai/sora-2/text-to-video – OpenAI Sora 2, a high-end text-to-video generator for long, detailed, physics-aware scenes and premium creative content.
Authentication
For authentication details, please refer to the Authentication Guide.
API Endpoints
Submit Task & Query Result
```shell
# Submit the task
curl --location --request POST "https://api.wavespeed.ai/api/v3/alibaba/wan-2.6/text-to-video" \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
  --data-raw '{
    "prompt": "Cyberpunk city street at night, rain on the ground, a lone biker rides through neon fog, cinematic camera tracking shot",
    "size": "1280*720",
    "duration": 5,
    "shot_type": "single",
    "enable_prompt_expansion": false,
    "seed": -1
  }'

# Get the result
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
  --header "Authorization: Bearer ${WAVESPEED_API_KEY}"
```
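The submit-then-poll flow above can be sketched in Python. The polling loop below takes the result-fetching call as a parameter (e.g. a `requests.get(...).json()` closure in real use) so the logic is self-contained; the retry cadence and timeout are assumptions, while the status values and response shape follow the tables below.

```python
import time

def poll_until_done(get_result, interval=2.0, timeout=300.0, sleep=time.sleep):
    """Poll the /predictions/{id}/result endpoint until the task finishes.

    `get_result` is any callable returning the parsed JSON response.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        data = get_result()["data"]
        if data["status"] == "completed":
            return data["outputs"]           # list of generated video URLs
        if data["status"] == "failed":
            raise RuntimeError(data.get("error") or "generation failed")
        sleep(interval)                      # still "created" or "processing"
    raise TimeoutError("task did not finish in time")

# Stubbed walk-through of the documented lifecycle: created -> processing -> completed.
responses = iter([
    {"data": {"status": "created", "outputs": []}},
    {"data": {"status": "processing", "outputs": []}},
    {"data": {"status": "completed", "outputs": ["https://example.com/clip.mp4"]}},
])
urls = poll_until_done(lambda: next(responses), sleep=lambda s: None)
print(urls)  # → ['https://example.com/clip.mp4']
```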
Parameters
Task Submission Parameters
Request Parameters
| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
| prompt | string | Yes | - | - | The positive prompt for the generation. |
| negative_prompt | string | No | - | - | The negative prompt for the generation. |
| audio | string | No | - | - | Audio URL to guide generation (optional). |
| size | string | No | 1280*720 | 1280*720, 720*1280, 1920*1080, 1080*1920 | The size of the generated media in pixels (width*height). |
| duration | integer | No | 5 | 5, 10, 15 | The duration of the generated media in seconds. |
| shot_type | string | No | single | single, multi | The type of shots to generate. |
| enable_prompt_expansion | boolean | No | false | - | If set to true, the prompt optimizer will be enabled. |
| seed | integer | No | -1 | -1 ~ 2147483647 | The random seed to use for the generation. -1 means a random seed will be used. |
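Before submitting, a client can check a body against the ranges in the table above. The validator below encodes exactly those constraints; the function itself is an illustrative sketch, not server-side behavior.

```python
# Allowed values taken from the request-parameter table.
SIZES = {"1280*720", "720*1280", "1920*1080", "1080*1920"}
DURATIONS = {5, 10, 15}
SHOT_TYPES = {"single", "multi"}

def validate(body: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    if not body.get("prompt"):
        errors.append("prompt is required")
    if body.get("size", "1280*720") not in SIZES:
        errors.append("size must be one of the four documented presets")
    if body.get("duration", 5) not in DURATIONS:
        errors.append("duration must be 5, 10, or 15")
    if body.get("shot_type", "single") not in SHOT_TYPES:
        errors.append("shot_type must be 'single' or 'multi'")
    if not (-1 <= body.get("seed", -1) <= 2147483647):
        errors.append("seed must be -1 or in [0, 2147483647]")
    return errors

print(validate({"prompt": "a cat", "duration": 7}))  # → ['duration must be 5, 10, or 15']
```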
Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data.id | string | Unique identifier for the prediction (the task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
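A submission response is typically consumed by pulling out `data.urls.get` for polling. The sketch below follows the field table above; the sample JSON is illustrative, not captured from the live API.

```python
# Illustrative submit response shaped per the table above.
sample = {
    "code": 200,
    "message": "success",
    "data": {
        "id": "abc123",
        "model": "alibaba/wan-2.6/text-to-video",
        "outputs": [],
        "urls": {"get": "https://api.wavespeed.ai/api/v3/predictions/abc123/result"},
        "status": "created",
        "error": "",
    },
}

def extract_poll_url(response: dict) -> str:
    """Pull the result-polling URL out of a successful submission."""
    if response["code"] != 200:
        raise RuntimeError(response["message"])
    return response["data"]["urls"]["get"]

poll_url = extract_poll_url(sample)
print(poll_url)
```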
Result Request Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| id | string | Yes | - | Task ID |
Result Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data | object | The prediction data object containing all details |
| data.id | string | Unique identifier for the prediction (the task ID supplied in the request) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |