Wan 2.7 API
Alibaba Tongyi Lab Wan 2.7 — full-modality input (text, image, video, and audio references combined in a single generation), open weights, native 1080p output.
About the Wan 2.7 API
What Wan 2.7 does, how it fits in the Alibaba model lineup, and why teams reach for it.
Wan 2.7 is a video generation model from Alibaba's Tongyi Lab, available through the WaveSpeedAI REST API. It accepts text, image, video, and audio references combined in a single generation, ships with open weights, and produces native 1080p output.
The Wan 2.7 family on WaveSpeedAI ships 9 REST endpoints covering Text-To-Image, Video-Extend, Image-To-Video, Text-To-Video, Image-To-Image, and Video-To-Video workflows. Each variant carries its own pricing, parameters, and example outputs. Pick the one that matches your input modality and production constraints, or call several from the same API key to compose multi-step pipelines.
Run Wan 2.7 through the same API key, billing account, and rate-limit envelope you use for the other 1,000+ AI models on WaveSpeedAI. No separate vendor setup, no per-provider SDKs, no per-vendor rate-limit envelopes — one integration covers everything from text-to-image and text-to-video through audio synthesis, 3D generation, upscaling, and editing.
All Wan 2.7 API endpoints
9 endpoints available now on WaveSpeedAI — pick the variant that matches your workflow.

Text To Image
WAN 2.7 Text-to-Image generates high-quality images from text prompts with thinking mode for enhanced image quality. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

Text To Image Pro
WAN 2.7 Text-to-Image Pro generates high-quality images up to 4K from text prompts with thinking mode for enhanced image quality. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

Video Extend
WAN 2.7 Video Extend extends existing videos with optional last frame control and audio support, supporting 720p/1080p output. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

Reference To Video
WAN 2.7 Reference-to-Video turns character, prop, or scene references from images or videos into new video shots with preserved identity, style, and layout plus smooth, coherent motion. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

Text To Video
WAN 2.7 Text-to-Video turns plain prompts into coherent, cinematic clips with crisp detail, stable motion, and strong instruction-following—great for ads, explainers, and social posts. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

Image Edit Pro
WAN 2.7 Image Edit Pro performs prompt-driven image editing with multi-image reference support and up to 2K output. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

Video Edit
WAN 2.7 Video Edit performs prompt-driven video editing with multi-image reference support, supporting 720p/1080p output. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

Image Edit
WAN 2.7 Image Edit performs prompt-driven image editing with support for multiple-image references. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

Image To Video
WAN 2.7 Image-to-Video converts images into videos (720p/1080p) with optional audio, supporting first and last frame control. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.
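As a rough sketch of how the nine variants map onto request URLs, the helper below joins a base URL with a variant slug. The slugs are taken from the pricing table on this page; the URL pattern is confirmed here only for text-to-video, so applying the same pattern to the other variants is an assumption, and `endpoint_url` is a hypothetical helper name.

```python
# Base URL and model prefix as they appear in the text-to-video example on this page.
BASE_URL = "https://api.wavespeed.ai/api/v3"
MODEL_PREFIX = "alibaba/wan-2.7"

# Variant slugs copied from the pricing table.
VARIANTS = {
    "text-to-image",
    "text-to-image-pro",
    "video-extend",
    "reference-to-video",
    "text-to-video",
    "image-edit-pro",
    "video-edit",
    "image-edit",
    "image-to-video",
}


def endpoint_url(variant: str) -> str:
    """Return the submit URL for a Wan 2.7 variant slug.

    Assumes every variant follows the same URL pattern as text-to-video.
    """
    if variant not in VARIANTS:
        raise ValueError(f"unknown Wan 2.7 variant: {variant!r}")
    return f"{BASE_URL}/{MODEL_PREFIX}/{variant}"


print(endpoint_url("text-to-video"))
# → https://api.wavespeed.ai/api/v3/alibaba/wan-2.7/text-to-video
```

Rejecting unknown slugs locally catches typos before a request is sent.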
How to use the Wan 2.7 API
Four steps from signup to a finished generation. Full Python, Node.js, and cURL examples are in the API section below.
1. Get an API key
Sign up for a WaveSpeedAI account and copy your API key from the dashboard. New accounts come with free starter credits, enough to run the playground a few dozen times before billing kicks in.
2. Submit a prediction
POST your input as JSON to https://api.wavespeed.ai/api/v3/alibaba/wan-2.7/text-to-video. The endpoint returns a prediction id immediately; generations are asynchronous, so you don't hold an open connection during inference.
3. Poll for completion
GET https://api.wavespeed.ai/api/v3/predictions/{request_id}/result every 1-2 seconds, using the prediction id from step 2 as {request_id}. The response includes a status field; keep polling until it flips from "queued" or "processing" to "completed".
4. Read the output URL
Once status is "completed", read the URL from data.outputs[0]. The URL points to your generated media on the WaveSpeedAI CDN: an image or a video, depending on the Wan 2.7 variant you called.
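The four steps above can be sketched in Python with only the standard library. This is a minimal sketch under stated assumptions, not the official client: the page confirms the two URLs, the bearer-token header, the status values, and `data.outputs[0]`, but the exact location of the prediction id (`data.id`) and of the status field (`data.status`) in the response is assumed. `http_json` and `generate` are hypothetical helper names.

```python
import json
import time
import urllib.request

BASE_URL = "https://api.wavespeed.ai/api/v3"


def http_json(method, url, api_key, body=None):
    """Send a JSON request with a bearer token and decode the JSON reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(url, data=data, method=method, headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    })
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def generate(payload, api_key, request=http_json,
             interval=1.5, timeout=600, sleep=time.sleep):
    """Submit a text-to-video prediction, poll until done, return the output URL."""
    submitted = request(
        "POST", f"{BASE_URL}/alibaba/wan-2.7/text-to-video", api_key, payload)
    request_id = submitted["data"]["id"]  # assumption: id is returned under data.id

    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = request(
            "GET", f"{BASE_URL}/predictions/{request_id}/result", api_key)
        status = result["data"]["status"]  # assumption: status lives under data
        if status == "completed":
            return result["data"]["outputs"][0]
        if status not in ("queued", "processing"):
            # Assumption: any other status is terminal and means failure.
            raise RuntimeError(f"prediction {request_id} ended with status {status!r}")
        sleep(interval)  # the page suggests polling every 1-2 seconds
    raise TimeoutError(f"prediction {request_id} did not complete in {timeout}s")
```

A call would look like `generate(my_input, os.environ["WAVESPEED_API_KEY"])`, where `my_input` is the JSON input for the variant as shown in the playground. The `request` and `sleep` parameters exist so the loop can be exercised with stubs instead of live network calls.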
What you can build with Wan 2.7
Common workflows developers and creators use the Wan 2.7 API for.
Cinematic short films at 1080p
Native 1080p output means generations are usable for delivery without an upscaling pass — clean for editorial, broadcast, and high-res social.
Multilingual content
Strong on both Chinese and English prompts — useful for studios producing content for cross-border markets without re-prompting per language.
Multi-modal composition
Combine text + reference image + reference video + reference audio in a single API call — useful for storyboard fills where the brief includes mood, character, and movement references.
Realistic physics scenes
Wan 2.7 has strong physical-world priors — objects fall, water flows, fabric drapes naturally — making it the right pick for product demos and simulation-style content.
Music videos with audio reference
Audio-conditioned generation — supply a track and a character/scene reference, get a music-aware video with rhythm-matched motion.
Tips for prompting Wan 2.7
Practical advice for getting better outputs from Wan 2.7 — drawn from the patterns that work across video models in production pipelines.
Be specific about camera moves
Mention concrete cinematography vocabulary — orbit, dolly-in, push-in, pan-left, crane shot, handheld follow. Generic prompts produce static or arbitrary camera choices; named camera moves map directly to motion intent in the model's training data and dramatically improve shot quality.
Anchor character identity with reference images
If your prompt depends on a specific person, character, or product, upload a reference image alongside the prompt. Without a reference, identity drifts across frames and across shots — the same character ends up looking like a slightly different person each generation.
Describe lighting and time of day
Lighting cues like 'golden hour, soft warm directional light' or 'overcast diffused light, slate-grey sky' improve quality and consistency far more than vague quality modifiers. Lighting is one of the strongest priors the model conditions on.
Use negative prompts to suppress common failure modes
Useful negatives for video: 'frame flicker, motion blur, watermark, text artifacts, distorted hands, low resolution, jpeg compression'. Negative prompts cost nothing and noticeably reduce the rate of generations you'd otherwise re-roll.
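One way to apply the camera, lighting, and negative-prompt tips consistently is to assemble the input from named parts. The sketch below is illustrative only: the `prompt` and `negative_prompt` field names are assumptions (check the variant's playground for the real parameter names), and `build_input` is a hypothetical helper.

```python
# Negatives listed in the tip above.
DEFAULT_NEGATIVES = (
    "frame flicker", "motion blur", "watermark", "text artifacts",
    "distorted hands", "low resolution", "jpeg compression",
)


def build_input(subject, camera=None, lighting=None, negatives=DEFAULT_NEGATIVES):
    """Compose a prompt from subject, camera move, and lighting cues.

    The returned keys ("prompt", "negative_prompt") are assumed field names.
    """
    parts = [p.strip() for p in (subject, camera, lighting) if p and p.strip()]
    return {
        "prompt": ", ".join(parts),
        "negative_prompt": ", ".join(negatives),
    }


example = build_input(
    "a red vintage car on a coastal road",
    camera="slow dolly-in",
    lighting="golden hour, soft warm directional light",
)
print(example["prompt"])
# → a red vintage car on a coastal road, slow dolly-in, golden hour, soft warm directional light
```

Keeping camera and lighting as separate arguments makes it harder to forget them when prompts are generated in bulk.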
Pick the shortest duration that captures your beat
Most prompts work best at 5-8 seconds. Longer clips amplify temporal inconsistencies (subject morphing, environment drift). If you need a 20-second sequence, generate three 6-8 second clips and edit them together — quality stays higher than one long generation.
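The splitting advice above can be sketched as a small planner that divides a target runtime into the fewest clips of at most 8 seconds, as evenly as possible. `plan_clips` is a hypothetical helper, and whether an endpoint accepts each resulting duration is an assumption to verify in the playground.

```python
import math


def plan_clips(total_seconds: int, max_len: int = 8) -> list[int]:
    """Split a target runtime into the fewest clips no longer than max_len.

    Lengths are spread as evenly as possible, longest first.
    """
    if total_seconds <= 0 or max_len <= 0:
        raise ValueError("total_seconds and max_len must be positive")
    count = math.ceil(total_seconds / max_len)
    base, extra = divmod(total_seconds, count)
    # The first `extra` clips take one additional second each.
    return [base + 1] * extra + [base] * (count - extra)


print(plan_clips(20))  # → [7, 7, 6]
```

For the 20-second example in the tip, this yields three clips of 7, 7, and 6 seconds to generate separately and edit together.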
Match aspect ratio to platform up front
9:16 for TikTok / Reels / Shorts, 16:9 for landscape feeds and YouTube, 1:1 for post grids. Models train slightly differently per aspect ratio — cropping a 16:9 to 9:16 after the fact loses both fidelity and the composition the model intended.
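To make the platform choice explicit in code rather than relying on memory, a lookup like the following can sit in front of request building. The platform keys and the `aspect_ratio_for` helper are hypothetical; only the ratios themselves come from the tip above, and how a ratio is passed to a given endpoint is not covered here.

```python
# Ratios from the tip above; the platform keys are illustrative names.
ASPECT_RATIOS = {
    "tiktok": "9:16",
    "reels": "9:16",
    "shorts": "9:16",
    "youtube": "16:9",
    "landscape": "16:9",
    "grid": "1:1",
}


def aspect_ratio_for(platform: str) -> str:
    """Return the aspect ratio to request for a target platform."""
    key = platform.strip().lower()
    if key not in ASPECT_RATIOS:
        raise ValueError(f"no aspect ratio configured for {platform!r}")
    return ASPECT_RATIOS[key]


print(aspect_ratio_for("TikTok"))  # → 9:16
```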
Wan 2.7 API pricing
Pricing is per-output. The final charge scales with the parameters you set in each variant's playground (resolution, duration, output count, references).
| Endpoint | Type | Starting price |
|---|---|---|
| alibaba/wan-2.7/text-to-image | text-to-image | $0.030 |
| alibaba/wan-2.7/text-to-image-pro | text-to-image | $0.075 |
| alibaba/wan-2.7/video-extend | video-extend | $0.50 |
| alibaba/wan-2.7/reference-to-video | image-to-video | $0.50 |
| alibaba/wan-2.7/text-to-video | text-to-video | $0.50 |
| alibaba/wan-2.7/image-edit-pro | image-to-image | $0.075 |
| alibaba/wan-2.7/video-edit | video-to-video | $0.50 |
| alibaba/wan-2.7/image-edit | image-to-image | $0.030 |
| alibaba/wan-2.7/image-to-video | image-to-video | $0.50 |
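
Because the table lists starting prices only, it can give a lower bound for a multi-step pipeline but not the final charge. The sketch below sums starting prices per planned run; `floor_cost` is a hypothetical helper, and real charges scale with resolution, duration, output count, and references.

```python
from decimal import Decimal

# Starting prices in USD, copied from the table above.
STARTING_PRICE = {
    "alibaba/wan-2.7/text-to-image": Decimal("0.030"),
    "alibaba/wan-2.7/text-to-image-pro": Decimal("0.075"),
    "alibaba/wan-2.7/video-extend": Decimal("0.50"),
    "alibaba/wan-2.7/reference-to-video": Decimal("0.50"),
    "alibaba/wan-2.7/text-to-video": Decimal("0.50"),
    "alibaba/wan-2.7/image-edit-pro": Decimal("0.075"),
    "alibaba/wan-2.7/video-edit": Decimal("0.50"),
    "alibaba/wan-2.7/image-edit": Decimal("0.030"),
    "alibaba/wan-2.7/image-to-video": Decimal("0.50"),
}


def floor_cost(runs: dict[str, int]) -> Decimal:
    """Lower-bound cost for a plan mapping endpoint name to number of runs."""
    return sum(
        (STARTING_PRICE[endpoint] * count for endpoint, count in runs.items()),
        Decimal("0"),
    )


# Four stills, each animated once: 4 * 0.030 + 4 * 0.50
plan = {"alibaba/wan-2.7/text-to-image": 4, "alibaba/wan-2.7/image-to-video": 4}
print(floor_cost(plan))  # → 2.120
```

`Decimal` avoids the rounding drift that binary floats introduce when small prices are multiplied and summed.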
Call the Wan 2.7 API
Sign up for an API key at wavespeed.ai/accesskey, then submit a prediction via REST. The playground generates ready-to-paste samples for any combination of inputs.
HTTP example
# 1. Submit a prediction
curl -X POST "https://api.wavespeed.ai/api/v3/alibaba/wan-2.7/text-to-video" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $WAVESPEED_API_KEY" \
-d '{}'
# 2. Poll the result until status = "completed"
curl -X GET "https://api.wavespeed.ai/api/v3/predictions/{request_id}/result" \
-H "Authorization: Bearer $WAVESPEED_API_KEY"
# Read the output URL from data.outputs[0].
Node.js example
// npm install wavespeed
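const WaveSpeed = require('wavespeed');
const client = new WaveSpeed(); // reads WAVESPEED_API_KEY
// await is not allowed at the top level of a CommonJS module, so wrap the call
async function main() {
  const result = await client.run("alibaba/wan-2.7/text-to-video", {});
  console.log(result.outputs[0]); // → URL of the generated output
}
main().catch(console.error);
Python example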
# pip install wavespeed
import wavespeed
output = wavespeed.run(
    "alibaba/wan-2.7/text-to-video",
    {}
)
print(output["outputs"][0])  # → URL of the generated output
Wan 2.7 vs alternatives
When to pick Wan 2.7 over similar models on WaveSpeedAI.
Wan 2.7 vs Seedance 2.0
Seedance 2.0 has stronger camera-motion language and native audio synthesis. Wan 2.7 wins on multi-modal input flexibility (especially audio reference) and 1080p resolution without upscaling.
Wan 2.7 vs Kling 3.0
Kling 3.0 generates longer takes (up to 30s) with strong character identity. Wan 2.7 offers multi-modal input, open weights, and native 1080p output.
Wan 2.7 vs Wan 2.2
Wan 2.7 is the newer architecture, with native 1080p output and multi-modal input. Wan 2.2 (WaveSpeedAI-tuned variants) wins on specialized features: animate, video-edit, speech-to-video, LoRA training.
Wan 2.7 API — Frequently asked questions
Pricing, license, integration — common questions about running Wan 2.7 on WaveSpeedAI.
What is the Wan 2.7 API?
Wan 2.7 is an Alibaba video generation model exposed as a REST API on WaveSpeedAI. Built by Alibaba's Tongyi Lab, it accepts text, image, video, and audio references combined in a single generation, ships with open weights, and produces native 1080p output. You can call it programmatically or try it from the playground.
How do I call the Wan 2.7 API?
Sign up for a WaveSpeedAI account, copy your API key from wavespeed.ai/accesskey, then POST to https://api.wavespeed.ai/api/v3/alibaba/wan-2.7/text-to-video with your input as JSON. The endpoint returns a prediction id; poll https://api.wavespeed.ai/api/v3/predictions/{request_id}/result until status flips to "completed", then read the output URL from data.outputs[0]. Full Python / Node.js / cURL examples are above.
How much does the Wan 2.7 API cost?
Wan 2.7 starts at $0.030 per run for the standard image endpoints, $0.075 for the Pro image endpoints, and $0.50 for the video endpoints. The exact cost scales with the parameters you set (resolution, duration, output count, references). The live cost preview next to the Generate button in the playground shows the exact price for your current input.
Which Wan 2.7 variants are available?
WaveSpeedAI hosts 9 Wan 2.7 endpoints: alibaba/wan-2.7/text-to-image, alibaba/wan-2.7/text-to-image-pro, alibaba/wan-2.7/video-extend, alibaba/wan-2.7/reference-to-video, alibaba/wan-2.7/text-to-video, alibaba/wan-2.7/image-edit-pro, alibaba/wan-2.7/video-edit, alibaba/wan-2.7/image-edit, and alibaba/wan-2.7/image-to-video. Each variant has its own playground page and pricing.
Can I use Wan 2.7 outputs commercially?
Commercial usage rights follow the Alibaba model license. Most Alibaba models permit commercial output use; see each model's playground page for the specific license summary, and WaveSpeedAI's Terms of Service for platform-level conditions.
Why use Wan 2.7 on WaveSpeedAI instead of going direct?
One API key and one billing account cover Wan 2.7 and 1,000+ other AI models from other providers. No per-vendor SDK setup, no separate rate-limit envelopes, no rewrite-per-vendor integration code. Pricing is typically at parity with or below Alibaba's direct API.
About Alibaba
The team behind Wan 2.7 and the broader Alibaba model lineup on WaveSpeedAI.
Alibaba's Tongyi Lab produces the Wan family of video models and the Qwen family of LLMs. Wan is notable for being released with open weights, full-modality input support (text, image, video, and audio references in a single generation), and consistent strength on physics, motion, and multilingual prompts.
Start building with Wan 2.7 on WaveSpeedAI
Free starter credits on signup. One API key across 1,000+ AI models from Alibaba and every other provider.