Alibaba WAN 2.6

alibaba/wan-2.6/image-to-video

Alibaba WAN 2.6 converts text or images into 720p or 1080p videos with synced audio, and is positioned as faster and more affordable than Google Veo 3. It is served through a ready-to-use REST inference API with no cold starts.

Your request will cost $0.50 per run (the 720p, 5 s rate in the pricing table).

For $10 you can run this model approximately 20 times at that rate.

Alibaba / WAN 2.6 — Image-to-Video (wan2.6-i2v)

WAN 2.6 Image-to-Video is Alibaba’s latest WanXiang 2.6 image-to-video model. Give it a single image plus a prompt and it generates a 5–15s cinematic clip, with support for multi-shot storytelling and up to 1080p resolution.

🚀 Highlights

  • Multi-shot narrative support – When prompt expansion + multi-shot are enabled, WAN 2.6 can automatically split your idea into several shots and keep key details consistent across them.
  • Longer clips – Generate videos up to 15 seconds, giving more room for story arcs, transitions, and character actions.
  • Flexible resolutions – Two quality tiers, 720p and 1080p, matching Alibaba’s official 2.6 spec.
  • Image-driven look – Uses your input frame as the visual anchor, then animates it according to your prompt.
  • Prompt-aware framing – The model balances your reference image and text description to keep identities, outfits, and overall scene coherent.

🧩 Parameters

  • image* – Required. The keyframe or base image to animate (URL or upload).

  • audio (optional) – Reserved field; can be used for advanced workflows that align motion with an external audio track. For normal use you can leave this empty.

  • prompt* – Describe the motion, story beats, camera moves, and style.

  • negative_prompt – Things to avoid (e.g. “watermark, text, distortion, extra limbs”).

  • resolution – One of:

    • 720p
    • 1080p
  • duration – One of 5s, 10s, 15s.

  • shot_type

    • single → single-shot clip.
    • multi → when prompt expansion is on, the model can break your prompt into multiple shots for a richer narrative.
  • enable_prompt_expansion – If enabled, WAN 2.6 will expand shorter prompts into a more detailed internal script before generating.

  • seed – Controls reproducibility. Set to -1 for a random seed, or to any other integer to lock the layout and motion pattern so that repeated runs give the same result.

Output: an MP4 video at the chosen resolution tier.
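
The parameter list above can be sketched as a small validation helper that assembles a request body. This is a minimal illustration, not the service's official client: the function name `build_payload`, the default values, and the value types (duration as an integer number of seconds, a boolean for `enable_prompt_expansion`) are assumptions. Only the field names and allowed values come from this README. The reserved `audio` field is left out.

```python
ALLOWED_RESOLUTIONS = {"720p", "1080p"}
ALLOWED_DURATIONS = {5, 10, 15}           # seconds, per the parameter list
ALLOWED_SHOT_TYPES = {"single", "multi"}


def build_payload(image, prompt, *, negative_prompt="", resolution="720p",
                  duration=5, shot_type="single",
                  enable_prompt_expansion=False, seed=-1):
    """Validate the documented parameters and return them as a dict.

    Defaults and value types are assumptions made for this sketch.
    """
    if not image:
        raise ValueError("image is required")
    if not prompt:
        raise ValueError("prompt is required")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {sorted(ALLOWED_RESOLUTIONS)}")
    if duration not in ALLOWED_DURATIONS:
        raise ValueError(f"duration must be one of {sorted(ALLOWED_DURATIONS)}")
    if shot_type not in ALLOWED_SHOT_TYPES:
        raise ValueError(f"shot_type must be one of {sorted(ALLOWED_SHOT_TYPES)}")

    payload = {
        "image": image,
        "prompt": prompt,
        "resolution": resolution,
        "duration": duration,
        "shot_type": shot_type,
        "enable_prompt_expansion": enable_prompt_expansion,
        "seed": seed,  # -1 means "pick a random seed"
    }
    if negative_prompt:
        payload["negative_prompt"] = negative_prompt
    return payload


if __name__ == "__main__":
    print(build_payload(
        "https://example.com/frame.png",  # placeholder image URL
        "Camera slowly dollies in, light rain, cinematic grade",
        negative_prompt="watermark, text, distortion",
        resolution="1080p",
        duration=10,
    ))
```

Validating locally before submitting avoids paying for a run that would be rejected for an unsupported resolution or duration.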

💰 Pricing

Resolution   5 s     10 s    15 s
720p         $0.50   $1.00   $1.50
1080p        $0.75   $1.50   $2.25

  • 720p: $0.10 / s
  • 1080p: $0.15 / s
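
The pricing is linear in clip length, so the cost of a run is the per-second rate times the duration. The sketch below works in integer cents to avoid floating-point rounding. The function names are illustrative, not part of any API.

```python
# Per-second rates from the pricing list, in cents.
RATE_CENTS_PER_SECOND = {"720p": 10, "1080p": 15}


def estimate_cost_cents(resolution, duration_s):
    """Return the price of one run in cents."""
    return RATE_CENTS_PER_SECOND[resolution] * duration_s


def runs_for_budget(budget_cents, resolution, duration_s):
    """Return how many whole runs fit into a budget given in cents."""
    return budget_cents // estimate_cost_cents(resolution, duration_s)


if __name__ == "__main__":
    print(estimate_cost_cents("1080p", 15) / 100)  # 2.25 (dollars)
    print(runs_for_budget(1000, "720p", 5))        # 20 runs for $10
```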

✅ How to Use

  1. Upload your image under image (clear subject, good lighting works best).

  2. Write a prompt describing:

    • what moves (character, camera, environment),
    • overall mood and style (e.g., “cinematic, soft lighting, shallow depth of field”).
  3. (Optional) Turn on enable_prompt_expansion if your prompt is short and you want the model to elaborate it.

  4. (Optional) Set shot_type to multi, with enable_prompt_expansion turned on, to let WAN 2.6 build a multi-shot sequence instead of a single continuous shot.

  5. Choose resolution (720p / 1080p) and duration (5 / 10 / 15 s).

  6. Set seed if you want repeatable results, otherwise leave -1 for variation.

  7. Click Run and download your clip once it finishes.
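
The same steps can be driven through the REST API instead of the web form. The sketch below only constructs the HTTP request and does not send it. The endpoint URL is a placeholder, and the Bearer-token header and JSON body layout are assumptions, since this README does not document them. Check the provider's API reference for the real values.

```python
import json
import urllib.request

# ASSUMPTION: placeholder endpoint; the real URL is not given in this README.
API_URL = "https://api.example.com/alibaba/wan-2.6/image-to-video"


def build_request(api_key, payload, url=API_URL):
    """Build a JSON POST request for a generation job (does not send it)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            # ASSUMPTION: Bearer-token authentication.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_request("YOUR_API_KEY", {
        "image": "https://example.com/frame.png",  # placeholder image URL
        "prompt": "Character turns to look at the city, neon lights flicker",
        "resolution": "720p",
        "duration": 5,
        "seed": -1,
    })
    print(req.get_method(), req.full_url)
    # To actually submit: urllib.request.urlopen(req)
```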

💡 Prompt Tips

  • Start with the image content, then add motion: “Camera slowly dolly-in, character turns to look at the city, neon lights flicker, light rain, cinematic grade.”
  • For multi-shot stories, hint at structure: “Shot 1: wide city skyline at night; Shot 2: medium shot of the hero on the rooftop; Shot 3: close-up as they smile.”
  • Keep negative prompts short and focused; don’t overload them with long prose.
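
The "Shot 1 / Shot 2 / Shot 3" structure from the tips above is easy to generate from a list of shot descriptions. This is a small convenience sketch. The helper name `compose_multishot_prompt` is hypothetical, and the model does not require this exact format.

```python
def compose_multishot_prompt(shots):
    """Join shot descriptions into a 'Shot 1: ...; Shot 2: ...' prompt."""
    # Drop empty entries and strip stray trailing punctuation.
    cleaned = [s.strip().rstrip(".;") for s in shots if s.strip()]
    if not cleaned:
        raise ValueError("at least one shot description is required")
    numbered = (f"Shot {i}: {s}" for i, s in enumerate(cleaned, start=1))
    return "; ".join(numbered) + "."


if __name__ == "__main__":
    print(compose_multishot_prompt([
        "wide city skyline at night",
        "medium shot of the hero on the rooftop",
        "close-up as they smile",
    ]))
```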

More Models to Try

  • kwaivgi/kling-video-o1/image-to-video: High-quality AI image-to-video generator from Kwaivgi, ideal for cinematic character shots, smooth camera motion, and social-ready short clips.

  • alibaba/wan-2.5/image-to-video: Alibaba’s WAN 2.5 image-to-video model, designed for fast, coherent animation of still images into ads, product demos, and story-style videos.

  • openai/sora-2/image-to-video: OpenAI Sora 2, a cutting-edge AI video generator that turns images into long, detailed, physics-aware scenes for filmic concepts and high-end content.

  • google/veo3.1/image-to-video: Google Veo 3.1 image-to-video, optimized for crisp, cinematic motion and clean compositions, perfect for marketing visuals, trailers, and creative storytelling.