
WAN 2.6


WAN 2.6 Reference-to-Video turns character, prop, or scene references—single or multi-view—into new video shots with preserved identity, style, and layout plus smooth, coherent motion. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.


README

WAN 2.6 — Reference-to-Video (wan2.6-ref2v)

WAN 2.6 Reference-to-Video is Alibaba's WanXiang 2.6 model for turning example videos plus a text prompt into new shots. Provide up to two reference clips and the model learns their style, motion, and framing, then generates a new 5–10 s video at up to 1080p.

🚀 Highlights

  • Reference-driven motion & style – Mimic camera moves, pacing and composition from your reference videos while following your prompt.
  • Up to two reference videos – Blend style from one clip and motion from another, or use different angles of the same scene.
  • Cinematic resolutions – Choose 720p or 1080p, in portrait or landscape orientation.
  • Story-aware generation – Works with prompt expansion and multishots to build richer, multi-shot sequences.
  • Audio-ready pipeline – Optional audio field for workflows that need motion aligned to external sound.

Output format: MP4 video at the selected size and duration.

🧩 Parameters

  • prompt* Text description of the new scene: characters, actions, environment, camera motion, mood, style, etc.

  • videos* 1–2 reference clips (URLs or uploads). These guide style, camera work, pacing, and motion structure.

  • negative_prompt Things to avoid, e.g. watermark, text, distortion, extra limbs.

  • audio (optional) External audio track for advanced pipelines where timing should loosely follow a given soundtrack. For most use cases you can leave this empty.

  • size One of the following resolution presets:
    • 1280×720 or 720×1280 → 720p
    • 1920×1080 or 1080×1920 → 1080p

  • duration Video length: 5 s or 10 s.

  • shot_type
    • single – Single-shot clip.
    • multi – When combined with enable_prompt_expansion, WAN 2.6 can break your idea into multiple shots of the same scene.

  • enable_prompt_expansion If set to true, the prompt optimizer expands short prompts into a richer internal script before generation.

  • seed Random seed. Set -1 for a new random result each time, or fix to a specific integer for reproducible layout and motion.
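The parameters above can be assembled into a request body and sanity-checked before submission. This is a minimal sketch: the field names follow the parameter list on this page, but the exact wire format (including the `1280*720`-style size strings) is an assumption, not a documented schema.

```python
# Sketch: assemble and sanity-check a wan2.6-ref2v request body.
# Field names follow the parameter list above; the "W*H" size string
# format is an assumption about the wire format.

SIZES = {"1280*720", "720*1280", "1920*1080", "1080*1920"}
DURATIONS = {5, 10}

def build_payload(prompt, videos, size="1280*720", duration=5,
                  negative_prompt="", shot_type="single",
                  enable_prompt_expansion=False, seed=-1):
    if not prompt:
        raise ValueError("prompt is required")
    if not 1 <= len(videos) <= 2:
        raise ValueError("provide 1-2 reference videos")
    if size not in SIZES:
        raise ValueError(f"unsupported size: {size}")
    if duration not in DURATIONS:
        raise ValueError("duration must be 5 or 10 seconds")
    if shot_type not in {"single", "multi"}:
        raise ValueError("shot_type must be 'single' or 'multi'")
    return {
        "prompt": prompt,
        "videos": list(videos),
        "negative_prompt": negative_prompt,
        "size": size,
        "duration": duration,
        "shot_type": shot_type,
        "enable_prompt_expansion": enable_prompt_expansion,
        "seed": seed,
    }
```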

💰 Pricing

Resolution | Sizes (W×H)           | 5 s   | 10 s
720p       | 1280×720 / 720×1280   | $1.00 | $1.50
1080p      | 1920×1080 / 1080×1920 | $1.50 | $2.25
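For batch planning, the table above maps directly to a small lookup. The size-string format is an illustrative assumption; the dollar amounts are taken from the pricing table.

```python
# Per-run pricing from the table above: (tier, duration in seconds) -> USD.
PRICING = {
    ("720p", 5): 1.00, ("720p", 10): 1.50,
    ("1080p", 5): 1.50, ("1080p", 10): 2.25,
}
# Map each size preset to its resolution tier ("W*H" format assumed).
TIER = {
    "1280*720": "720p", "720*1280": "720p",
    "1920*1080": "1080p", "1080*1920": "1080p",
}

def estimate_cost(size: str, duration: int) -> float:
    """Return the cost in USD of one run at the given size and duration."""
    return PRICING[(TIER[size], duration)]
```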

✅ How to Use

  1. Prepare 1–2 reference videos
     • Clean motion, stable framing, and clear style work best.
     • You can use two angles of the same scene, or two stylistically similar clips.
  2. Write your prompt
     • Describe what should happen in the new video, not just what’s in the references.
     • Example: “Cyberpunk alley at night, hero walking toward camera, slow dolly-in, neon reflections on wet ground, cinematic color grading.”
  3. (Optional) Add a negative_prompt
     • Keep it short and focused: watermark, text, logo, extra limbs, low resolution.
  4. Choose size and duration
     • Pick 720p or 1080p according to your platform (Reels, TikTok, YouTube, etc.).
     • Use 5 s for quick shots, 10 s for more complex actions.
  5. Configure multi-shot output and prompt expansion
     • Turn on enable_prompt_expansion for shorter prompts.
     • Set shot_type to multi if you want WAN 2.6 to create a multi-shot sequence.
  6. Set seed (optional)
     • Use a fixed seed to iterate while keeping composition similar.
  7. Run the model and download the generated clip.
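The last step, submitting the job over HTTP, can be sketched with the standard library. The endpoint URL, auth header, and response shape (an `"id"` field) are assumptions; consult the hosting platform's API reference for the real values.

```python
import json
import urllib.request

# Hypothetical endpoint -- replace with the platform's real URL.
API_URL = "https://api.example.com/v1/alibaba/wan-2.6/reference-to-video"

def submit_job(payload, api_key, opener=urllib.request.urlopen):
    """POST a generation request and return the job id from the response.

    `opener` is injectable so the function can be exercised without a
    live network connection.
    """
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with opener(req) as resp:
        return json.loads(resp.read())["id"]
```

A typical flow would build the payload from the parameters above, call `submit_job`, then poll a status endpoint until the MP4 URL is ready.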

💡 Prompt & Reference Tips

  • Keep reference content and prompt aligned – if references show a city night scene, avoid asking for a sunny beach.

  • Use two references when you want to mix:
    • video A’s camera & motion + video B’s lighting/style.

  • Mention where you want the model to follow reference closely, e.g.: “Follow reference camera speed and angles, but change character outfit to futuristic armor.”

  • For portrait/vertical social content, select 720×1280 or 1080×1920; for YouTube-style landscape, use 1280×720 or 1920×1080.
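The "follow X, but change Y" pattern from the tips above can be wrapped in a tiny helper so iterations stay consistent. The phrasing here is illustrative, not a required prompt syntax.

```python
# Sketch: build "follow the reference, but change one thing" prompts.
# The wording pattern is illustrative, not a documented syntax.

def mix_prompt(change: str,
               follow: str = "reference camera speed and angles") -> str:
    """Combine a 'follow the reference' directive with one change."""
    return f"Follow {follow}, but {change}."

prompt = mix_prompt("change the character outfit to futuristic armor")
negative_prompt = "watermark, text, logo, extra limbs, low resolution"
```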

More Models to Try

  • vidu/reference-to-video-q2 Vidu’s Q2 reference-to-video model for turning style and motion from example clips into new shots, ideal for anime-style edits, trailers, and storyboards.

  • google/veo3.1/reference-to-video Google Veo 3.1 reference-conditioned video generator, designed for high-fidelity cinematic motion that closely follows your reference footage.

  • kwaivgi/kling-video-o1/reference-to-video Kwaivgi’s Kling Video O1 reference-to-video model, great for copying camera language and pacing from a sample clip while changing characters or scenes.

  • /seedance-v1-lite/reference-to-video SeeDance v1 Lite, a lightweight reference-to-video model for fast, style-consistent generations based on short example videos.