Wan 2.6: 15-second AI video with cinematic coherence and flawless lip sync.

Alibaba's next-generation video model: smarter prompts, enhanced audio sync, and unmatched character consistency.

Try it

Text to Video

Image to Video

Reference to Video

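Key features

Multi-shot narrative generation

Most open-source video models generate a single continuous clip that often lacks structure and consistency. WAN 2.6 introduces a major breakthrough: it generates multi-shot narratives directly from a simple prompt.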

Prompt

The scene unfolds in first-person POV inside a bright, refined modern kitchen. Natural daylight pours across walnut flooring and matte gray cabinetry, giving the space a calm and polished atmosphere. The viewer takes three to four slow, steady steps forward while holding an empty celadon-green porcelain bowl with both black-gloved hands. Ahead stands a built-in double-door refrigerator. The left door features a softly glowing dispenser slot, with faint vapor curling from its edges. When the viewer reaches the refrigerator and lifts the bowl beneath the outlet, a gentle mechanical hum begins. From the small dispenser opening, the plating sequence unfolds with precise, almost ritualistic elegance. First, a smooth stream of deep orange lobster bisque flows into the bowl, circling and rippling as it settles. Moments later, tender pieces of lobster claw and tail meat descend into the center, their pink-red surfaces glistening in the hot broth. A thin ribbon of cream follows, tracing a delicate spiral across the bisque. Finally, micro herbs and tiny gold flakes drift down, completing the dish with a soft visual flourish. The celadon glaze of the bowl reflects the bright natural light, while the warm tones of the bisque shimmer gently on the surface. Subtle sounds fill the space: soft footsteps on the wooden floor, the quiet friction of gloves against the bowl, the rising hum of the refrigerator, the thick pour of bisque hitting the ceramic, the gentle plop of lobster pieces, the light drizzle of cream, and the faint sprinkle of herbs and flakes. Altogether, the moment blends mechanical precision with the warmth and intimacy of fine dining, presented through the calm rhythm of first-person ASMR realism.

Final outcome

Reference-based video generation

WAN 2.6 supports video reference generation, letting users guide the model with an input video.

Prompt

character1 is eating dinner with character2 in a restaurant

Final outcome

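Long video generation up to 15 seconds

Many open-source models can only generate very short videos (typically just 2 to 5 seconds), which limits narrative depth. WAN 2.6 breaks this barrier by supporting videos of up to 15 seconds.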

Prompt

Generate an approximately 15-second cohesive narrative video. Story: A medieval knight awakens on a storm-swept meadow after a fierce battle. First 5 seconds: A slow circling shot reveals his mud-covered armor, scattered debris, and lingering flashes of lightning in the dark sky. Middle 5 seconds: The knight rises, grasping a sword embedded in the ground. The camera pulls upward from a low angle, emphasizing the determination in his eyes. Final 5 seconds: He begins running toward a distant ruined castle wall as the camera follows in a handheld-style tracking motion, tall grass brushing past the lens to create dynamic depth of field. Maintain scene continuity, natural body motion, and cinematic epic atmosphere throughout.

Final outcome
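
The 15-second prompt above follows a simple pattern: a target duration, a one-line story, one description per time segment, and a closing continuity instruction. As a minimal sketch of that pattern, the Python below assembles such a prompt from structured parts. The names `Shot` and `build_prompt` are hypothetical helpers invented for illustration; they are not part of any Wan 2.6 or WaveSpeedAI API, and the code only builds a text string.

```python
from dataclasses import dataclass
from typing import Sequence

MAX_SECONDS = 15  # maximum clip length stated on this page


@dataclass
class Shot:
    """One time segment of the narrative (hypothetical helper type)."""
    label: str        # e.g. "First", "Middle", "Final"
    seconds: int      # length of this segment in seconds
    description: str  # what happens and how the camera moves


def build_prompt(story: str, shots: Sequence[Shot], closing: str) -> str:
    """Join the story, the timed segments, and a closing instruction into one prompt."""
    total = sum(shot.seconds for shot in shots)
    if total > MAX_SECONDS:
        raise ValueError(f"total duration {total}s exceeds {MAX_SECONDS}s")
    parts = [
        f"Generate an approximately {total}-second cohesive narrative video.",
        f"Story: {story}",
    ]
    for shot in shots:
        parts.append(f"{shot.label} {shot.seconds} seconds: {shot.description}")
    parts.append(closing)
    return " ".join(parts)


prompt = build_prompt(
    story="A medieval knight awakens on a storm-swept meadow after a fierce battle.",
    shots=[
        Shot("First", 5, "A slow circling shot reveals his mud-covered armor."),
        Shot("Middle", 5, "The knight rises, grasping a sword embedded in the ground."),
        Shot("Final", 5, "He begins running toward a distant ruined castle wall."),
    ],
    closing="Maintain scene continuity and natural body motion throughout.",
)
print(prompt)
```

Keeping each segment as its own record makes it easy to reorder shots or adjust durations while checking that the total stays within the 15-second limit.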

Articles about Wan 2.6

Unlocking Next-Gen Video Creation with Alibaba WAN 2.6 on WaveSpeedAI

As AI video generation continues to evolve at a rapid pace, Alibaba’s WAN 2.6 model stands out as one of the most advanced open-source solutions available today. Now launched on WaveSpeedAI, WAN 2.6 empowers creators with stronger storytelling abilities, smarter reference-driven generation, and longer, more expressive outputs.

Q & A

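Which input formats are supported?

Common video formats (e.g., MP4/MOV) are supported. For best results, use a clear, front-facing subject under stable lighting.

Does it preserve identity and background?

It prioritizes identity consistency and scene coherence while applying the requested facial and lip movements.

Can I control emotion and speaking style?

Yes. Through prompts or reference audio, you can guide intensity (calm/neutral/energetic), tempo, and strength of expression.

Can it handle multiple people or faces in one frame?

It works best when the speaking subject is clearly and consistently visible. Note: crowded scenes or frequent occlusion can cause drift; consider cropping to, or focusing on, the target face.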