Wan 2.6: 15-second AI video with cinematic consistency and perfect lip sync.

Alibaba's next-generation video model: smarter prompting, improved audio sync, and unmatched character consistency.

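Key Features

Multi-Shot Narrative Generation

Most open-source video models generate a single continuous clip that often lacks structure or coherence. WAN 2.6 introduces a major breakthrough: it generates multi-shot narratives directly from a simple prompt.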

Prompt

The scene unfolds in first-person POV inside a bright, refined modern kitchen. Natural daylight pours across walnut flooring and matte gray cabinetry, giving the space a calm and polished atmosphere. The viewer takes three to four slow, steady steps forward while holding an empty celadon-green porcelain bowl with both black-gloved hands. Ahead stands a built-in double-door refrigerator. The left door features a softly glowing dispenser slot, with faint vapor curling from its edges. When the viewer reaches the refrigerator and lifts the bowl beneath the outlet, a gentle mechanical hum begins. From the small dispenser opening, the plating sequence unfolds with precise, almost ritualistic elegance. First, a smooth stream of deep orange lobster bisque flows into the bowl, circling and rippling as it settles. Moments later, tender pieces of lobster claw and tail meat descend into the center, their pink-red surfaces glistening in the hot broth. A thin ribbon of cream follows, tracing a delicate spiral across the bisque. Finally, micro herbs and tiny gold flakes drift down, completing the dish with a soft visual flourish. The celadon glaze of the bowl reflects the bright natural light, while the warm tones of the bisque shimmer gently on the surface. Subtle sounds fill the space: soft footsteps on the wooden floor, the quiet friction of gloves against the bowl, the rising hum of the refrigerator, the thick pour of bisque hitting the ceramic, the gentle plop of lobster pieces, the light drizzle of cream, and the faint sprinkle of herbs and flakes. Altogether, the moment blends mechanical precision with the warmth and intimacy of fine dining, presented through the calm rhythm of first-person ASMR realism.

Final outcome

Reference-Based Video Generation

WAN 2.6 supports video reference generation, which lets users guide the model with an input video.

Prompt

character1 is eating dinner with character2 in a restaurant

Final outcome
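
The example prompt above refers to its subjects as character1 and character2. A minimal sketch of a pre-flight check follows, assuming (this page does not confirm it) that the token characterN stands for the N-th reference video supplied with the request. The function names are hypothetical helpers, not part of any Wan 2.6 API.

```python
import re


def referenced_characters(prompt: str) -> list[int]:
    """Return the sorted, de-duplicated indices N of every 'characterN' token."""
    return sorted({int(n) for n in re.findall(r"\bcharacter(\d+)\b", prompt)})


def check_references(prompt: str, reference_count: int) -> None:
    """Raise ValueError if the prompt names a character with no matching reference.

    Assumption: character indices start at 1 and map to references in order.
    """
    unmatched = [n for n in referenced_characters(prompt)
                 if n < 1 or n > reference_count]
    if unmatched:
        names = ", ".join(f"character{n}" for n in unmatched)
        raise ValueError(
            f"{names} used in prompt, but only {reference_count} reference(s) supplied"
        )


prompt = "character1 is eating dinner with character2 in a restaurant"
check_references(prompt, reference_count=2)  # passes: both tokens have a reference
```

Running a check like this before submitting a job can catch a prompt that mentions more characters than there are reference videos.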

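15-Second Long Video Generation

Many open-source models are limited to very short videos (typically only 2-5 seconds), which restricts narrative depth. WAN 2.6 breaks this barrier by supporting videos up to 15 seconds long.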

Prompt

Generate an approximately 15-second cohesive narrative video. Story: A medieval knight awakens on a storm-swept meadow after a fierce battle. First 5 seconds: A slow circling shot reveals his mud-covered armor, scattered debris, and lingering flashes of lightning in the dark sky. Middle 5 seconds: The knight rises, grasping a sword embedded in the ground. The camera pulls upward from a low angle, emphasizing the determination in his eyes. Final 5 seconds: He begins running toward a distant ruined castle wall as the camera follows in a handheld-style tracking motion, tall grass brushing past the lens to create dynamic depth of field. Maintain scene continuity, natural body motion, and cinematic epic atmosphere throughout.

Final outcome
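
The prompt above splits the 15 seconds into three timed segments. A minimal sketch of assembling such a prompt from a list of shots follows. The Shot class, the build_timed_prompt function, and the "Shot N (start-end s)" label format are illustrative assumptions, not a format required by Wan 2.6. The only value taken from this page is the 15-second upper bound.

```python
from dataclasses import dataclass

MAX_SECONDS = 15  # upper bound on video length stated on this page


@dataclass
class Shot:
    seconds: int       # how long this shot should last
    description: str   # what happens and how the camera moves


def build_timed_prompt(story: str, shots: list[Shot], closing: str = "") -> str:
    """Join a story line and timed shot descriptions into a single prompt string."""
    total = sum(shot.seconds for shot in shots)
    if total > MAX_SECONDS:
        raise ValueError(f"shots total {total}s, above the {MAX_SECONDS}s limit")

    parts = [
        f"Generate an approximately {total}-second cohesive narrative video.",
        f"Story: {story}",
    ]
    start = 0
    for index, shot in enumerate(shots, start=1):
        end = start + shot.seconds
        parts.append(f"Shot {index} ({start}-{end}s): {shot.description}")
        start = end
    if closing:
        parts.append(closing)
    return " ".join(parts)


prompt = build_timed_prompt(
    "A medieval knight awakens on a storm-swept meadow after a fierce battle.",
    [
        Shot(5, "A slow circling shot reveals his mud-covered armor."),
        Shot(5, "The knight rises, grasping a sword embedded in the ground."),
        Shot(5, "He runs toward a distant ruined castle wall."),
    ],
    closing="Maintain scene continuity and natural body motion throughout.",
)
print(prompt)
```

Keeping the per-shot durations as data makes it easy to rebalance the timing without rewriting the whole prompt, and the length check fails early if the shots add up to more than the model supports.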

Articles about Wan 2.6

Unlocking Next-Gen Video Creation with Alibaba WAN 2.6 on WaveSpeedAI

As AI video generation continues to evolve at a rapid pace, Alibaba’s WAN 2.6 model stands out as one of the most advanced open-source solutions available today. Now launched on WaveSpeedAI, WAN 2.6 empowers creators with stronger storytelling abilities, smarter reference-driven generation, and longer, more expressive outputs.

Q & A

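Which input formats are supported?

Common video formats (e.g., MP4/MOV) are supported. For best results, use a clear, front-facing subject with stable lighting.

Does it preserve identity and background?

It prioritizes identity consistency and scene consistency while applying the requested facial and mouth movements.

Can I control emotion and speaking style?

Yes. You can guide the energy level (calm/neutral/energetic), tempo, and expression intensity through the prompt and/or reference audio.

Can it handle multiple people or faces in one frame?

It works best when the speaking subject is clearly and consistently visible. Note: crowded scenes or frequent occlusion can cause drift, so consider cropping or focusing on the target face.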