Wan 2.2 Fun Control | AI Avatar & Video-To-Video Generation With 720P Support | WaveSpeedAI

wavespeed-ai/wan-2.2/fun-control

Wan2.2-Fun-Control uses Control Codes and multi-modal inputs to generate preset-controlled videos up to 120 seconds long at up to 720p. It is released under Apache 2.0 for commercial use and is available through a ready-to-use REST API with no cold starts.

Your request will cost $0.20 per run.

For $10 you can run this model approximately 50 times.

README

Wan2.2-Fun-Control

Wan2.2-Fun-Control is an advanced video generation and control model developed by the Alibaba PAI team, designed for precise and creative video synthesis. By integrating Control Codes with deep learning and multi-modal conditioning, it enables users to direct motion, structure, and scene composition — achieving controllable, high-fidelity video generation under customizable guidance.

🌟 Key Features

  • 🎛️ Multi-Modal Control: Supports multiple input types for fine-grained video control:

    • Canny: Edge or line-art guidance
    • Depth: Depth map-based spatial control
    • OpenPose: Human pose and skeletal motion tracking
    • MLSD: Geometric line structure for scene layout
    • Trajectory Control: Object or camera movement path conditioning
  • 🎬 High-Quality Video Generation: Built on the Wan 2.2 architecture, it delivers cinematic, high-resolution video with stable motion and consistent identity.

  • 🌍 Multi-Language Prompting: Accepts both Chinese and English descriptions for flexible creative control.

  • 🧠 Intelligent Composition: Aligns user-provided references (images or frames) with pose, structure, and scene layout to ensure natural transitions and realism.

💰 Pricing

| Resolution | Cost per 5 Seconds | Max Duration |
|------------|--------------------|--------------|
| 480p       | $0.20              | 120 seconds  |
| 720p       | $0.40              | 120 seconds  |

Billing Rules

  • Standard (480p) Rate: $0.04 per second
  • HD (720p) Rate: $0.08 per second
  • Minimum Charge: All videos are billed for a minimum of 5 seconds.
    • Standard (480p): $0.20
    • HD (720p): $0.40
  • Billing Cap: To keep your costs predictable, billing is capped at a maximum of 600 seconds (10 minutes).
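
The billing rules above can be sketched as a small cost estimator. This is only an illustration of the arithmetic, not an official client: the function name `estimate_cost` and its parameters are hypothetical, and because the page does not say how fractional seconds are rounded, the sketch accepts whole seconds only.

```python
from decimal import Decimal

# Per-second rates and limits copied from the billing rules above.
RATE_PER_SECOND = {"480p": Decimal("0.04"), "720p": Decimal("0.08")}
MIN_BILLED_SECONDS = 5    # minimum charge
MAX_BILLED_SECONDS = 600  # billing cap


def estimate_cost(duration_seconds: int, resolution: str = "480p") -> Decimal:
    """Estimate the price in USD of one run (hypothetical helper)."""
    if resolution not in RATE_PER_SECOND:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    # Clamp the billed duration between the minimum charge and the cap.
    billed = min(max(duration_seconds, MIN_BILLED_SECONDS), MAX_BILLED_SECONDS)
    return RATE_PER_SECOND[resolution] * billed


if __name__ == "__main__":
    for seconds, res in [(3, "480p"), (5, "720p"), (120, "720p")]:
        print(f"{seconds:>3}s at {res}: ${estimate_cost(seconds, res):.2f}")
```

`Decimal` is used instead of `float` so that amounts such as $0.20 stay exact when multiplied.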

⚙️ Usage Tips

  • 🧍 Keep reference consistency: The reference image’s composition, pose, and camera angle should match the desired video framing. Major mismatches between input and control maps (e.g., OpenPose or Canny) can lead to generation instability or artifacts.

  • 🖼️ Match aspect ratios: The aspect ratio of the input image and target video should remain identical for best results.

  • 🔄 Control balance: Combining too many control types simultaneously may reduce creative flexibility — start with one or two controls and tune gradually.
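
The aspect-ratio tip above can be checked before uploading. The following is a minimal sketch, assuming you already know the pixel dimensions of the reference image and the target video. The helper names and the 1% tolerance are illustrative assumptions, not part of the model or its API.

```python
from fractions import Fraction


def aspect_ratio(width: int, height: int) -> Fraction:
    """Return width/height as an exact, reduced fraction."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return Fraction(width, height)


def same_aspect_ratio(image_size, video_size, tolerance=Fraction(1, 100)) -> bool:
    """Check whether two (width, height) pairs share an aspect ratio.

    The relative tolerance (1% by default, an assumed value) absorbs
    rounding in sizes such as 854x480, which is only roughly 16:9.
    """
    image_ratio = aspect_ratio(*image_size)
    video_ratio = aspect_ratio(*video_size)
    return abs(image_ratio - video_ratio) <= tolerance * video_ratio


if __name__ == "__main__":
    print(same_aspect_ratio((1920, 1080), (1280, 720)))  # both are 16:9
    print(same_aspect_ratio((1024, 1024), (1280, 720)))  # square vs. 16:9
```

If the check fails, cropping or padding the reference image to the target ratio before submission is one way to follow the tip.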