Seedance 2.0 Image-to-Video

Seedance 2.0 (Image-to-Video) generates Hollywood-grade cinematic videos from reference images and text prompts with native audio-visual synchronization, director-level camera and lighting control, and exceptional motion stability. Built on Seed's unified multimodal architecture, it preserves the input image's subject and composition while adding expressive, physically accurate motion.

Seedance 2.0 is Seed's latest video generation model, built on a unified multimodal architecture. The Image-to-Video mode generates production-grade cinematic videos from reference images and text prompts — preserving the input image's subject, composition, and style while adding expressive motion with native audio synchronization.

Key Features

  • Unified multimodal architecture: A single model that handles text, image, audio, and video inputs for comprehensive creative flexibility.

  • Image-faithful generation: Preserves the reference image's subject identity, composition, lighting, and style while animating it into motion.

  • Multi-image reference support: Guide generation with up to 4 reference images for consistent style, characters, or scenes.

  • Native audio-visual synchronization: Generates video with synchronized audio in a single pass.

  • Director-level control: Granular control over camera movement, lighting, shadows, and character performance through prompts.

  • Exceptional motion stability: Industry-leading motion coherence with stable subjects, consistent physics, and fluid transitions.

Parameters

Parameter     Required  Description
prompt        Yes       Detailed description of the cinematic scene
image         Yes       Start image URL to guide the video generation
last_image    No        Last frame image URL for video continuation
duration      No        Video length in seconds: 4-15 (default: 5)
aspect_ratio  No        Output format: 16:9, 9:16, 4:3, 3:4, 1:1, 21:9 (default: adaptive)
resolution    No        Output resolution: 480p, 720p (default), or 1080p
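
The parameters above can be checked client-side before a request is submitted. The following is a minimal sketch, not the service's official client: the function name `build_payload` is hypothetical, the field names and allowed values come from the parameter table, and it assumes that duration is a whole number of seconds and that omitting `aspect_ratio` leaves the adaptive default in effect.

```python
from typing import Optional

# Allowed values taken from the parameter table.
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "4:3", "3:4", "1:1", "21:9"}
ALLOWED_RESOLUTIONS = {"480p", "720p", "1080p"}


def build_payload(
    prompt: str,
    image: str,
    last_image: Optional[str] = None,
    duration: int = 5,
    aspect_ratio: Optional[str] = None,
    resolution: str = "720p",
) -> dict:
    """Validate inputs against the documented ranges and return a request dict.

    Hypothetical helper: it only assembles and checks the fields; how the
    payload is sent to the service is outside the scope of this sketch.
    """
    if not prompt.strip():
        raise ValueError("prompt is required")
    if not image.strip():
        raise ValueError("image is required")
    if not 4 <= duration <= 15:
        raise ValueError("duration must be between 4 and 15 seconds")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {sorted(ALLOWED_RESOLUTIONS)}")
    if aspect_ratio is not None and aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ALLOWED_ASPECT_RATIOS)}")

    payload = {
        "prompt": prompt,
        "image": image,
        "duration": duration,
        "resolution": resolution,
    }
    # Optional fields are left out when unset (assumption: the service then
    # applies its own defaults, e.g. the adaptive aspect ratio).
    if last_image is not None:
        payload["last_image"] = last_image
    if aspect_ratio is not None:
        payload["aspect_ratio"] = aspect_ratio
    return payload
```

For example, `build_payload("Slow dolly-in on the product", "https://example.com/start.png", duration=4, resolution="480p")` returns a dict with only the four always-present fields, while a duration of 3 raises `ValueError`.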

How to Use

  1. Upload a start image — provide an image to guide the video generation.
  2. Write your prompt — describe the scene with cinematic detail: action, camera movement, lighting, mood.
  3. Set duration — choose any duration from 4 to 15 seconds.
  4. Run — submit and download your cinematic video with synchronized audio.

Pricing

Resolution  Duration  Cost
480p        5 s       $0.60
480p        10 s      $1.20
480p        15 s      $1.80
720p        5 s       $1.20
720p        10 s      $2.40
720p        15 s      $3.60
1080p       5 s       $3.00
1080p       10 s      $6.00
1080p       15 s      $9.00

Prices scale linearly with duration (4-15 seconds).

Billing Rules

  • Base rate (480p): $0.60 per 5 seconds
  • 720p: 2x the 480p price
  • 1080p: 5x the 480p price (2.5x the 720p price)
  • Duration range: 4-15 seconds (continuous)
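
The billing rules reduce to a per-second rate times a resolution multiplier. The sketch below is illustrative only: `estimate_cost` is a hypothetical name, and it assumes that durations other than 5, 10, and 15 seconds are billed pro rata per second and that the result is rounded to the nearest cent (the page does not state the actual rounding rule).

```python
from decimal import Decimal, ROUND_HALF_UP

# Base rate from the billing rules: $0.60 per 5 seconds at 480p.
BASE_RATE_PER_SECOND = Decimal("0.60") / Decimal("5")

# Multipliers from the billing rules: 720p is 2x and 1080p is 5x the 480p price.
RESOLUTION_MULTIPLIER = {
    "480p": Decimal("1"),
    "720p": Decimal("2"),
    "1080p": Decimal("5"),
}


def estimate_cost(resolution: str, duration_seconds: int) -> Decimal:
    """Estimate the price of one run in US dollars (hypothetical helper)."""
    if resolution not in RESOLUTION_MULTIPLIER:
        raise ValueError(f"unknown resolution: {resolution}")
    if not 4 <= duration_seconds <= 15:
        raise ValueError("duration must be between 4 and 15 seconds")
    cost = BASE_RATE_PER_SECOND * RESOLUTION_MULTIPLIER[resolution] * duration_seconds
    # Assumption: round to the nearest cent.
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(estimate_cost("720p", 10))   # 2.40, matching the pricing table
print(estimate_cost("1080p", 15))  # 9.00, matching the pricing table
```

Under the same assumptions, a 4-second 480p draft would come to $0.48, which is why short drafts are a cheap way to iterate before a longer final render.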

Best Use Cases

  • Product Demos — Animate product shots into cinematic showcase videos.
  • Ad Creatives — Turn storyboard frames into polished commercial footage.
  • Character Animation — Bring character art or portraits to life with natural motion.
  • Scene Extension — Transform a single keyframe into a full cinematic sequence.
  • Style-Consistent Series — Use reference images to maintain visual consistency across multiple clips.

Pro Tips

  • Upload high-quality reference images for the best subject preservation.
  • Write prompts like a film director — include lighting, camera angles, and mood.
  • Use multiple reference images for better style and character consistency.
  • Start with a short duration (4-5s) to iterate, then extend up to 15s for the final cut.
  • Describe character expressions and actions for more engaging scenes.
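
One way to apply the prompt tips consistently is to assemble the prompt from separate fields so that none of them is forgotten. This is a minimal sketch: `compose_prompt` and its labeled-sentence layout are assumptions made for illustration, not a prompt syntax documented for the model.

```python
def compose_prompt(
    action: str,
    expression: str = "",
    camera: str = "",
    lighting: str = "",
    mood: str = "",
) -> str:
    """Join the scene description and optional director notes into one prompt.

    Hypothetical helper: the labels ("Camera:", "Lighting:", ...) are only a
    convention of this sketch.
    """
    fields = [
        ("", action),
        ("Expression", expression),
        ("Camera", camera),
        ("Lighting", lighting),
        ("Mood", mood),
    ]
    sentences = []
    for label, value in fields:
        text = value.strip().rstrip(".")
        if not text:
            continue  # skip fields the caller left empty
        sentences.append(f"{label}: {text}" if label else text)
    return ". ".join(sentences) + "." if sentences else ""


prompt = compose_prompt(
    "A barista pours latte art",
    camera="slow push-in",
    lighting="warm morning window light",
)
print(prompt)
# A barista pours latte art. Camera: slow push-in. Lighting: warm morning window light.
```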

Notes

  • Native audio generation is included — videos come with synchronized sound.
  • Up to 4 reference images can be uploaded.
  • Duration range: 4-15 seconds (continuous).
  • With the default adaptive setting, the aspect ratio follows the input image composition.