X-AI Grok Imagine Video

x-ai/grok-imagine-video/reference-to-video

X-AI Grok Imagine Video Reference-to-Video generates videos from multiple reference images while preserving identity, style, and scene composition. It is available as a ready-to-use REST inference API with no cold starts, billed at $0.05 per second of video.



README

Grok Imagine Video Reference-to-Video

Grok Imagine Video Reference-to-Video is X-AI's multi-image reference model that generates videos from up to 7 reference images. Provide reference images and describe the desired motion — the model generates a video that preserves the identity, style, and composition from your references with smooth, natural movement.

Why Choose This?

  • Multi-image reference: Use up to 7 reference images to guide video generation with rich visual context.

  • Identity preservation: Characters, objects, and scenes maintain a consistent appearance across generated frames.

  • Flexible duration: Generate videos of 6 or 10 seconds to match your scene pacing.

  • Resolution options: Output in 720p or 480p based on your quality and speed requirements.

Parameters

Parameter    Required   Description
images       Yes        Array of reference image URLs (1-7 images).
prompt       Yes        Text description of the desired motion, camera movement, and scene.
duration     No         Video length in seconds. Options: 6, 10.
resolution   No         Output resolution: 720p (default) or 480p.
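
The constraints in the table can be checked on the client before a request is sent. The following is a minimal sketch, assuming the request body is a JSON object whose keys match the parameter names above; `build_payload` is a hypothetical helper, not part of any official SDK.

```python
from typing import Optional

ALLOWED_DURATIONS = {6, 10}             # seconds, per the parameter table
ALLOWED_RESOLUTIONS = {"720p", "480p"}  # 720p is the documented default
MAX_IMAGES = 7


def build_payload(images: list[str], prompt: str,
                  duration: Optional[int] = None,
                  resolution: Optional[str] = None) -> dict:
    """Validate inputs against the parameter table and return a request body."""
    if not 1 <= len(images) <= MAX_IMAGES:
        raise ValueError(f"expected 1 to {MAX_IMAGES} image URLs, got {len(images)}")
    if not prompt.strip():
        raise ValueError("prompt is required")

    payload = {"images": list(images), "prompt": prompt}
    # Optional fields are omitted when unset so the service applies its own defaults.
    if duration is not None:
        if duration not in ALLOWED_DURATIONS:
            raise ValueError("duration must be 6 or 10")
        payload["duration"] = duration
    if resolution is not None:
        if resolution not in ALLOWED_RESOLUTIONS:
            raise ValueError("resolution must be '720p' or '480p'")
        payload["resolution"] = resolution
    return payload
```

Leaving `duration` and `resolution` out of the body when they are not set avoids guessing at defaults the page does not state (no default duration is documented).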

How to Use

  1. Upload your reference images — provide 1 to 7 reference images via URL or drag-and-drop upload.
  2. Write your prompt — describe the motion, camera movement, and scene details. Reference the uploaded images in your prompt using @image1, @image2, etc.
  3. Set duration — choose 6 or 10 seconds based on your scene length.
  4. Select resolution — 720p for higher quality, 480p for faster processing.
  5. Run — submit and download your video.
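
For API use, the same steps can be expressed as a single POST request. This is only a sketch: the endpoint URL, the Bearer-token header, and the image URLs below are placeholders and assumptions, since this page does not document them. The block builds the request without sending it.

```python
import json
import urllib.request

# Hypothetical placeholder URL; substitute the endpoint from your provider's API docs.
API_URL = "https://api.example.com/x-ai/grok-imagine-video/reference-to-video"


def make_request(api_key: str, body: dict, url: str = API_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for one generation."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )


body = {
    "images": [  # steps 1: publicly accessible reference image URLs (placeholders)
        "https://example.com/character.png",
        "https://example.com/street.png",
    ],
    "prompt": "@image1 walks through the street in @image2, slow dolly-in",  # step 2
    "duration": 6,         # step 3
    "resolution": "720p",  # step 4
}
req = make_request("YOUR_API_KEY", body)
# Step 5: urllib.request.urlopen(req) would submit the job. How the finished
# video is returned (directly or via a result URL) depends on the actual API.
```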

Pricing

Duration   Cost
6s         $0.30
10s        $0.50

Billing Rules

  • Rate: $0.05 per second
  • Duration options: 6 or 10 seconds
  • Billing is based on the selected duration, not actual playback length
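
The billing rules reduce to simple arithmetic: cost equals the selected duration times the per-second rate. A small sketch, working in whole cents to avoid floating-point rounding (the function names are illustrative):

```python
RATE_CENTS_PER_SECOND = 5  # $0.05 per second
DURATIONS = (6, 10)        # the only selectable durations


def cost_cents(duration_s: int) -> int:
    """Cost of one run in cents, based on the selected duration."""
    if duration_s not in DURATIONS:
        raise ValueError("duration must be 6 or 10 seconds")
    return RATE_CENTS_PER_SECOND * duration_s


def runs_for_budget(budget_cents: int, duration_s: int) -> int:
    """Number of complete runs a budget covers at the given duration."""
    return budget_cents // cost_cents(duration_s)


print(cost_cents(6), cost_cents(10))  # 30 50, i.e. $0.30 and $0.50
print(runs_for_budget(100, 6))        # 3 runs of 6 seconds for $1
```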

Best Use Cases

  • Character Consistency — Generate videos with consistent character appearance across multiple shots using reference images.
  • Product Showcases — Create dynamic product videos from multiple product photos.
  • Multi-angle References — Use different angles of the same subject to generate richer, more accurate video.
  • Social Media Content — Create engaging video clips from image collections for Reels, TikTok, and Shorts.
  • Creative Projects — Combine multiple visual references to create unique video compositions.

Pro Tips

  • Use high-quality, well-lit reference images for better identity preservation.
  • Reference uploaded images in your prompt using @image1, @image2, etc. for precise control.
  • Keep reference content and prompt aligned — if references show a character, describe that character's actions.
  • Start with fewer references and add more if needed for richer context.
  • Use 6-second generations to test your prompt before committing to 10 seconds.
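
A quick pre-flight check can catch prompts that mention an image that was never supplied. The sketch below assumes that `@image1` refers to the first entry in `images`, `@image2` to the second, and so on; the page implies this numbering but does not state it explicitly.

```python
import re


def referenced_images(prompt: str) -> set[int]:
    """Collect the N from every @imageN token in the prompt."""
    return {int(n) for n in re.findall(r"@image(\d+)", prompt)}


def missing_references(prompt: str, image_count: int) -> list[int]:
    """Return referenced indices that have no matching supplied image."""
    return sorted(i for i in referenced_images(prompt)
                  if not 1 <= i <= image_count)


prompt = "@image1 picks up the product from @image3 and smiles at the camera"
print(missing_references(prompt, image_count=2))  # [3]
```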

Notes

  • Both images and prompt are required fields.
  • Up to 7 reference images are supported.
  • Ensure image URLs are publicly accessible.
  • Maximum duration is 10 seconds.
