X-AI Grok Imagine Video Reference-to-Video

X-AI Grok Imagine Video Reference-to-Video generates videos from multiple reference images while preserving identity, style, and scene composition. It is available through a ready-to-use REST inference API with no cold starts, billed at $0.05 per second of selected duration.

Features

Grok Imagine Video Reference-to-Video

Grok Imagine Video Reference-to-Video is X-AI’s multi-image reference model that generates videos from up to 7 reference images. Provide reference images and describe the desired motion — the model generates a video that preserves the identity, style, and composition from your references with smooth, natural movement.


Why Choose This?

  • Multi-image reference: Use up to 7 reference images to guide video generation with rich visual context.

  • Identity preservation: Characters, objects, and scenes maintain a consistent appearance across generated frames.

  • Flexible duration: Generate videos of 6 or 10 seconds to match your scene pacing.

  • Resolution options: Output in 720p or 480p based on your quality and speed requirements.


Parameters

| Parameter | Required | Description |
|---|---|---|
| images | Yes | Array of reference image URLs (1-7 images). |
| prompt | Yes | Text description of the desired motion, camera movement, and scene. |
| duration | No | Video length in seconds. Options: 6, 10. |
| resolution | No | Output resolution: 720p (default) or 480p. |

How to Use

  1. Upload your reference images — provide 1 to 7 reference images via URL or drag-and-drop upload.
  2. Write your prompt — describe the motion, camera movement, and scene details. Reference the uploaded images in your prompt using @image1, @image2, etc.
  3. Set duration — choose 6 or 10 seconds based on your scene length.
  4. Select resolution — 720p for higher quality, 480p for faster processing.
  5. Run — submit and download your video.
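
The `@image1`, `@image2` convention from step 2 can be checked before a task is submitted. The sketch below is a hypothetical helper, not part of the WaveSpeedAI API. It assumes that references are 1-indexed and follow the order of the `images` array, which the page implies but does not state outright, and it reports any reference that points past the images supplied.

```python
import re

# Hypothetical helper, not part of the WaveSpeedAI API.
# Assumption: @imageN is 1-indexed and follows the order of the images array.
IMAGE_REF = re.compile(r"@image(\d+)")

def invalid_image_refs(prompt: str, image_count: int) -> list[int]:
    """Return referenced image numbers that fall outside 1..image_count."""
    referenced = {int(n) for n in IMAGE_REF.findall(prompt)}
    return sorted(n for n in referenced if not 1 <= n <= image_count)

prompt = "The woman in @image1 walks toward the car in @image2 as the camera pans left."
print(invalid_image_refs(prompt, image_count=2))  # [] : every reference has an image
print(invalid_image_refs(prompt, image_count=1))  # [2] : @image2 has no matching image
```

An empty list means every reference in the prompt maps to a supplied image.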

Pricing

| Duration | Cost |
|---|---|
| 6s | $0.30 |
| 10s | $0.50 |

Billing Rules

  • Rate: $0.05 per second
  • Duration options: 6 or 10 seconds
  • Billing is based on the selected duration, not actual playback length
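
The billing rules reduce to rate times selected duration. As a minimal sketch, the hypothetical `estimate_cost` function below (the name and the `runs` parameter are illustrative, not part of any SDK) reproduces the pricing table and totals a small batch, using `Decimal` to avoid floating-point rounding in currency values.

```python
from decimal import Decimal

RATE_PER_SECOND = Decimal("0.05")  # USD, from the billing rules above
ALLOWED_DURATIONS = (6, 10)        # the only selectable durations

def estimate_cost(duration: int, runs: int = 1) -> Decimal:
    """Cost in USD for `runs` generations at the selected duration."""
    if duration not in ALLOWED_DURATIONS:
        raise ValueError(f"duration must be one of {ALLOWED_DURATIONS}")
    return RATE_PER_SECOND * duration * runs

print(estimate_cost(6))                         # 0.30
print(estimate_cost(10))                        # 0.50
print(estimate_cost(6, 3) + estimate_cost(10))  # 1.40 (three 6s tests, one 10s final)
```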

Best Use Cases

  • Character Consistency — Generate videos with consistent character appearance across multiple shots using reference images.
  • Product Showcases — Create dynamic product videos from multiple product photos.
  • Multi-angle References — Use different angles of the same subject to generate richer, more accurate video.
  • Social Media Content — Create engaging video clips from image collections for Reels, TikTok, and Shorts.
  • Creative Projects — Combine multiple visual references to create unique video compositions.

Pro Tips

  • Use high-quality, well-lit reference images for better identity preservation.
  • Reference uploaded images in your prompt using @image1, @image2, etc. for precise control.
  • Keep reference content and prompt aligned — if references show a character, describe that character’s actions.
  • Start with fewer references and add more if needed for richer context.
  • Use 6-second generations to test your prompt before committing to 10 seconds.

Notes

  • Both images and prompt are required fields.
  • Up to 7 reference images are supported.
  • Ensure image URLs are publicly accessible.
  • Maximum duration is 10 seconds.

Authentication

For authentication details, please refer to the Authentication Guide.

API Endpoints

Submit Task & Query Result


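# Submit the task (the image URL and prompt below are placeholder values)
curl --location --request POST "https://api.wavespeed.ai/api/v3/x-ai/grok-imagine-video/reference-to-video" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
    "images": ["https://example.com/reference-1.png"],
    "prompt": "The character in @image1 turns toward the camera and smiles.",
    "duration": 6,
    "resolution": "720p"
}'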

# Get the result (requestId is the data.id value returned by the submit call)
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
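
For readers calling the API from code rather than the shell, the submit call can be sketched in Python with only the standard library. The endpoint URL and headers are taken from the curl example; the function names `build_submit_request` and `submit` are illustrative, and the sketch assumes the submit response carries the task ID in `data.id`, as the response table on this page describes.

```python
import json
import urllib.request

SUBMIT_URL = "https://api.wavespeed.ai/api/v3/x-ai/grok-imagine-video/reference-to-video"

def build_submit_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request from the curl example."""
    return urllib.request.Request(
        SUBMIT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

def submit(api_key: str, payload: dict) -> str:
    """Send the task and return data.id for use in the result URL."""
    request = build_submit_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=30) as response:
        body = json.load(response)
    return body["data"]["id"]  # assumption: task ID is at data.id
```

Calling `submit(os.environ["WAVESPEED_API_KEY"], payload)` with a payload that includes the required `images` and `prompt` fields would return the ID to substitute for `${requestId}` in the result URL.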

Parameters

Task Submission Parameters

Request Parameters

| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
| images | array | Yes | [] | 1 ~ 7 items | Array of reference image URLs for video generation. Up to 7 images supported. |
| prompt | string | Yes | - | - | Text description of desired motion or changes in the video. |
| duration | integer | No | 6 | 6, 10 | Video duration in seconds. |
| resolution | string | No | 720p | 720p, 480p | Resolution of the output video. |
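
The constraints in this table can be enforced client-side before a request is sent, which avoids a round trip for an invalid payload. The following is a minimal sketch; `build_payload` is a hypothetical helper name, and the checks mirror only the ranges and defaults listed above. How the server itself reports invalid input is not documented on this page.

```python
ALLOWED_DURATIONS = (6, 10)
ALLOWED_RESOLUTIONS = ("720p", "480p")

def build_payload(images, prompt, duration=6, resolution="720p"):
    """Validate inputs against the request table and return the JSON body as a dict."""
    images = list(images)
    if not 1 <= len(images) <= 7:
        raise ValueError("images must contain 1 to 7 URLs")
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt is required")
    if duration not in ALLOWED_DURATIONS:
        raise ValueError(f"duration must be one of {ALLOWED_DURATIONS}")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {ALLOWED_RESOLUTIONS}")
    return {
        "images": images,
        "prompt": prompt,
        "duration": duration,
        "resolution": resolution,
    }

# Placeholder URL and prompt for illustration only.
payload = build_payload(
    ["https://example.com/reference-1.png"],
    "The character in @image1 turns toward the camera.",
)
print(payload["duration"], payload["resolution"])  # 6 720p
```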

Response Parameters

| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data.id | string | Unique identifier for the prediction (task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
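
To show how these fields fit together, the sketch below parses a response body into a small dataclass. The `Prediction` class and `parse_prediction` function are illustrative names, and the sample body is a hypothetical one assembled from the field table with placeholder values, not a captured API response. The sketch also assumes a `code` other than 200 signals an error.

```python
from dataclasses import dataclass, field

@dataclass
class Prediction:
    id: str
    status: str                      # created, processing, completed, or failed
    outputs: list = field(default_factory=list)
    error: str = ""
    result_url: str = ""             # taken from data.urls.get

def parse_prediction(body: dict) -> Prediction:
    """Extract the commonly used fields from a submit or result response."""
    if body.get("code") != 200:      # assumption: non-200 means the call failed
        raise RuntimeError(f"API error {body.get('code')}: {body.get('message')}")
    data = body["data"]
    return Prediction(
        id=data["id"],
        status=data["status"],
        outputs=list(data.get("outputs") or []),
        error=data.get("error") or "",
        result_url=(data.get("urls") or {}).get("get", ""),
    )

# Hypothetical body built from the table above; all values are placeholders.
sample = {
    "code": 200,
    "message": "success",
    "data": {
        "id": "task-123",
        "status": "created",
        "outputs": [],
        "urls": {"get": "https://api.wavespeed.ai/api/v3/predictions/task-123/result"},
        "error": "",
    },
}
print(parse_prediction(sample).status)  # created
```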

Result Request Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| id | string | Yes | - | Task ID |

Result Response Parameters

| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data | object | The prediction data object containing all details |
| data.id | string | Unique identifier for the prediction (the task ID passed in the request) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content. |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
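
Because `data.status` moves through created and processing before reaching completed or failed, a client typically polls the result endpoint until a terminal status appears. The sketch below illustrates one way to do that. The `wait_for_result` name, the 2-second interval, and the 150-attempt cap are assumptions chosen for illustration; this page does not state a recommended polling rate. The `fetch` argument is any callable that returns the parsed JSON body for a task ID, which keeps the loop testable without network access.

```python
import time

def wait_for_result(fetch, request_id, interval=2.0, max_attempts=150, sleep=time.sleep):
    """Poll until the task completes or fails; return data.outputs on success."""
    for _ in range(max_attempts):
        data = fetch(request_id)["data"]
        status = data["status"]
        if status == "completed":
            return data["outputs"]
        if status == "failed":
            raise RuntimeError(data.get("error") or "task failed")
        sleep(interval)  # status is still "created" or "processing"
    raise TimeoutError(f"task {request_id} not finished after {max_attempts} polls")

# Stubbed fetch for illustration only: one "processing" poll, then "completed".
_bodies = iter([
    {"code": 200, "data": {"status": "processing", "outputs": []}},
    {"code": 200, "data": {"status": "completed", "outputs": ["https://example.com/out.mp4"]}},
])
print(wait_for_result(lambda _id: next(_bodies), "task-123", sleep=lambda _s: None))
# ['https://example.com/out.mp4']
```

In real use, `fetch` would issue a GET to `https://api.wavespeed.ai/api/v3/predictions/{request_id}/result` with the `Authorization: Bearer` header and return the decoded JSON.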
© 2025 WaveSpeedAI. All rights reserved.