NVIDIA Cosmos 3 Super Image to Video

NVIDIA Cosmos 3 Super Image to Video is a fast AI image-to-video generation model that creates high-quality videos from a first-frame image and a motion prompt. It is available as a ready-to-use REST inference API for animating images and producing product videos, cinematic clips, social media content, advertising creatives, and concept videos, with simple integration, no cold starts, and affordable pricing.

Features

NVIDIA Cosmos 3 Super Image to Video generates short videos (1 to 7 seconds) from a reference image and a natural-language prompt. It supports motion prompting, negative prompting, size presets, duration selection, inference-step tuning, and guidance scaling.

Why Choose This?

Image-guided video generation: Start from a single reference image and animate it into a video clip.
Prompt-based motion control: Describe motion, camera movement, atmosphere, and scene behavior using natural language.
Negative prompt support: Use negative_prompt to steer the model away from unwanted content or artifacts.
Flexible size presets: Generate videos in common aspect ratios such as 16:9, 1:1, and 9:16.
Simple duration control: Choose a fixed output duration from 1 to 7 seconds.
Production-ready API: Suitable for concept visualization, animated keyframes, creator content, marketing clips, and short cinematic sequences.

Parameters

Parameter	Required	Description
prompt	Yes	Text prompt describing the motion and scene of the video to generate.
image	Yes	First-frame image for the generated video.
negative_prompt	No	Content to steer the generation away from.
size	No	Output video size preset. Supported values: `16:9`, `4:3`, `1:1`, `3:4`, `9:16`. Default: `16:9`.
duration	No	Output video duration in seconds. Supported values: `1`, `2`, `3`, `4`, `5`, `6`, `7`. Default: `7`.
num_inference_steps	No	Number of denoising steps.
guidance_scale	No	Classifier-free guidance scale.

How to Use

Upload the reference image — provide the image you want to animate.
Write the prompt — describe the desired motion, camera action, and scene evolution.
Add a negative prompt (optional) — specify things you want to avoid.
Choose the output size — select the aspect ratio that matches your intended output.
Set duration (optional) — choose a fixed output length between 1 and 7 seconds.
Tune generation settings (optional) — adjust num_inference_steps and guidance_scale if needed.
Submit — run the model and download the generated video.

Example Prompt

A cinematic slow push-in as the subject turns slightly toward the camera, soft wind movement in the hair, subtle background motion, realistic lighting, polished commercial look

Pricing

Pricing is based on the selected duration.

Duration	Cost
1 second	$0.05
2 seconds	$0.10
3 seconds	$0.15
4 seconds	$0.20
5 seconds	$0.25
6 seconds	$0.30
7 seconds	$0.35

Billing Rules

Pricing is $0.05 per second
Billing follows the selected duration
Minimum billed duration is 1 second
Maximum billed duration is 7 seconds
size, negative_prompt, num_inference_steps, and guidance_scale do not directly affect pricing
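
The billing rules above reduce to a single multiplication. As a minimal sketch, the hypothetical helper `estimate_cost` (an illustration, not part of the API) computes the price in US dollars from the selected duration, assuming the $0.05-per-second rate listed above:

```shell
# estimate_cost DURATION
# Hypothetical helper: prints the price in US dollars for one generation.
# The rate (5 cents per second) and the 1-7 second bounds come from the
# pricing table above.
estimate_cost() {
  local duration=$1
  case "$duration" in
    [1-7]) ;;  # only whole seconds from 1 to 7 are supported
    *) printf 'duration must be an integer from 1 to 7\n' >&2; return 1 ;;
  esac
  # Work in whole cents so no floating-point arithmetic is needed.
  local cents=$((duration * 5))
  printf '0.%02d\n' "$cents"
}

estimate_cost 3   # prints 0.15
estimate_cost 7   # prints 0.35
```

Because the other parameters do not affect pricing, the duration is the only input needed.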

Best Use Cases

Image-to-video animation — Turn a still image into a dynamic short clip.
Concept visualization — Explore motion ideas from a single frame.
Social media content — Create short animated content from static artwork or portraits.
Marketing creatives — Animate posters, product shots, or character stills.
Cinematic prototyping — Test prompt-driven motion, framing, and visual storytelling.

Pro Tips

Use a clear, high-quality reference image for better motion stability.
Keep prompts focused on motion, camera behavior, and scene change rather than restating static details already visible in the image.
Match the input image aspect ratio to the selected size whenever possible.
If the input image ratio does not match the selected size, the result may appear stretched or distorted.
Start with the default settings first, then tune guidance_scale or num_inference_steps only if needed.
Use negative_prompt to reduce unwanted artifacts or style drift.
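
One way to follow the aspect-ratio tip is to choose the preset closest to the input image's own ratio. The sketch below assumes you already know the image's pixel width and height. `closest_size_preset` is a hypothetical helper, not part of the API, and comparing ratios on a log scale is just one reasonable choice:

```shell
# closest_size_preset WIDTH HEIGHT
# Hypothetical helper: prints the supported size preset whose aspect ratio
# is nearest to WIDTH/HEIGHT. Assumes both arguments are positive numbers.
closest_size_preset() {
  local width=$1 height=$2
  awk -v w="$width" -v h="$height" 'BEGIN {
    if (w <= 0 || h <= 0) exit 1
    # Presets taken from the Parameters table.
    n = split("16:9 4:3 1:1 3:4 9:16", presets, " ")
    target = log(w / h)
    best = ""
    for (i = 1; i <= n; i++) {
      split(presets[i], parts, ":")
      # Distance between ratios on a log scale, so that 2:1 and 1:2
      # are treated as equally far from 1:1.
      diff = log(parts[1] / parts[2]) - target
      if (diff < 0) diff = -diff
      if (best == "" || diff < best_diff) { best = presets[i]; best_diff = diff }
    }
    print best
  }'
}

closest_size_preset 1920 1080   # prints 16:9
closest_size_preset 1080 1920   # prints 9:16
```

If the nearest preset is still far from the image's ratio, cropping the image before upload may be a better option than accepting distortion.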

Notes

prompt and image are required.
duration is selected directly from 1 to 7 seconds.
size defaults to 16:9.
Pricing depends only on the selected duration.
For best results, the input image ratio should match the selected size ratio to avoid distortion.

Related Models

NVIDIA Cosmos 3 Super Text-to-Image — Generate the source image first, then animate it.
Pruna AI P-Video Animate — Animate a reference image using motion from a source video.
Skywork AI SkyReels V4 Image-to-Video — Another image-to-video workflow with additional motion-generation options.

Authentication

For authentication details, please refer to the Authentication Guide.

API Endpoints

Submit Task & Query Result

set -euo pipefail

export WAVESPEED_API_KEY="your-api-key"

REQUEST_BODY=$(cat <<'JSON'
{
  "prompt": "A cinematic ocean wave at sunrise, highly detailed",
  "image": "https://interactive-examples.mdn.mozilla.net/media/cc0-images/painted-hand-298-332.jpg",
  "size": "16:9",
  "duration": "7",
  "num_inference_steps": 28,
  "guidance_scale": 6
}
JSON
)

# 1. Submit the prediction.
SUBMIT_RESPONSE=$(curl --silent --show-error --fail-with-body \
  -X POST "https://api.wavespeed.ai/api/v3/nvidia/cosmos-3-super/image-to-video" \
  -H "Authorization: Bearer ${WAVESPEED_API_KEY}" \
  -H "Content-Type: application/json" \
  -d "${REQUEST_BODY}")

TASK=$(printf '%s' "${SUBMIT_RESPONSE}" | jq 'if type == "object" and has("data") then .data else . end')
PREDICTION_ID=$(printf '%s' "${TASK}" | jq -r '.id // empty')
if [ -z "${PREDICTION_ID}" ]; then
  printf 'Submission response did not contain a prediction id\n' >&2
  exit 1
fi
RESULT_URL=$(printf '%s' "${TASK}" | jq -r '.urls.get // empty')
if [ -z "${RESULT_URL}" ]; then RESULT_URL="https://api.wavespeed.ai/api/v3/predictions/${PREDICTION_ID}/result"; fi

# 2. Poll until the prediction finishes.
while true; do
  RESPONSE=$(curl --silent --show-error --fail-with-body \
    "${RESULT_URL}" \
    -H "Authorization: Bearer ${WAVESPEED_API_KEY}")
  RESULT=$(printf '%s' "${RESPONSE}" | jq 'if type == "object" and has("data") then .data else . end')
  STATUS=$(printf '%s' "${RESULT}" | jq -r '.status // empty')

  case "${STATUS}" in
    completed) printf '%s\n' "${RESULT}" | jq '.outputs'; break ;;
    failed|cancelled|timeout) printf '%s\n' "${RESULT}" | jq . >&2; exit 1 ;;
    created|processing) sleep 2 ;;
    *) printf 'Unexpected status: %s\n' "${STATUS}" >&2; exit 1 ;;
  esac
done
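
The polling loop above keeps running for as long as the task reports `created` or `processing`. If you would rather give up after a fixed number of attempts, something like the following sketch could wrap the status check. `poll_status` and its return codes are illustrative choices, not part of the API, and the command you pass in is assumed to print the current `status` value on stdout:

```shell
# poll_status MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND repeatedly until it prints a terminal
# status. Returns 0 on completed, 1 on failed, 2 on an unexpected status
# or a failing COMMAND, and 3 when MAX_ATTEMPTS is exhausted.
poll_status() {
  local max_attempts=$1 delay=$2
  shift 2
  local attempt=1 status
  while [ "$attempt" -le "$max_attempts" ]; do
    status=$("$@") || return 2
    case "$status" in
      completed) return 0 ;;
      failed) return 1 ;;
      created|processing) sleep "$delay" ;;
      *) printf 'Unexpected status: %s\n' "$status" >&2; return 2 ;;
    esac
    attempt=$((attempt + 1))
  done
  return 3
}

# Stand-in status command for demonstration only; a real one would query
# the result URL and print the status field.
demo_status() { printf 'completed\n'; }

poll_status 3 0 demo_status && printf 'task finished\n'
```

With a 2-second delay, as in the script above, a limit of 150 attempts would bound the wait at roughly five minutes; the right limit depends on your workload.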

Parameters

Task Submission Parameters

Request Parameters

Parameter	Type	Required	Default	Range	Description
prompt	string	Yes		-	Text prompt describing the motion and scene of the video to generate.
image	string	Yes		-	First-frame image for the generated video.
negative_prompt	string	No		-	Content to steer the generation away from.
size	string	No	16:9	16:9, 4:3, 1:1, 3:4, 9:16	Output video size preset.
duration	string	No	7	1, 2, 3, 4, 5, 6, 7	Output video duration in seconds.
num_inference_steps	integer	No	28	1 ~ 50	Number of denoising steps.
guidance_scale	number	No	6	0 ~ 20	Classifier-free guidance scale.
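
Checking the optional parameters locally before submitting can catch mistakes without a round trip to the API. The following is a minimal sketch, assuming the ranges in the table are inclusive. `validate_request` is a hypothetical helper, and the API may apply checks of its own that differ from these:

```shell
# validate_request SIZE DURATION NUM_INFERENCE_STEPS GUIDANCE_SCALE
# Hypothetical helper: returns 0 when all four values fall inside the
# documented sets and ranges (assumed inclusive), non-zero otherwise.
validate_request() {
  local size=$1 duration=$2 steps=$3 guidance=$4

  case "$size" in
    16:9|4:3|1:1|3:4|9:16) ;;
    *) printf 'unsupported size: %s\n' "$size" >&2; return 1 ;;
  esac

  case "$duration" in
    [1-7]) ;;
    *) printf 'unsupported duration: %s\n' "$duration" >&2; return 1 ;;
  esac

  # num_inference_steps: integer from 1 to 50.
  case "$steps" in
    ''|*[!0-9]*) printf 'num_inference_steps must be an integer\n' >&2; return 1 ;;
  esac
  if [ "$steps" -lt 1 ] || [ "$steps" -gt 50 ]; then
    printf 'num_inference_steps out of range: %s\n' "$steps" >&2
    return 1
  fi

  # guidance_scale: number from 0 to 20; awk handles the decimal comparison.
  if ! awk -v g="$guidance" \
      'BEGIN { exit !((g ~ /^[0-9]+(\.[0-9]+)?$/) && (g >= 0) && (g <= 20)) }'; then
    printf 'guidance_scale out of range: %s\n' "$guidance" >&2
    return 1
  fi
}

validate_request 16:9 7 28 6 && printf 'defaults are valid\n'
```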

Response Parameters

Parameter	Type	Description
code	integer	HTTP status code (e.g., 200 for success)
message	string	Status message (e.g., “success”)
data.id	string	Unique identifier for the prediction (the task ID)
data.model	string	Model ID used for the prediction
data.outputs	array	Output values, usually URL strings; some models return text strings or structured result objects (empty when status is not `completed`)
data.urls	object	Object containing related API endpoints
data.urls.get	string	URL to retrieve the prediction result
data.status	string	Status of the task: `created`, `processing`, `completed`, or `failed`
data.created_at	string	ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”)
data.error	string	Error message (empty if no error occurred)
data.timings	object	Object containing timing details
data.timings.inference	integer	Inference time in milliseconds

Result Request Parameters

Parameter	Type	Required	Default	Description
id	string	Yes	-	Task ID

Result Response Parameters

Parameter	Type	Description
code	integer	HTTP status code (e.g., 200 for success)
message	string	Status message (e.g., “success”)
data	object	The prediction data object containing all details
data.id	string	Unique identifier for the prediction
data.model	string	Model ID used for the prediction
data.outputs	array<string \| object>	Array of generated outputs (empty when status is not completed). Items are usually URL strings, but may be text strings or structured result objects, depending on the model.
data.urls	object	Object containing related API endpoints
data.urls.get	string	URL to poll for the prediction result
data.status	string	Status: `created`, `processing`, `completed`, or `failed`
data.created_at	string	ISO timestamp of when the request was created
data.error	string	Error message (empty if no error occurred)
data.timings	object	Object containing timing details
data.timings.inference	integer	Inference time in milliseconds
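
To illustrate how the fields above fit together, the sketch below pulls the first output out of a result response with `jq`, the same tool the submission script uses. The sample JSON is invented for illustration (the id, URL, and timing values are placeholders, not real API output), and `first_output` is a hypothetical helper:

```shell
# Invented sample shaped like the Result Response Parameters table.
SAMPLE_RESPONSE='{
  "code": 200,
  "message": "success",
  "data": {
    "id": "example-task-id",
    "status": "completed",
    "outputs": ["https://example.com/output.mp4"],
    "error": "",
    "timings": {"inference": 12345}
  }
}'

# first_output
# Hypothetical helper: reads a result response on stdin and prints the
# first entry of data.outputs, but only when data.status is "completed".
# With --exit-status, jq exits non-zero when nothing (or null) is produced.
first_output() {
  jq --exit-status --raw-output \
    '.data | if .status == "completed" then .outputs[0] else empty end'
}

printf '%s' "$SAMPLE_RESPONSE" | first_output   # prints https://example.com/output.mp4
```

A non-zero exit from `first_output` means the task has not completed or returned no outputs, in which case `data.status` and `data.error` are the fields to inspect.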
