OpenAI Sora 2 Image-to-Video Pro creates physics-aware, realistic videos with synchronized audio and greater steerability. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.
$1.20 per run
Action: She opened her hands. Ambient Sound: The soft crackling of the dying fire in the oven; a high-pitched, happy little ding sound from the timer; the warm, persistent sizzle of butter melting on a nearby stovetop. Character Dialogue: (Voice is high-pitched, bubbly, and enthusiastic) "Welcome to my bakery!"
Action: The tortoise slowly raises its head, and its crystal shell catches the sunlight, momentarily casting a rainbow of light across the forest. It then closes its eyes as a tiny puff of magical mist rises from its back. Ambient Sound: The soft, constant drip-drip-drip of water filtering down through the cavern rocks; the low, deep rumble that comes from the tortoise's chest (a protective resonance); gentle wind chimes sound whenever the mist appears. Character Dialogue: (Voice is slow, ancient, and deep like moving earth) "Be still, little one. The forest remembers. All things are safe beneath the roots of the world."
Action: The character slowly unrolls the scroll, sighs softly, and uses a single finger to gently trace the fading characters on the parchment. He then looks up with a serene expression. Ambient Sound: The soft, rustling sound of silk as the scroll moves; the gentle, intermittent plink of cherry blossoms falling onto the stone ground; the very distant, calming trickle of a stream somewhere down the mountain. Character Dialogue: (Voice is calm, deep, and slightly resonant with age) "Patience is the truest form of power. All knowledge, like these blooms, returns to the earth in time. Observe and learn."
Action: He stops, lowers his gaze to the ground, and lets out a slow breath of cold air that briefly obscures his face before gripping the sword hilt tightly. Ambient Sound: The low, mournful howl of the wind sweeping through the pines; the crisp, soft crunch of boots on frozen gravel; the sharp, clear shing sound as the steel blade is drawn.
A nostalgic, rhythmic mood, with a slow, continuous circular orbit shot around the blurred record, emphasizing its steady rotation.
An aggressive, rapid motion, forward, with the tires spinning instantly into a high-speed blur, and the camera pulling back quickly (fast dolly out) as if accelerating away.
Action: The cube slowly spins faster, and the glowing runes pulse brightly for a moment, illuminating a dusty floor before returning to its steady, slow rotation. Ambient Sound: A deep, sustained electronic hum (the core power source); a very subtle, rhythmic tick-tock sound like an old clock deep within the mechanism; the faint echo of dripping water somewhere off-screen. Character Dialogue: (Voice is calm, synthesized, and androgynous) "Initiating sequence... Primary function: observation. Access denied to unauthorized entities. Remain dormant."
Action: The drone’s fins adjust slightly to maintain position, and its single robotic "eye" (lens) zooms in on a piece of strange, unknown wreckage in the gloom. A small puff of exhaust bubbles rises to the top of the frame. Ambient Sound: A constant, low-frequency sonar ping sound (slow and steady); muffled, bubbling noises from the drone's movement; the heavy, crushing silence of the deep ocean that dominates the background.
A delicate, ephemeral motion, with the dew droplets slowly beginning to slide down the petal, and a micro-level, gentle push-in (dolly in).
A wild, free mood, with a gentle, continuous horizontal pan (pan left or right) across the blurred grass, simulating the wind's uninterrupted flow.
A moody, quiet atmosphere, with a slow, subtle forward tracking shot (dolly in) towards the largest reflection, capturing the steaming manholes.
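Several of the sample prompts above follow the same three-part pattern: Action, Ambient Sound, Character Dialogue. The sketch below shows one way to assemble a prompt in that shape. The helper name `compose_prompt` and its arguments are hypothetical and not part of any WaveSpeedAI or OpenAI API; the labels are simply copied from the samples.

```python
def compose_prompt(action, ambient_sound=None, dialogue=None, voice=None):
    """Join the labelled sections used by the sample prompts into one string.

    Hypothetical helper: only `action` is required, and the optional
    sections are skipped when they are not provided.
    """
    parts = [f"Action: {action.strip()}"]
    if ambient_sound:
        parts.append(f"Ambient Sound: {ambient_sound.strip()}")
    if dialogue:
        # The samples put a parenthesised voice description before the line.
        voice_note = f"({voice.strip()}) " if voice else ""
        parts.append(f'Character Dialogue: {voice_note}"{dialogue.strip()}"')
    return " ".join(parts)


if __name__ == "__main__":
    print(compose_prompt(
        "She opened her hands.",
        ambient_sound="The soft crackling of the dying fire in the oven.",
        dialogue="Welcome to my bakery!",
        voice="Voice is high-pitched, bubbly, and enthusiastic",
    ))
```

The short camera-and-mood prompts in the list do not use these labels, so free-form text remains an equally valid input.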
Notice — Service Stability
The Sora 2 family is currently unstable. Generations may fall back to alternative models without notice and the service can be temporarily unavailable. OpenAI is also expected to discontinue this model in the future.
If you need an equally capable, stable alternative, we recommend Seedance 2: bytedance/seedance-2.0/image-to-video.
Sora 2 Image-to-Video Pro is OpenAI's premium image animation model. Upload an image and describe the motion — AI transforms your still photo into a cinematic video with physics-aware movement, synchronized audio, and professional-grade quality.
Premium quality: Higher fidelity output with enhanced detail preservation and motion coherence.
Physics-aware motion: Learns contact, inertia, and momentum so objects move and collide believably.
Synchronized audio: Generates matching audio, including ambient sounds, dialogue, and sound effects.
Temporal consistency: Stable identities, minimal flicker/ghosting, and clean frame-to-frame transitions.
Resolution options: Output in 720p or 1080p for high-definition results.
Extended duration: Generate videos up to 20 seconds long.
| Parameter | Required | Description |
|---|---|---|
| image | Yes | Source image to animate |
| prompt | Yes | Describe the motion, action, and audio cues |
| resolution | No | Output resolution: 720p or 1080p |
| duration | No | Video length: 4, 8, 12, 16, or 20 seconds |
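As a rough illustration of the table above, the following sketch builds a request body and rejects values the table does not list. The function name `build_payload` is hypothetical. Whether the service itself rejects other values, and which defaults it applies when the optional fields are omitted, is not stated here, so the optional fields are simply left out when not given.

```python
ALLOWED_RESOLUTIONS = {"720p", "1080p"}
ALLOWED_DURATIONS = {4, 8, 12, 16, 20}  # seconds


def build_payload(image, prompt, resolution=None, duration=None):
    """Return a JSON-serialisable dict for the image-to-video request.

    Hypothetical client-side check based only on the parameter table:
    `image` and `prompt` are required, `resolution` and `duration` are
    optional and must be one of the listed values when present.
    """
    if not image:
        raise ValueError("image is required")
    if not prompt:
        raise ValueError("prompt is required")

    payload = {"image": image, "prompt": prompt}
    if resolution is not None:
        if resolution not in ALLOWED_RESOLUTIONS:
            raise ValueError(f"resolution must be one of {sorted(ALLOWED_RESOLUTIONS)}")
        payload["resolution"] = resolution
    if duration is not None:
        if duration not in ALLOWED_DURATIONS:
            raise ValueError(f"duration must be one of {sorted(ALLOWED_DURATIONS)}")
        payload["duration"] = duration
    return payload


if __name__ == "__main__":
    print(build_payload(
        "https://example.com/your-input.jpg",
        "A cinematic shot of a city at sunset, soft golden light",
        resolution="720p",
        duration=4,
    ))
```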
| Duration | 720p | 1080p |
|---|---|---|
| 4 s | $1.20 | $2.00 |
| 8 s | $2.40 | $4.00 |
| 12 s | $3.60 | $6.00 |
| 16 s | $4.80 | $8.00 |
| 20 s | $6.00 | $10.00 |
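The table works out to a flat $0.30 per second at 720p and $0.50 per second at 1080p. A small lookup like the one below can estimate a run's cost before submitting. This is a sketch that only mirrors the listed prices; `price_usd` is a hypothetical helper, and the charge recorded on the prediction remains authoritative.

```python
from decimal import Decimal

# Prices in USD, copied from the pricing table above.
PRICES = {
    "720p": {4: Decimal("1.20"), 8: Decimal("2.40"), 12: Decimal("3.60"),
             16: Decimal("4.80"), 20: Decimal("6.00")},
    "1080p": {4: Decimal("2.00"), 8: Decimal("4.00"), 12: Decimal("6.00"),
              16: Decimal("8.00"), 20: Decimal("10.00")},
}


def price_usd(duration_s, resolution):
    """Look up the listed price for one run; raise if the combination is not listed."""
    try:
        return PRICES[resolution][duration_s]
    except KeyError:
        raise ValueError(f"no listed price for {duration_s} s at {resolution}") from None


if __name__ == "__main__":
    print(price_usd(8, "1080p"))  # 4.00
```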
Grab a WaveSpeedAI API key, then call POST https://api.wavespeed.ai/api/v3/openai/sora-2/image-to-video-pro with your input as JSON. The endpoint returns a prediction id; poll the prediction endpoint until status flips to completed, then read the output URL from data.outputs[0]. Examples for Sora 2 Image To Video Pro below.
# Submit the prediction
curl -X POST "https://api.wavespeed.ai/api/v3/openai/sora-2/image-to-video-pro" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $WAVESPEED_API_KEY" \
-d '{
"prompt": "A cinematic shot of a city at sunset, soft golden light",
"image": "https://example.com/your-input.jpg",
"resolution": "720p",
"duration": 4
}'
# Response includes a prediction id. Poll for the result:
curl -X GET "https://api.wavespeed.ai/api/v3/predictions/{request_id}/result" \
-H "Authorization: Bearer $WAVESPEED_API_KEY"
# When status is "completed", read the output from data.outputs[0].

// npm install wavespeed
const WaveSpeed = require('wavespeed');
const client = new WaveSpeed(); // reads WAVESPEED_API_KEY from env
// Top-level await is not available in a CommonJS module, so wrap the call.
(async () => {
  const result = await client.run("openai/sora-2/image-to-video-pro", {
    "prompt": "A cinematic shot of a city at sunset, soft golden light",
    "image": "https://example.com/your-input.jpg",
    "resolution": "720p",
    "duration": 4
  });
  console.log(result.outputs[0]); // → URL of the generated output
})();

# pip install wavespeed
import wavespeed
output = wavespeed.run(
"openai/sora-2/image-to-video-pro",
{
"prompt": "A cinematic shot of a city at sunset, soft golden light",
"image": "https://example.com/your-input.jpg",
"resolution": "720p",
"duration": 4
}
)
print(output["outputs"][0]) # → URL of the generated output

Sora 2 Image To Video Pro is an OpenAI model for video generation from images, exposed as a REST API on WaveSpeedAI. You can call it programmatically or try it from the playground above.
POST your input parameters to the model's REST endpoint (shown in the API tab of this playground) with your WaveSpeedAI API key in the Authorization header. Submission returns a prediction ID; poll the prediction endpoint until status flips to "completed", then read the output URL from the result. The playground generates a ready-to-paste code sample in Python, JavaScript, or cURL for whatever inputs you've set. Full request/response shape is documented at https://wavespeed.ai/docs/docs-api/openai/openai-sora-2-image-to-video-pro.
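The submit-then-poll flow described above could be sketched in Python with only the standard library, as below. The result URL and the `data.status` / `data.outputs[0]` fields come from the cURL example on this page. The `"failed"` status value, the polling interval, and the timeout are assumptions, and `fetch_result` and `wait_for_output` are hypothetical helper names rather than part of the `wavespeed` package.

```python
import json
import time
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"


def fetch_result(request_id, api_key):
    """GET the prediction result and return the decoded JSON body."""
    req = urllib.request.Request(
        f"{API_BASE}/predictions/{request_id}/result",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def wait_for_output(fetch, interval=5.0, timeout=900.0, sleep=time.sleep):
    """Call `fetch()` until the prediction completes, then return the output URL.

    `fetch` is any zero-argument callable returning the decoded response,
    which keeps the loop testable without network access.
    """
    deadline = time.monotonic() + timeout
    while True:
        data = fetch().get("data", {})
        status = data.get("status")
        if status == "completed":
            return data["outputs"][0]
        if status == "failed":  # assumption: the failure status name is not given on this page
            raise RuntimeError(f"prediction failed: {data}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"prediction still '{status}' after {timeout} s")
        sleep(interval)
```

With a real prediction id, usage would look like `wait_for_output(lambda: fetch_result(request_id, api_key))`. The default timeout is an arbitrary choice; pick one with enough headroom for your own runs.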
Sora 2 Image To Video Pro starts at $1.20 per run. That figure is the base price (4 seconds at 720p); the final charge scales with the resolution and duration you set, up to $10.00 for 20 seconds at 1080p. The exact cost for your current input is shown live next to the Generate button before you submit, and the actual per-call charge is recorded on the prediction afterwards.
Key inputs: `prompt`, `image`, `resolution`, `duration`. The full JSON schema (types, defaults, allowed values) is rendered above the Generate button and mirrored in the API reference at https://wavespeed.ai/docs/docs-api/openai/openai-sora-2-image-to-video-pro.
Average end-to-end generation time on WaveSpeedAI is around 271 seconds per request — measured across recent runs. Queue time scales with global demand; live status is visible in the prediction record.
Commercial usage rights depend on the model's license, set by its provider (OpenAI). The license summary appears on the model card above; see WaveSpeedAI's Terms of Service for platform-level conditions.