openai/sora-2/image-to-video

OpenAI's Sora 2 is a new state-of-the-art video and audio generation model. Building on the foundation of Sora, it introduces capabilities that have been difficult for prior video models to achieve, such as more accurate physics, sharper realism, synchronized audio, enhanced steerability, and an expanded stylistic range.

Your request will cost $0.40 per run.

For $10 you can run this model approximately 25 times.

README

OpenAI Sora 2 — Image-to-Video

Turn a single reference image into a coherent video clip with synchronized audio. Built on Sora 2’s core advances, the image-to-video pipeline preserves identity, lighting, and composition while synthesizing believable motion and camera dynamics.

Why it looks great

  • Identity lock-in: preserves faces, style, textures, and scene layout from the reference image.
  • Parallax & depth hallucination: infers 3D structure for convincing foreground/background separation.
  • Physics-aware motion: contact, inertia, and secondary motion (hair, cloth) behave naturally.
  • Temporal consistency: minimal flicker/ghosting with stable subject features across frames.
  • Smart background extension: clean inpainting beyond the original frame for wider moves.
  • Cinematic camera moves: subtle pans, push-ins, arcs, and handheld vibes without warping.
  • Synchronized audio: optional voice/ambience that matches on-screen action and pacing.
  • Strong steerability: prompt edits and controls (duration, fps, motion strength) produce predictable changes.

How to Use

  1. Upload a single reference image (PNG/JPEG).
  2. Add a short prompt for mood, motion style, or camera behavior.
  3. Choose a duration: 4s, 8s, or 12s.
  4. Submit the job; preview and download the result.
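The steps above can be sketched as a small input check that runs before a job is submitted. This is only an illustration: the function name `build_job` and the dictionary keys are hypothetical and are not taken from any documented API. Only the accepted image types (PNG/JPEG) and durations (4s, 8s, 12s) come from this page.

```python
from pathlib import Path

# Values taken from this page: PNG/JPEG input, 4s/8s/12s clips.
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg"}
ALLOWED_DURATIONS = (4, 8, 12)


def build_job(image_path: str, prompt: str, duration: int = 4) -> dict:
    """Validate the inputs and return a job description.

    The returned dictionary shape is hypothetical; adapt the keys to
    whatever the submission interface you use actually expects.
    """
    ext = Path(image_path).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"expected a PNG or JPEG image, got {ext or 'no extension'!r}")
    if duration not in ALLOWED_DURATIONS:
        raise ValueError(f"duration must be one of {ALLOWED_DURATIONS}, got {duration}")
    return {
        "model": "openai/sora-2/image-to-video",
        "image": image_path,
        "prompt": prompt.strip(),
        "duration": duration,
    }


if __name__ == "__main__":
    print(build_job("portrait.png", "slow push-in, soft evening light", 8))
```

Checking the file type and duration locally avoids paying for, or waiting on, a job that would be rejected anyway.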

Pricing

Duration    Total ($)
4s          0.40
8s          0.80
12s         1.20

Billing Rules: Linear pricing at $0.10/s. Available durations are 4s, 8s, and 12s.
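As a quick check of the billing rule, the sketch below computes the price of a clip and the number of runs a budget covers. The function names are illustrative; the rate and the allowed durations are the ones stated above. Amounts are kept in whole cents to avoid floating-point rounding.

```python
RATE_CENTS_PER_SECOND = 10       # $0.10/s, from the billing rule
ALLOWED_DURATIONS = (4, 8, 12)   # seconds


def clip_cost_cents(duration: int) -> int:
    """Return the price of one clip in cents (linear in duration)."""
    if duration not in ALLOWED_DURATIONS:
        raise ValueError(f"duration must be one of {ALLOWED_DURATIONS}, got {duration}")
    return RATE_CENTS_PER_SECOND * duration


def runs_for_budget(budget_dollars: int, duration: int) -> int:
    """Return how many whole runs a budget in dollars pays for."""
    return (budget_dollars * 100) // clip_cost_cents(duration)


if __name__ == "__main__":
    for seconds in ALLOWED_DURATIONS:
        print(f"{seconds}s: ${clip_cost_cents(seconds) / 100:.2f}")
    print(runs_for_budget(10, 4))   # 25 runs of 4s clips for $10
    print(runs_for_budget(10, 12))  # 8 runs of 12s clips for $10
```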

Notes

  • Best results come from high-resolution, clean source images with clear subjects and lighting.
  • For big perspective shifts, start with shorter durations or lower motion strength, then iterate.
  • Ensure you own the rights to your image; outputs inherit input content constraints.
  • Follow OpenAI's usage rules. For details, see the reference: What images are permitted and prohibited in Sora-2