OpenAI Sora 2 Pro Text-to-Video

OpenAI Sora 2 Pro is a state-of-the-art text-to-video model with realistic physics, synchronized audio, and strong steerability. It supports multiple resolutions up to 1080p and durations up to 20 seconds, and is served through a ready-to-use REST inference API with no cold starts and per-second pricing.

Features

OpenAI Sora 2 Pro Text-to-Video

OpenAI Sora 2 Pro is a state-of-the-art text-to-video model that generates high-quality videos with realistic physics, synchronized audio, and strong steerability. Create cinematic videos from text prompts with multiple resolution and duration options.

Model Highlights

  • Physics-aware motion with realistic contact, inertia, and momentum
  • Temporal consistency with stable identities and clean frame transitions
  • Synchronized audio with lip-sync alignment and ambient sounds
  • High-frequency detail preserving fine textures
  • Complex scene reasoning with multiple subjects and depth handling
  • Cinematic camera movements without warping artifacts
  • Wide stylistic range from photoreal to anime and 3D
  • Strong steerability responding to prompt edits and control settings

Parameters

  • prompt (required): Text description of the scene, style, camera, and audio cues
  • size (optional): Output resolution
    • 720*1280 or 1280*720 (720p); default: 1280*720
    • 1024*1792 or 1792*1024 (1024p)
    • 1080*1920 or 1920*1080 (1080p)
  • duration (optional): Video length in seconds (4, 8, 12, 16, or 20 seconds, default: 4)
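
The parameter rules above can be checked on the client before a request is sent. The following is a minimal sketch, assuming sizes are written as width*height strings as in the request examples; the helper name build_payload is hypothetical and not part of any official SDK.

```python
# Allowed values are taken from the parameter list above.
# build_payload is a hypothetical helper, not an official SDK function.
ALLOWED_SIZES = {
    "720*1280", "1280*720",    # 720p
    "1024*1792", "1792*1024",  # 1024p
    "1080*1920", "1920*1080",  # 1080p
}
ALLOWED_DURATIONS = {4, 8, 12, 16, 20}


def build_payload(prompt, size="1280*720", duration=4):
    """Return a request body dict, raising ValueError on invalid input."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt is required and must be a non-empty string")
    if size not in ALLOWED_SIZES:
        raise ValueError(f"unsupported size: {size!r}")
    if not isinstance(duration, int) or duration not in ALLOWED_DURATIONS:
        raise ValueError(f"unsupported duration: {duration!r}")
    return {"prompt": prompt, "size": size, "duration": duration}
```

Calling build_payload with only a prompt fills in the documented defaults (1280*720, 4 seconds); unsupported values are rejected locally instead of costing an API round trip.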

Use Cases

  • Cinematic video production from text descriptions
  • Marketing and promotional video content
  • Social media video creation
  • Concept visualization and storyboarding
  • Creative storytelling with synchronized audio
  • Product demonstrations and explainer videos

Pricing

Pricing is per second based on output resolution:

| Size | Output Resolution | Price per Second |
|---|---|---|
| 720p | Portrait: 720x1280, Landscape: 1280x720 | $0.30 |
| 1024p | Portrait: 1024x1792, Landscape: 1792x1024 | $0.50 |
| 1080p | Portrait: 1080x1920, Landscape: 1920x1080 | $0.70 |

Examples

| Resolution | Duration | Total Cost |
|---|---|---|
| 720p | 4s | $1.20 |
| 720p | 8s | $2.40 |
| 720p | 20s | $6.00 |
| 1024p | 4s | $2.00 |
| 1024p | 20s | $10.00 |
| 1080p | 4s | $2.80 |
| 1080p | 20s | $14.00 |

Billing Rules

  • Pricing scales linearly with duration
  • Duration options: 4, 8, 12, 16, or 20 seconds
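
Because pricing is linear, the cost of a request is the per-second price for the chosen resolution multiplied by the duration. A minimal sketch of that arithmetic, using the prices from the table above; estimate_cost is a hypothetical helper, and the actual billed amount is whatever the service reports.

```python
from decimal import Decimal

# Per-second prices in USD, copied from the pricing table and keyed by size string.
PRICE_PER_SECOND = {
    "720*1280": Decimal("0.30"), "1280*720": Decimal("0.30"),
    "1024*1792": Decimal("0.50"), "1792*1024": Decimal("0.50"),
    "1080*1920": Decimal("0.70"), "1920*1080": Decimal("0.70"),
}
ALLOWED_DURATIONS = (4, 8, 12, 16, 20)


def estimate_cost(size, duration):
    """Estimated cost in USD: price per second times duration in seconds."""
    if size not in PRICE_PER_SECOND:
        raise ValueError(f"unsupported size: {size!r}")
    if duration not in ALLOWED_DURATIONS:
        raise ValueError(f"unsupported duration: {duration!r}")
    # Decimal keeps the result exact (no binary floating-point rounding).
    return PRICE_PER_SECOND[size] * duration
```

For instance, estimate_cost("1280*720", 8) multiplies $0.30 by 8 seconds, matching the $2.40 row in the examples.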

How to Use

  1. Write your prompt describing scene, style, camera, and audio cues
  2. Select output resolution (720p, 1024p, or 1080p)
  3. Choose duration (4, 8, 12, 16, or 20 seconds)
  4. Submit via REST API endpoint
  5. Preview and download your generated video

API Integration

The model is exposed through a simple REST API: submit a text prompt as a task, then query the task result to retrieve the generated video, which includes synchronized audio.

Notes

  • Prompt is the only required field
  • Default resolution is 1280*720 (720p landscape)
  • Default duration is 4 seconds
  • Higher resolutions increase generation cost
  • Cost scales linearly with duration
  • Follow content guidelines for appropriate use

Authentication

For authentication details, please refer to the Authentication Guide.

API Endpoints

Submit Task & Query Result


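# Submit the task ("prompt" is required; the prompt text here is only an example)
curl --location --request POST "https://api.wavespeed.ai/api/v3/openai/sora-2-pro/text-to-video" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
    "prompt": "A paper boat drifting down a rain-soaked street at dusk, soft ambient rain sounds",
    "size": "1280*720",
    "duration": 4
}'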

# Get the result
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
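
The same two calls can be made from Python with only the standard library. This is a sketch rather than an official client: the URLs and headers mirror the curl commands above, while the function names (build_submit_request, build_result_request, send) and the 60-second timeout are assumptions chosen for illustration.

```python
import json
import urllib.request

BASE_URL = "https://api.wavespeed.ai/api/v3"


def build_submit_request(api_key, payload):
    """Build (but do not send) the POST request that submits a task."""
    return urllib.request.Request(
        f"{BASE_URL}/openai/sora-2-pro/text-to-video",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def build_result_request(api_key, request_id):
    """Build (but do not send) the GET request that fetches a task result."""
    return urllib.request.Request(
        f"{BASE_URL}/predictions/{request_id}/result",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def send(request):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

A typical flow would read the key from the WAVESPEED_API_KEY environment variable, call send(build_submit_request(key, payload)), take data.id from the response, and pass it to build_result_request. Separating request construction from sending keeps the construction step easy to inspect without network access.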

Parameters

Task Submission Parameters

Request Parameters

| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
| prompt | string | Yes | - | - | The positive prompt for the generation. |
| size | string | No | 1280*720 | 720*1280, 1280*720, 1024*1792, 1792*1024, 1920*1080, 1080*1920 | The size of the generated media in pixels (width*height). |
| duration | integer | No | 4 | 4, 8, 12, 16, 20 | The duration of the generated video in seconds. |

Response Parameters

| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data.id | string | Unique identifier for the prediction (the task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
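
After submission, only a few of these fields are usually needed to follow up on the task. The sketch below pulls them out of an already decoded JSON body. The SubmittedTask type and parse_submission function are hypothetical names, and treating any code other than 200 as a failure is an assumption based on the example value in the table.

```python
from dataclasses import dataclass


@dataclass
class SubmittedTask:
    task_id: str     # data.id
    status: str      # data.status: created, processing, completed, or failed
    result_url: str  # data.urls.get


def parse_submission(body):
    """Extract the fields needed for polling from a decoded JSON response."""
    # Assumption: a successful submission reports code 200.
    if body.get("code") != 200:
        raise RuntimeError(f"submission failed: {body.get('message')}")
    data = body["data"]
    return SubmittedTask(
        task_id=data["id"],
        status=data["status"],
        result_url=data["urls"]["get"],
    )
```

The returned result_url can be queried directly, or task_id can be substituted into the result endpoint path.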

Result Request Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| id | string | Yes | - | Task ID |

Result Response Parameters

| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data | object | The prediction data object containing all details |
| data.id | string | Unique identifier for the prediction (the ID of the queried task) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
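
Since data.outputs stays empty until the status is completed, a client generally has to query the result repeatedly. The following is one possible polling loop, not a prescribed one: wait_for_outputs is a hypothetical helper, the 2-second interval and 600-second timeout are arbitrary assumptions, and fetch_result stands for any caller-supplied function that returns the decoded JSON body of the result endpoint.

```python
import time


def wait_for_outputs(fetch_result, interval=2.0, timeout=600.0, sleep=time.sleep):
    """Poll until the task completes, then return data.outputs.

    fetch_result is a zero-argument callable returning the decoded JSON
    body of the result endpoint. interval and timeout are in seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        data = fetch_result()["data"]
        status = data["status"]
        if status == "completed":
            return data["outputs"]
        if status == "failed":
            raise RuntimeError(data.get("error") or "task failed")
        # Status is still created or processing: give up once the deadline passes.
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task still {status!r} after {timeout} seconds")
        sleep(interval)
```

Passing the fetch function and the sleep function as arguments keeps the loop independent of any particular HTTP client and makes it straightforward to exercise with canned responses.
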
© 2025 WaveSpeedAI. All rights reserved.