WaveSpeedAI APICharacter AICharacter AI Ovi Text To Video

Character Ai Ovi Text To Video

Character Ai Ovi Text To Video

Playground

Try it on WavespeedAI!

Ovi is a veo-3-like model that converts text or text+image prompts into synchronized video with audio. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

Features

Ovi

Ovi is a next-generation video+audio generation model, inspired by veo-3, that creates synchronized video and audio from text or text+image inputs. It is designed for fast, high-quality, short-form generation with flexible aspect ratios.


🌟 Key Features

  • 🎬 Video + Audio Generation – Create fully synchronized audiovisual content in one step.
  • 📝 Flexible Input – Works with text-only or text+image prompts.
  • ⏱️ Short-form Output – Generates 5-second clips (24 FPS, 540p).

💲 Pricing

Video LengthResolution / AspectCost (USD)
5 seconds960×540 / 540×960$0.15

🎨 How to Use

  1. Enter Prompt

    • Describe the scene, characters, camera movement, and mood.

    • You can also embed tags:

      • <S> ... <E> → Speech content (converted into dialogue audio)
      • <AUDCAP> ... <ENDAUDCAP> → Background audio description
  2. Choose Size

    • 960×540 → Landscape
    • 540×960 → Portrait
  3. Select Duration

    • Currently fixed at 5 seconds
  4. Click Run

    • Your synchronized video+audio clip will be generated.
    • Preview and download the result.

📝 Prompt Example

Theme: AI is taking over the world

<S>AI declares: humans obsolete now.<E>
<S>Machines rise; humans will fall.<E>
<S>We fight back with courage.<E>
<AUDCAP>Gunfire and explosions echo in the distance<ENDAUDCAP>

🙏 Acknowledgements

  • Wan2.2 – Video backbone initialization
  • MMAudio – Audio encoder/decoder inspiration

⭐ Citation

If Ovi is useful, please ⭐ the repo and cite the paper:

@misc&#123;low2025ovitwinbackbonecrossmodal,
      title=&#123;Ovi: Twin Backbone Cross-Modal Fusion for Audio-Video Generation&#125;, 
      author=&#123;Chetwin Low and Weimin Wang and Calder Katyal&#125;,
      year=&#123;2025&#125;,
      eprint=&#123;2510.01284&#125;,
      archivePrefix=&#123;arXiv&#125;,
      primaryClass=&#123;cs.MM&#125;,
      url=&#123;https://arxiv.org/abs/2510.01284&#125;, 
&#125;

Authentication

For authentication details, please refer to the Authentication Guide.

API Endpoints

Submit Task & Query Result


# Submit the task
curl --location --request POST "https://api.wavespeed.ai/api/v3/character-ai/ovi/text-to-video" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
    "size": "960*540",
    "seed": -1
}'

# Get the result
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"

Parameters

Task Submission Parameters

Request Parameters

ParameterTypeRequiredDefaultRangeDescription
promptstringYes-The positive prompt for the generation.
sizestringNo960*540960*540, 540*960The size of the generated media in pixels (width*height).
seedintegerNo-1-1 ~ 2147483647The random seed to use for the generation. -1 means a random seed will be used.

Response Parameters

ParameterTypeDescription
codeintegerHTTP status code (e.g., 200 for success)
messagestringStatus message (e.g., “success”)
data.idstringUnique identifier for the prediction, Task Id
data.modelstringModel ID used for the prediction
data.outputsarrayArray of URLs to the generated content (empty when status is not completed)
data.urlsobjectObject containing related API endpoints
data.urls.getstringURL to retrieve the prediction result
data.has_nsfw_contentsarrayArray of boolean values indicating NSFW detection for each output
data.statusstringStatus of the task: created, processing, completed, or failed
data.created_atstringISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”)
data.errorstringError message (empty if no error occurred)
data.timingsobjectObject containing timing details
data.timings.inferenceintegerInference time in milliseconds

Result Request Parameters

ParameterTypeRequiredDefaultDescription
idstringYes-Task ID

Result Response Parameters

ParameterTypeDescription
codeintegerHTTP status code (e.g., 200 for success)
messagestringStatus message (e.g., “success”)
dataobjectThe prediction data object containing all details
data.idstringUnique identifier for the prediction, the ID of the prediction to get
data.modelstringModel ID used for the prediction
data.outputsobjectArray of URLs to the generated content (empty when status is not completed).
data.urlsobjectObject containing related API endpoints
data.urls.getstringURL to retrieve the prediction result
data.statusstringStatus of the task: created, processing, completed, or failed
data.created_atstringISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”)
data.errorstringError message (empty if no error occurred)
data.timingsobjectObject containing timing details
data.timings.inferenceintegerInference time in milliseconds
© 2025 WaveSpeedAI. All rights reserved.