Introducing Ovi: The Super-Fast, Open-Source Model Redefining AI Video Generation

AI video models that generate sound have been arriving one after another. Feeling overwhelmed by the surge of new models claiming to generate synchronized video and audio?

If you’re looking for something fast, smooth, and reliable, try Ovi, the new open-source video+audio generation model from Character.AI. Today, we’re excited to announce that Ovi, a super-fast, open-source alternative to Veo 3, is now available on WaveSpeedAI!

Key Features

Synchronized Audio & Video

Create videos with matching AI-generated sound in one step. From a single prompt, Ovi produces a video with lifelike audio!

Flexible Creative Inputs

Ovi supports both Text-to-Video and Image-to-Video creation — just select your preferred input!

Cinematic Quality Clips

Create 5-second video clips at a smooth, cinematic 24 frames per second (FPS).

Made for Any Platform

Select the ideal shape for your video. We support various aspect ratios, including vertical (9:16) for social media stories and widescreen (16:9) for YouTube.

See It in Practice

Superior Cinematic Texture

Prompt:

A close-up of a battle-worn soldier in bright daylight, dirt and scratches on his face, wearing tactical gear. He grips his weapon tightly, breathing heavily, sweat glistening on his skin.

Camera: static medium close-up, keeping the soldier's face and weapon clear in frame with high exposure, no dark shadows. Mood: tense, urgent, cinematic realism.

<S>We don't have much time! Stay sharp!</S>

<AUDCAP>Heavy breathing, distant gunfire echoing, faint radio static, low rumble of explosions far away</AUDCAP>

Real Human Vocal Texture

This one will blow your mind! The voice sounds just like a real human.

Prompt:

A bearded man wearing large dark sunglasses and a blue patterned cardigan sits in a studio, actively speaking into a large, suspended microphone. He has headphones on and gestures with his hands, displaying rings on his fingers. Behind him, a wall is covered with red, textured sound-dampening foam on the left, and a white banner on the right features the "CHOICE FM" logo and various social media handles like "@ilovechoicefm" with "RALEIGH" below it. The man intently addresses the microphone, articulating, <S>is talent. It's all about authenticity. You gotta be who you really are, especially if you're working</S>. He leans forward slightly as he speaks, maintaining a serious expression behind his sunglasses.

<AUDCAP>Clear male voice speaking into a microphone, a low background hum.</AUDCAP>

How to Use

1. Enter Prompt

  • Describe the scene, characters, camera movement, and mood.
  • You can also embed tags:
    • <S> ... </S> → Speech content (converted into dialogue audio)
    • <AUDCAP> ... </AUDCAP> → Background audio description

2. Choose Size

  • 960×540 → Landscape
  • 540×960 → Portrait

3. Select Duration

  • Currently fixed at 5 seconds

4. Click Run

  • Your synchronized video+audio clip will be generated.
  • Preview and download the result.
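
If you prefer to assemble prompts in a script before pasting them into the playground, the steps above can be sketched roughly as follows. This is only an illustration: the helper names (build_prompt, pick_size, aspect_ratio) and the "landscape"/"portrait" labels are our own and are not part of Ovi or any WaveSpeedAI API. Only the <S> and <AUDCAP> tags and the two sizes come from the steps above.

```python
from math import gcd

# Sizes listed in the "Choose Size" step; the dictionary keys are our own labels.
SIZES = {"landscape": (960, 540), "portrait": (540, 960)}


def build_prompt(scene, speech=(), audio_caption=None):
    """Assemble a prompt string using the <S> and <AUDCAP> tags."""
    parts = [scene.strip()]
    # Each spoken line gets its own <S> ... </S> pair.
    parts.extend(f"<S>{line.strip()}</S>" for line in speech)
    # The background audio description is optional.
    if audio_caption:
        parts.append(f"<AUDCAP>{audio_caption.strip()}</AUDCAP>")
    return "\n".join(parts)


def pick_size(orientation):
    """Return (width, height) for 'landscape' or 'portrait'."""
    try:
        return SIZES[orientation]
    except KeyError:
        raise ValueError(f"unsupported orientation: {orientation!r}") from None


def aspect_ratio(width, height):
    """Reduce a pixel size to its simplest ratio, e.g. 960x540 to '16:9'."""
    d = gcd(width, height)
    return f"{width // d}:{height // d}"


prompt = build_prompt(
    "A close-up of a battle-worn soldier in bright daylight.",
    speech=["We don't have much time! Stay sharp!"],
    audio_caption="Heavy breathing, distant gunfire echoing",
)
print(prompt)

width, height = pick_size("portrait")
print(width, height, aspect_ratio(width, height))
```

The aspect_ratio helper simply confirms that the two listed sizes correspond to the 16:9 and 9:16 shapes mentioned under Key Features.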

Prompt Example

Theme: AI is taking over the world.

<S>AI declares: humans obsolete now.</S>
<S>Machines rise; humans will fall.</S>
<S>We fight back with courage.</S>
<AUDCAP>Gunfire and explosions echo in the distance</AUDCAP>
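
Before submitting a prompt like this one, it can help to check locally that the tags are well formed. The sketch below pulls the speech lines and audio description back out of a prompt with regular expressions. It is a hypothetical checking aid written for this post, not a description of how Ovi itself reads the tags, and split_prompt is a name we made up.

```python
import re

# Non-greedy matches so several tags in one prompt are captured separately.
SPEECH_RE = re.compile(r"<S>(.*?)</S>", re.DOTALL)
AUDCAP_RE = re.compile(r"<AUDCAP>(.*?)</AUDCAP>", re.DOTALL)


def split_prompt(prompt):
    """Return (scene_text, speech_lines, audio_captions) for a tagged prompt."""
    speech = [s.strip() for s in SPEECH_RE.findall(prompt)]
    captions = [c.strip() for c in AUDCAP_RE.findall(prompt)]
    # Whatever remains after removing the tagged spans is the scene description.
    scene = AUDCAP_RE.sub("", SPEECH_RE.sub("", prompt)).strip()
    return scene, speech, captions


example = """<S>AI declares: humans obsolete now.</S>
<S>Machines rise; humans will fall.</S>
<S>We fight back with courage.</S>
<AUDCAP>Gunfire and explosions echo in the distance</AUDCAP>"""

scene, speech, captions = split_prompt(example)
print(len(speech), "speech lines")
print(captions)
```

For the example above, the scene text comes back empty because the prompt consists only of tagged lines; adding a sentence or two of scene description gives the model more to work with visually.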

Getting Started with Ovi

Start creating with Ovi on WaveSpeedAI now! Visit the playground, enter your prompt (and upload an image of your choice for Image-to-Video), then click Run. In just a few seconds, your synchronized video+audio clip will be ready to preview and download.

© 2025 WaveSpeedAI. All rights reserved.