Introducing Ovi: The Super-Fast, Open-Source Model Redefining AI Video Generation
AI models that generate video with sound have been arriving one after another. Feeling overwhelmed by the surge of new models claiming to produce synchronized video and audio?
If you're looking for something fast, smooth, and reliable, try Ovi, the new open-source video+audio generation model from Character.AI. Today, we're excited to announce that Ovi, a super-fast, open-source alternative to Veo 3, is now available on WaveSpeedAI!
Key Features
Synchronized Audio & Video
Create videos with matching AI-generated sound in one step. From a single prompt, Ovi produces a video with lifelike audio!
Flexible Creative Inputs
Ovi supports both Text-to-Video and Image-to-Video creation — just select your preferred input!
Cinematic Quality Clips
Create 5-second video clips at a smooth, cinematic 24 frames per second (FPS).
Made for Any Platform
Select the ideal shape for your video. We support various aspect ratios, including vertical (9:16) for social media stories and widescreen (16:9) for YouTube.
See It in Practice
Superior Cinematic Texture
Prompt:
A close-up of a battle-worn soldier in bright daylight, dirt and scratches on his face, wearing tactical gear. He grips his weapon tightly, breathing heavily, sweat glistening on his skin.
Camera: static medium close-up, keeping the soldier's face and weapon clear in frame with high exposure, no dark shadows. Mood: tense, urgent, cinematic realism.
<S>We don't have much time! Stay sharp!</S>
<AUDCAP>Heavy breathing, distant gunfire echoing, faint radio static, low rumble of explosions far away</AUDCAP>
Real Human Vocal Texture
The voice in this video sounds strikingly close to a real human speaker.
Prompt:
A bearded man wearing large dark sunglasses and a blue patterned cardigan sits in a studio, actively speaking into a large, suspended microphone. He has headphones on and gestures with his hands, displaying rings on his fingers. Behind him, a wall is covered with red, textured sound-dampening foam on the left, and a white banner on the right features the "CHOICE FM" logo and various social media handles like "@ilovechoicefm" with "RALEIGH" below it. The man intently addresses the microphone, articulating,
<S>is talent. It's all about authenticity. You gotta be who you really are, especially if you're working</S>.
He leans forward slightly as he speaks, maintaining a serious expression behind his sunglasses.
<AUDCAP>Clear male voice speaking into a microphone, a low background hum.</AUDCAP>
How to Use
1. Enter Prompt
- Describe the scene, characters, camera movement, and mood.
- You can also embed tags:
  - <S> ... </S> → Speech content (converted into dialogue audio)
  - <AUDCAP> ... </AUDCAP> → Background audio description
2. Choose Size
- 960×540 → Landscape
- 540×960 → Portrait
3. Select Duration
- Currently fixed at 5 seconds
4. Click Run
- Your synchronized video+audio clip will be generated.
- Preview and download the result.
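The prompt format from step 1 and the size options from step 2 can be sketched in a few lines of Python. This only illustrates how the prompt text is assembled. The function names and the settings dictionary are hypothetical and do not represent WaveSpeedAI's API or request schema.

```python
# Sizes and duration as listed in the steps above.
# The keys "landscape" and "portrait" are labels chosen for this sketch.
SIZES = {"landscape": (960, 540), "portrait": (540, 960)}
DURATION_SECONDS = 5  # currently fixed


def build_prompt(scene: str, speech: list[str], audio_caption: str = "") -> str:
    """Join a scene description with <S> and <AUDCAP> tags, one per line."""
    parts = [scene.strip()]
    # Each spoken line gets its own <S> ... </S> tag.
    parts += [f"<S>{line.strip()}</S>" for line in speech]
    # The background audio description is optional.
    if audio_caption.strip():
        parts.append(f"<AUDCAP>{audio_caption.strip()}</AUDCAP>")
    return "\n".join(parts)


def build_settings(prompt: str, orientation: str = "landscape") -> dict:
    """Bundle the prompt with a size and duration (hypothetical field names)."""
    if orientation not in SIZES:
        raise ValueError(f"orientation must be one of {sorted(SIZES)}")
    width, height = SIZES[orientation]
    return {
        "prompt": prompt,
        "width": width,
        "height": height,
        "duration_seconds": DURATION_SECONDS,
    }


if __name__ == "__main__":
    prompt = build_prompt(
        "A bearded man speaks into a studio microphone.",
        ["It's all about authenticity."],
        "Clear male voice speaking into a microphone, a low background hum.",
    )
    print(build_settings(prompt, "portrait"))
```

The values in the returned dictionary correspond to what you would type or select in the playground form.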
Prompt Example
Theme: AI is taking over the world.
<S>AI declares: humans obsolete now.</S>
<S>Machines rise; humans will fall.</S>
<S>We fight back with courage.</S>
<AUDCAP>Gunfire and explosions echo in the distance</AUDCAP>
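Before pasting a tagged prompt like this one into the playground, it can help to check that every tag is closed. The snippet below is a minimal local check, assuming the tags are written exactly as `<S>`, `</S>`, `<AUDCAP>`, and `</AUDCAP>`. The helper names are made up for this example and are not part of Ovi or WaveSpeedAI.

```python
import re

# Non-greedy patterns so several tags in one prompt are matched separately.
SPEECH_RE = re.compile(r"<S>(.*?)</S>", re.DOTALL)
AUDCAP_RE = re.compile(r"<AUDCAP>(.*?)</AUDCAP>", re.DOTALL)


def split_prompt(prompt: str) -> dict:
    """Separate a prompt into scene text, speech lines, and audio captions."""
    speech = [s.strip() for s in SPEECH_RE.findall(prompt)]
    audio = [a.strip() for a in AUDCAP_RE.findall(prompt)]
    # Whatever remains after removing the tagged spans is the scene description.
    scene = AUDCAP_RE.sub("", SPEECH_RE.sub("", prompt))
    scene = " ".join(scene.split())
    return {"scene": scene, "speech": speech, "audio": audio}


def unbalanced_tags(prompt: str) -> list[str]:
    """Return the tag names whose opening and closing counts differ."""
    problems = []
    for tag in ("S", "AUDCAP"):
        if prompt.count(f"<{tag}>") != prompt.count(f"</{tag}>"):
            problems.append(tag)
    return problems


if __name__ == "__main__":
    example = "\n".join([
        "<S>AI declares: humans obsolete now.</S>",
        "<S>Machines rise; humans will fall.</S>",
        "<S>We fight back with courage.</S>",
        "<AUDCAP>Gunfire and explosions echo in the distance</AUDCAP>",
    ])
    print(split_prompt(example))
    print(unbalanced_tags(example))
```

The check only counts opening and closing tags, so it will not catch tags that are closed in the wrong order.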
Getting Started with Ovi
Start creating with Ovi on WaveSpeedAI now! Visit the playground, enter your prompt (and upload an image of your choice if you are using Image-to-Video), and click Run. In just a few seconds, your talking video will be ready to preview and download.
Try it now
Contact us
- Discord: Join our Discord
- X (Twitter): Follow us on X
- Open Source Projects: GitHub Repository
© 2025 WaveSpeedAI. All rights reserved.