
Ovi Text to Video API


Ovi is a Veo 3-like model that converts text or text+image prompts into synchronized video with audio. It is served through a ready-to-use REST inference API with no cold starts and affordable pricing.


Ovi

Ovi is a video+audio generation model, inspired by Veo 3, that creates synchronized video and audio from text or text+image inputs. It is designed for fast, high-quality, short-form generation in landscape or portrait aspect ratios.

🌟 Key Features

  • 🎬 Video + Audio Generation – Create fully synchronized audiovisual content in one step.
  • 📝 Flexible Input – Works with text-only or text+image prompts.
  • ⏱️ Short-form Output – Generates 5-second clips (24 FPS, 540p).

💲 Pricing

Video Length | Resolution / Aspect | Cost (USD)
5 seconds    | 960×540 / 540×960   | $0.15
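
As a rough budgeting aid, the flat per-clip price above can be turned into a small calculator. This is only a sketch: the function names are made up for illustration, and it assumes every run is billed at the listed $0.15 with no other fees.

```python
# Budget arithmetic for the flat price in the table above.
# Prices are kept in integer cents to avoid floating-point rounding surprises.
PRICE_CENTS_PER_CLIP = 15   # $0.15 per run, from the pricing table
CLIP_SECONDS = 5            # every clip is currently 5 seconds long


def clips_for_budget(budget_usd: int) -> int:
    """Number of whole clips a budget in whole dollars can pay for."""
    return (budget_usd * 100) // PRICE_CENTS_PER_CLIP


def cost_usd(num_clips: int) -> float:
    """Total cost in dollars for a given number of clips."""
    return num_clips * PRICE_CENTS_PER_CLIP / 100


if __name__ == "__main__":
    clips = clips_for_budget(10)
    print(clips)                 # 66 clips for $10
    print(clips * CLIP_SECONDS)  # 330 seconds of footage
    print(cost_usd(12))          # 1.8
```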

🎨 How to Use

  1. Enter Prompt
     • Describe the scene, characters, camera movement, and mood.
     • You can also embed tags:
       • <S>...<E> → Speech content (converted into dialogue audio)
       • <AUDCAP>...<ENDAUDCAP> → Background audio description
  2. Choose Size
     • 960×540 → Landscape
     • 540×960 → Portrait
  3. Select Duration
     • Currently fixed at 5 seconds
  4. Click Run
     • Your synchronized video+audio clip will be generated.
     • Preview and download the result.
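
The same choices can be collected in code before calling the REST API. The sketch below only builds and validates a request body; it does not send anything. The field names ("prompt", "size", "duration") and the "960x540" size string are assumptions made for illustration and are not taken from the API reference, so check them against the official documentation before use.

```python
import json

# The two sizes listed in the steps above, keyed by orientation.
ALLOWED_SIZES = {"landscape": (960, 540), "portrait": (540, 960)}
DURATION_SECONDS = 5  # currently the only supported duration


def build_request(prompt: str, orientation: str = "landscape") -> dict:
    """Validate the inputs and return a request body as a plain dict.

    Field names and the size string format are hypothetical.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if orientation not in ALLOWED_SIZES:
        raise ValueError(f"orientation must be one of {sorted(ALLOWED_SIZES)}")
    width, height = ALLOWED_SIZES[orientation]
    return {
        "prompt": prompt,
        "size": f"{width}x{height}",   # assumed format
        "duration": DURATION_SECONDS,  # assumed field
    }


if __name__ == "__main__":
    body = build_request("A lighthouse at dusk, slow aerial pan", "portrait")
    print(json.dumps(body, indent=2))
```

Rejecting unsupported sizes on the client side avoids paying for a run that the service would reject or resize.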

📝 Prompt Example

Theme: AI is taking over the world

<S>AI declares: humans obsolete now.<E>
<S>Machines rise; humans will fall.<E>
<S>We fight back with courage.<E>
<AUDCAP>Gunfire and explosions echo in the distance<ENDAUDCAP>
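
A prompt in this shape can also be assembled programmatically. The helper below is a minimal sketch, and `build_prompt` is a hypothetical name, not part of Ovi or any SDK. It wraps each speech line in `<S>...<E>` and the optional audio description in `<AUDCAP>...<ENDAUDCAP>`, one item per line as in the example above.

```python
from typing import Iterable, Optional


def build_prompt(
    scene: str,
    speech_lines: Iterable[str],
    audio_caption: Optional[str] = None,
) -> str:
    """Join a scene description, tagged speech lines, and an optional
    tagged audio description into a single newline-separated prompt."""
    parts = [scene.strip()]
    # Each spoken line gets its own <S>...<E> pair.
    parts += [f"<S>{line.strip()}<E>" for line in speech_lines]
    if audio_caption:
        parts.append(f"<AUDCAP>{audio_caption.strip()}<ENDAUDCAP>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(
        build_prompt(
            "Theme: AI is taking over the world",
            ["AI declares: humans obsolete now.", "We fight back with courage."],
            "Gunfire and explosions echo in the distance",
        )
    )
```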

🙏 Acknowledgements

  • Wan2.2 – Video backbone initialization
  • MMAudio – Audio encoder/decoder inspiration

⭐ Citation

If Ovi is useful, please ⭐ the repo and cite the paper:

@misc{low2025ovitwinbackbonecrossmodal,
 title={Ovi: Twin Backbone Cross-Modal Fusion for Audio-Video Generation}, 
 author={Chetwin Low and Weimin Wang and Calder Katyal},
 year={2025},
 eprint={2510.01284},
 archivePrefix={arXiv},
 primaryClass={cs.MM},
 url={https://arxiv.org/abs/2510.01284}, 
}