
Add music, voiceovers, and sound effects to your videos with WaveSpeedAI’s audio-for-video tools.

MMaudio v2 produces synchronized audio from video or text inputs, ideal for adding soundtracks to videos when paired with video models. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

Kling Video-to-Audio auto-generates or extracts matching sound effects and audio tracks from video using KlingAI's audio generation model. Ready-to-use REST API, best performance, no cold starts, affordable pricing.

Kling Text-to-Audio turns text prompts into custom sound effects for videos, games, and multimedia using KlingAI's audio model. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

HunyuanVideo-Foley generates realistic Foley and ambient audio from an uploaded video, using a text prompt to describe the desired sounds. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

ACE-Step Prompt-to-Audio creates music from simple prompts, auto-generating genre tags and lyrics for quick song creation. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

Mirelo SFX V1.5 generates sound effects synchronized to any video, enhancing the visuals with matching audio. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

ElevenLabs Dubbing automatically translates and dubs video and audio content into other languages while preserving the original speakers' voices. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

Mirelo SFX V1 Video-to-Audio generates synchronized sound effects from video input, guided by a text prompt. Supports generating multiple samples and a customizable duration. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.
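Run any model in the Audio for Video collection through a single REST API. Pay per generation, with no subscription and no minimum spend, on infrastructure with 99.9% availability and industry-leading latency.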
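
As a rough illustration of what "a single REST API" can look like from the client side, here is a minimal sketch that builds (but does not send) one request per model run. The base URL, path layout, model identifier, bearer-token header, and input field names are all placeholders assumed for illustration; the actual endpoint and parameters are whatever each model's page documents.

```python
import json
import urllib.request

# Hypothetical base URL and path layout, used only for illustration.
# The real endpoint is documented on each model's page.
BASE_URL = "https://api.example.com/v1"


def build_request(model_id: str, inputs: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for a single model run."""
    body = json.dumps(inputs).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/{model_id}",
        data=body,
        headers={
            # Bearer-token auth is an assumption, not a documented scheme.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder model id and input fields; real names will differ per model.
req = build_request(
    "example-video-to-audio",
    {"video": "https://example.com/clip.mp4", "prompt": "rain on a tin roof"},
    "YOUR_API_KEY",
)
print(req.get_method(), req.full_url)
```

Because only the model identifier and the input dictionary change between calls, switching from one model in the collection to another would, under these assumptions, mean changing two arguments rather than integrating a new client.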
Every Audio for Video model is priced per call. Prices are listed on each model's page, and there are no additional platform fees.
Most Audio for Video image models finish in under 2 seconds. Video and 3D models run several times faster than self-hosted setups.
Multi-region failover and automatic retries keep your production traffic online, even during provider outages.
Each model lists its own per-call price on its model page. We bill per successful generation, with no subscription fee or minimum spend.
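Image models in this collection typically finish in under 2 seconds. Video and 3D models depend on duration and resolution, but are usually several times faster than self-hosted runs.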
Yes. Every account receives $1 in free credit at sign-up, enough to try most Audio for Video models without a credit card.
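Standard accounts have generous concurrent-task limits. Enterprise plans offer custom RPM, higher concurrency, and dedicated capacity. Contact sales for details.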