
video-to-audio
Your request will cost $0.05 per run.
For $1 you can run this model approximately 20 times.
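The pricing above is simple arithmetic, and a small helper can make budgeting explicit. This is only a sketch: the $0.05 figure is taken from the page, while the function names `runs_for_budget` and `cost_of_runs` are made up for illustration and are not part of any official SDK.

```python
from decimal import Decimal

# Price per run as quoted on this page; adjust if the listed price changes.
COST_PER_RUN = Decimal("0.05")


def runs_for_budget(budget: str) -> int:
    """Return how many whole runs fit in a budget given in dollars."""
    # Decimal avoids binary floating-point rounding on currency values.
    return int(Decimal(budget) // COST_PER_RUN)


def cost_of_runs(runs: int) -> Decimal:
    """Return the total cost in dollars for a number of runs."""
    return COST_PER_RUN * runs


if __name__ == "__main__":
    print(runs_for_budget("1"))   # whole runs that fit in $1
    print(cost_of_runs(8))        # total cost of 8 iterations
```

Because each prompt or seed tweak is a separate run, the second helper gives a quick estimate of what an iteration session might cost.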
HunyuanVideo-Foley is Tencent Hunyuan's video-to-audio model that synthesizes realistic Foley and ambient sound directly from video. It matches the generated audio to on-screen actions and scene context, producing timing-accurate, high-quality audio tracks.
Traditional audio generators struggle with generalization, semantic alignment, and audio quality. HunyuanVideo-Foley is designed to address all three.
Whether you’re polishing a social clip or finishing an animated short, HunyuanVideo-Foley can help.
Example (ASMR):
Upload video (required) – Add the silent (or low-sound) clip you want to add sound to.
Write prompt (optional) – Briefly describe the mood or key sounds, e.g.
Set seed – Use a fixed number to reproduce the same result; change it to get variants.
Run – Click Run (the button shows the cost).
Review & iterate – If timing or tone isn't right, tweak the prompt or seed and run again.
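The steps above can be sketched as a small helper that assembles the inputs for one run and then derives seed variants for iteration. This is a hedged illustration only: the field names `video`, `prompt`, and `seed`, the helper names, and the seed range are assumptions, not a documented API, and the code does not send anything to a server.

```python
import random
from typing import Optional


def build_request(video_path: str,
                  prompt: Optional[str] = None,
                  seed: Optional[int] = None) -> dict:
    """Assemble the inputs for one run (hypothetical field names)."""
    if not video_path:
        raise ValueError("a video is required")
    if seed is None:
        # No seed given: pick one at random (range is an assumption).
        seed = random.randrange(2**31)
    request = {"video": video_path, "seed": seed}
    if prompt:
        # The prompt is optional; include it only when provided.
        request["prompt"] = prompt.strip()
    return request


def seed_variants(base: dict, count: int) -> list:
    """Return copies of a request that differ only in their seed."""
    return [{**base, "seed": base["seed"] + i} for i in range(1, count + 1)]


if __name__ == "__main__":
    first = build_request("clip.mp4", seed=42)
    for variant in seed_variants(first, 3):
        print(variant)
```

Keeping the video and prompt fixed while changing only the seed makes it easier to compare variants; changing the prompt instead targets tone or specific sounds.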