Hunyuan Video Foley
Playground
Try it on WavespeedAI! Upload a video and provide a text description to generate realistic audio.
Features
HunyuanVideo-Foley
What is HunyuanVideo-Foley?
HunyuanVideo-Foley is Tencent Hunyuan’s video-to-audio model that synthesizes realistic Foley and ambient sound directly from video. It aligns on-screen actions and scene context to produce timing-accurate, high-quality audio tracks.
Why this?
Traditional audio generators struggle with generalization, semantic alignment, and audio quality. HunyuanVideo-Foley is designed to address all three.
What it can do
- Multi-scene synchronization – High-quality audio aligned to complex, fast-cut visuals.
- Multi-modal balance – Blends visual cues with optional text prompts for intent-aware sound.
- 48 kHz hi-fi output – Professional clarity with low noise and artifacts.
- SOTA performance – Leading results in fidelity, sync, and semantic alignment benchmarks.
From short clips to cinematic cuts
Whether you’re polishing a social clip or finishing an animated short, HunyuanVideo-Foley can help.
Example (ASMR):
- Silent video description: close-up of hands slicing fresh kiwi on a wooden board; crisp macro textures; soft natural light.
- Text prompt: Generate realistic kiwi cutting and peeling sounds; gentle tapping; calm ASMR ambience.
Designed for
- Post & Studios – Fast Foley passes for animatics, rough cuts, and indie films.
- Creators & Social Teams – Auto-sound shorts/reels with consistent timing.
- Education & Prototyping – Demonstrate AV alignment or test sound design ideas quickly.
How to Use (HunyuanVideo-Foley)
- Upload video (required) – Add the silent (or low-sound) clip you want to add sound to.
- Write prompt (optional) – Briefly describe the mood or key sounds, e.g.:
  - Rainy street ambience, soft footsteps, distant cars.
  - Kitchen ASMR: chopping vegetables, sizzling pan.
- Set seed – Use a fixed number to reproduce the same result; change it for variants.
- Run – Click Run (the button shows the cost).
- Review & iterate – If timing or tone isn’t right, tweak the prompt or seed and run again.
Authentication
For authentication details, please refer to the Authentication Guide.
API Endpoints
Submit Task & Query Result
# Submit the task ("video" is required; replace the placeholder with your own video)
curl --location --request POST "https://api.wavespeed.ai/api/v3/wavespeed-ai/hunyuan-video-foley" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
"video": "<your-video>",
"seed": -1
}'
# Get the result
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
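The two calls above can be chained into a submit-then-poll loop. The following is a minimal Python sketch using only the standard library, not an official client. It assumes that the submit response carries the task ID in `data.id` (used here as the `requestId` of the result URL) and that a fixed polling interval is acceptable. The function names and the `poll_seconds` parameter are illustrative.

```python
import json
import time
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"
MODEL_PATH = "wavespeed-ai/hunyuan-video-foley"


def build_submit_request(api_key, video, prompt=None, seed=-1):
    """Build (but do not send) the POST request that submits a task."""
    payload = {"video": video, "seed": seed}
    if prompt:
        payload["prompt"] = prompt  # optional, so only sent when provided
    return urllib.request.Request(
        f"{API_BASE}/{MODEL_PATH}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def build_result_request(api_key, request_id):
    """Build (but do not send) the GET request that fetches a task result."""
    return urllib.request.Request(
        f"{API_BASE}/predictions/{request_id}/result",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def send(request):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def generate_audio(api_key, video, prompt=None, seed=-1, poll_seconds=2.0):
    """Submit a task, then poll until it is completed or failed."""
    submitted = send(build_submit_request(api_key, video, prompt, seed))
    request_id = submitted["data"]["id"]  # assumption: data.id is the task ID
    while True:
        data = send(build_result_request(api_key, request_id))["data"]
        if data["status"] == "completed":
            return data["outputs"]
        if data["status"] == "failed":
            raise RuntimeError(data.get("error") or "task failed")
        time.sleep(poll_seconds)  # status is still created or processing
```

Calling `generate_audio(api_key, video, prompt="Rainy street ambience")` with your own key and video would block until the task finishes and return the `data.outputs` list. A production client would likely also want a timeout and retry handling, which this sketch leaves out.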
Parameters
Task Submission Parameters
Request Parameters
| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
| video | string | Yes | - | - | The video for generating the output. |
| prompt | string | No | - | - | The positive prompt for the generation. |
| seed | integer | No | -1 | -1 ~ 2147483647 | The random seed to use for the generation. -1 means a random seed will be used. |
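The constraints in the table above can be checked on the client before a request is sent. The sketch below is one possible way to do that in Python. The helper name `build_payload` is hypothetical, and the checks simply mirror the table (required `video`, optional `prompt`, `seed` between -1 and 2147483647). The API itself may apply further validation that is not documented here.

```python
SEED_MIN = -1           # -1 asks the service to pick a random seed
SEED_MAX = 2147483647   # upper bound from the Range column


def build_payload(video, prompt=None, seed=-1):
    """Return a request body dict, validating it against the parameter table."""
    if not isinstance(video, str) or not video:
        raise ValueError("video is required and must be a non-empty string")
    # bool is a subclass of int in Python, so reject it explicitly
    if isinstance(seed, bool) or not isinstance(seed, int):
        raise TypeError("seed must be an integer")
    if not SEED_MIN <= seed <= SEED_MAX:
        raise ValueError(f"seed must be between {SEED_MIN} and {SEED_MAX}")
    payload = {"video": video, "seed": seed}
    if prompt is not None:
        if not isinstance(prompt, str):
            raise TypeError("prompt must be a string")
        payload["prompt"] = prompt
    return payload
```

Passing the same explicit seed (anything other than -1) with the same video and prompt is what makes a run reproducible; leaving the default of -1 gives a different variant each time.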
Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data.id | string | Unique identifier for the prediction (the task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
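To illustrate how the fields above fit together, here is a small Python sketch that interprets a decoded response body. The function name `extract_outputs` is hypothetical, and the sketch assumes that `data.has_nsfw_contents` lines up index by index with `data.outputs`, as the table suggests.

```python
def extract_outputs(response):
    """Interpret a decoded response dict.

    Returns None while the task is still created or processing, raises
    RuntimeError if it failed, and otherwise returns a list of
    (output_url, nsfw_flag) pairs.
    """
    data = response["data"]
    status = data["status"]
    if status == "failed":
        raise RuntimeError(data.get("error") or "task failed without an error message")
    if status != "completed":
        return None  # outputs are empty until the task is completed
    outputs = data.get("outputs") or []
    flags = data.get("has_nsfw_contents") or []
    # Pair each URL with its flag; use None when no flag exists for that index.
    return [
        (url, flags[i] if i < len(flags) else None)
        for i, url in enumerate(outputs)
    ]
```

A caller would typically keep polling while the function returns `None` and stop on either a returned list or a raised error.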