Hunyuan Video Foley

Upload a video and provide a text description to generate realistic audio.

Features

HunyuanVideo-Foley

What is HunyuanVideo-Foley?

HunyuanVideo-Foley is Tencent Hunyuan’s video-to-audio model that synthesizes realistic Foley and ambient sound directly from video. It aligns on-screen actions and scene context to produce timing-accurate, high-quality audio tracks.

Why this?

Traditional audio generators often generalize poorly to new scenes, drift out of semantic alignment with the picture, and produce noisy output. HunyuanVideo-Foley is designed to address these three problems.

What it can do

  • Multi-scene synchronization – High-quality audio aligned to complex, fast-cut visuals.
  • Multi-modal balance – Blends visual cues with optional text prompts for intent-aware sound.
  • 48 kHz hi-fi output – Professional clarity with low noise and artifacts.
  • SOTA performance – Leading results in fidelity, sync, and semantic alignment benchmarks.

From short clips to cinematic cuts

Whether you’re polishing a social clip or finishing an animated short, HunyuanVideo-Foley can help.

Example (ASMR):

  • Silent video description: close-up of hands slicing fresh kiwi on a wooden board; crisp macro textures; soft natural light.
  • Text prompt: Generate realistic kiwi cutting and peeling sounds; gentle tapping; calm ASMR ambience.

Designed for

  • Post & Studios – Fast Foley passes for animatics, rough cuts, and indie films.
  • Creators & Social Teams – Auto-sound shorts/reels with consistent timing.
  • Education & Prototyping – Demonstrate AV alignment or test sound design ideas quickly.

How to Use (HunyuanVideo-Foley)

  1. Upload video (required) – Add the silent (or low-sound) clip you want to add sound to.

  2. Write prompt (optional) – Briefly describe the mood or key sounds, e.g.

    • Rainy street ambience, soft footsteps, distant cars.
    • Kitchen ASMR: chopping vegetables, sizzling pan.

  3. Set seed (optional) – Use a fixed number to reproduce the same result; change it for variants. The default, -1, picks a random seed.

  4. Run – Click Run (the button shows the cost).

  5. Review & iterate – If timing or tone isn’t right, tweak the prompt or seed and run again.

Authentication

For authentication details, please refer to the Authentication Guide.

API Endpoints

Submit Task & Query Result


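# Submit the task
# "video" is required; the value below is a placeholder, replace it with your own video.
# "prompt" and "seed" are optional.
curl --location --request POST "https://api.wavespeed.ai/api/v3/wavespeed-ai/hunyuan-video-foley" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
    "video": "https://example.com/your-video.mp4",
    "prompt": "Rainy street ambience, soft footsteps, distant cars.",
    "seed": -1
}'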

# Get the result
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
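
The two calls above can be combined into a submit-then-poll loop. The following is a minimal Python sketch using only the standard library, not an official client. It assumes that the `data.id` field of the submit response is the `requestId` expected by the result endpoint, that the result endpoint returns the same `data` object described under Response Parameters, and that a 2-second polling interval is acceptable. The helper names `http_json` and `submit_and_wait` are hypothetical.

```python
import json
import time
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"
MODEL_PATH = "wavespeed-ai/hunyuan-video-foley"


def http_json(method, url, api_key, body=None):
    """Send one request with a Bearer token and return the decoded JSON reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Authorization": f"Bearer {api_key}"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    req = urllib.request.Request(url, data=data, headers=headers, method=method)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def submit_and_wait(api_key, payload, send=http_json,
                    interval=2.0, max_polls=150, sleep=time.sleep):
    """Submit a task, then poll until its status is completed or failed.

    Assumption: data.id from the submit response is the requestId used in
    the result URL. `send` and `sleep` are injectable so the loop can be
    exercised without network access.
    """
    submitted = send("POST", f"{API_BASE}/{MODEL_PATH}", api_key, payload)
    request_id = submitted["data"]["id"]
    result_url = f"{API_BASE}/predictions/{request_id}/result"

    for _ in range(max_polls):
        data = send("GET", result_url, api_key)["data"]
        if data["status"] == "completed":
            return data["outputs"]
        if data["status"] == "failed":
            raise RuntimeError(data.get("error") or "task failed")
        sleep(interval)  # status is still created or processing

    raise TimeoutError(f"task {request_id} did not finish after {max_polls} polls")
```

A call would look like `submit_and_wait(os.environ["WAVESPEED_API_KEY"], {"video": "https://example.com/your-video.mp4", "seed": -1})`, where the video value is a placeholder. Capping the number of polls keeps a stuck task from blocking the caller indefinitely.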

Parameters

Task Submission Parameters

Request Parameters

| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
| video | string | Yes | - | - | The video for generating the output. |
| prompt | string | No | - | - | The positive prompt for the generation. |
| seed | integer | No | -1 | -1 ~ 2147483647 | The random seed to use for the generation. -1 means a random seed will be used. |
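
The constraints listed above can be checked on the client before a request is sent. The sketch below is one possible way to do that in Python. The function name `build_payload` is hypothetical, and the choice to omit `prompt` from the body when it is empty is an assumption about how an optional field is best expressed, not documented API behavior.

```python
SEED_MIN = -1           # -1 asks the service to pick a random seed
SEED_MAX = 2147483647   # upper bound from the parameter table


def build_payload(video, prompt=None, seed=-1):
    """Return a request body dict after validating the documented constraints."""
    if not isinstance(video, str) or not video.strip():
        raise ValueError("video is required and must be a non-empty string")
    # bool is a subclass of int in Python, so reject it explicitly
    if isinstance(seed, bool) or not isinstance(seed, int):
        raise TypeError("seed must be an integer")
    if not SEED_MIN <= seed <= SEED_MAX:
        raise ValueError(f"seed must be between {SEED_MIN} and {SEED_MAX}")

    payload = {"video": video, "seed": seed}
    if prompt:  # optional: leave it out when not provided (assumption)
        payload["prompt"] = prompt
    return payload
```

For example, `build_payload("https://example.com/your-video.mp4", prompt="Kitchen ASMR: chopping vegetables, sizzling pan.", seed=42)` returns a dict that can be serialized with `json.dumps` and sent as the request body. The video value here is a placeholder.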

Response Parameters

| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data.id | string | Unique identifier for the prediction (task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
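
To show how these fields fit together, here is a small Python sketch that reads a decoded response into a typed object. The `Prediction` class and `parse_prediction` function are hypothetical names. The sketch assumes the dotted parameter names correspond to nested JSON objects (for example, `data.timings.inference` is `response["data"]["timings"]["inference"]`) and treats missing optional fields defensively.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Prediction:
    id: str
    status: str                        # created, processing, completed, or failed
    outputs: List[str] = field(default_factory=list)
    error: str = ""
    inference_ms: Optional[int] = None

    @property
    def is_finished(self) -> bool:
        """True once the task can no longer change state."""
        return self.status in ("completed", "failed")


def parse_prediction(response: dict) -> Prediction:
    """Extract the commonly used fields from a decoded JSON response."""
    data = response["data"]
    timings = data.get("timings") or {}
    return Prediction(
        id=data["id"],
        status=data["status"],
        outputs=list(data.get("outputs") or []),
        error=data.get("error") or "",
        inference_ms=timings.get("inference"),
    )
```

A caller can check `is_finished` to decide whether to keep polling, then read `outputs` on success or `error` on failure.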

Result Request Parameters

© 2025 WaveSpeedAI. All rights reserved.