Kwaivgi Kling Video To Audio
Generate sound effects from video using KlingAI’s advanced audio generation model. Extract or generate matching audio tracks for your videos automatically.
Features
Kwaivgi Kling Video-to-Audio
Kling Video-to-Audio adds a complete soundtrack to a silent video using two short prompts: one for sound effects (SFX) and one for background music (BGM). It generates synchronized foley, ambience, and score cues that match on-screen action. Great for trailers, shorts, product shots, and mood pieces.
Highlights
- Prompt-based SFX and BGM that follow scene energy and timing
- Optional ASMR mode for hyper-detailed, close-mic textures
- Works with cinematic, documentary, gameplay, and product footage
- Fast iteration: tweak prompts, re-render, and compare
Parameters
- video (required): URL or upload of the silent clip that needs sound.
- sound_effect_prompt: Describe the on-screen events and textures you want to hear. Example: “Thunderstorm, heavy rain, distant thunder rolls, glass rattling, wind gusts, ocean waves slamming rocks.”
- bgm_prompt: Describe musical mood, instrumentation, and pacing. Example: “Brooding orchestral score, low strings, sparse piano hits, slow build with sub-bass swells.”
- asmr_mode (checkbox): Enhances micro-details and proximity effect for immersive listening (ear-tingles, crisp foley).
How to Use
- Upload the video or paste its URL.
- Write a concise sound_effect_prompt for foley/ambience.
- Add a bgm_prompt for the musical bed.
- Toggle asmr_mode if you want ultra-detailed textures.
- Click Run and download the generated audio track aligned to your clip.
Prompting Tips
- Be concrete: call out specific events, materials, and distances. Example: “Leather jacket rustle, footsteps on wet concrete, elevator ding, neon hum.”
- For BGM, specify tempo and structure.
- Keep SFX and BGM prompts stylistically consistent to avoid clashes.
- If dialogue is needed, add it in post; this model focuses on SFX and score.
Output
- An audio track designed to sync with the input video’s duration.
- Format and delivery follow platform defaults (download URL in the response).
Pricing
- Per-job pricing is $0.035
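Because iterating on prompts means re-running the job, it can help to estimate spend up front. The sketch below multiplies the listed per-job price by a run count. It assumes every re-render is billed as one separate job, which this page does not state explicitly; `estimate_cost` is a hypothetical helper name.

```python
from decimal import Decimal

# Per-job price listed above. Decimal avoids binary floating-point rounding.
PRICE_PER_JOB_USD = Decimal("0.035")


def estimate_cost(runs):
    """Return the estimated cost in USD for `runs` jobs.

    Assumption: each run (including every prompt tweak and re-render)
    is billed as one job at the listed price.
    """
    if runs < 0:
        raise ValueError("runs must be non-negative")
    return PRICE_PER_JOB_USD * runs


print(estimate_cost(20))  # 20 iterations on one clip
```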
Notes
- Start with clean, final-cut footage; large edits after sound design will desync cues.
- The output is unmastered by design; normalize or master it in your editor to your target LUFS.
- Ensure you have rights to the video content you upload and follow platform policies for generated audio.
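For the loudness note, the arithmetic behind a simple normalization pass is small enough to sketch. The function below only converts a loudness difference into a gain; it assumes you have already measured the track's integrated loudness with a meter in your editor, and `gain_to_target` is a hypothetical helper name, not part of this API.

```python
def gain_to_target(measured_lufs, target_lufs):
    """Return (gain in dB, linear amplitude factor) to reach the target loudness.

    Loudness units are on a decibel scale, so the required gain is the
    difference between target and measured loudness. The linear factor is
    what you would multiply the samples by.
    """
    gain_db = target_lufs - measured_lufs
    linear = 10 ** (gain_db / 20)
    return gain_db, linear


# Example values are illustrative: a track measured at -23 LUFS, target -14 LUFS.
gain_db, linear = gain_to_target(-23.0, -14.0)
print(gain_db, round(linear, 3))  # 9.0 2.818
```

A positive static gain can push peaks into clipping, so check peak levels (or apply a limiter) after normalizing.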
Authentication
For authentication details, please refer to the Authentication Guide.
API Endpoints
Submit Task & Query Result
```bash
# Submit the task
curl --location --request POST "https://api.wavespeed.ai/api/v3/kwaivgi/kling-video-to-audio" \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
  --data-raw '{
    "video": "https://d1q70pf5vjeyhc.cloudfront.net/predictions/d3ef72f0c81049c0a9fa2473421aca69/1.mp4",
    "asmr_mode": false
  }'

# Get the result (set requestId to the data.id value returned by the submit call)
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
  --header "Authorization: Bearer ${WAVESPEED_API_KEY}"
```
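The same submit-then-query flow can be sketched in Python using only the standard library. This is a minimal sketch under some assumptions: the submit response is assumed to carry the task ID in `data.id` and the result response to carry `data.status`, `data.outputs`, and `data.error` as listed in the response parameters; the polling interval and poll limit are arbitrary choices; `http_json` and `generate_audio` are hypothetical helper names.

```python
import json
import time
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"


def http_json(method, url, api_key, body=None):
    """Send one request with a Bearer token and return the decoded JSON reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(url, data=data, method=method)
    req.add_header("Authorization", f"Bearer {api_key}")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)


def generate_audio(payload, api_key, send=http_json, interval=2.0, max_polls=150):
    """Submit a task, then poll the result endpoint until it completes or fails.

    `send` is injectable so the flow can be exercised without network access.
    """
    submitted = send("POST", f"{API_BASE}/kwaivgi/kling-video-to-audio", api_key, payload)
    task_id = submitted["data"]["id"]  # assumed location of the task ID
    result_url = f"{API_BASE}/predictions/{task_id}/result"
    for _ in range(max_polls):
        data = send("GET", result_url, api_key)["data"]
        if data["status"] == "completed":
            return data["outputs"]
        if data["status"] == "failed":
            raise RuntimeError(data.get("error") or "task failed")
        time.sleep(interval)  # still created or processing
    raise TimeoutError(f"task {task_id} did not finish after {max_polls} polls")


# Usage (requires a real key in the WAVESPEED_API_KEY environment variable):
#   import os
#   urls = generate_audio({"video": "<video URL>", "asmr_mode": False},
#                         os.environ["WAVESPEED_API_KEY"])
```

Passing the HTTP function in as `send` keeps the polling logic separate from the transport, so it can be swapped for another client or a test stub.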
Parameters
Task Submission Parameters
Request Parameters
| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
| video | string | Yes | - | - | The video for generating the output. |
| sound_effect_prompt | string | No | - | - | Text prompt for sound effect generation, maximum 200 characters |
| bgm_prompt | string | No | - | - | Text prompt for background music generation, maximum 200 characters |
| asmr_mode | boolean | No | false | - | Enable ASMR mode to enhance detailed sound effects, suitable for immersive content scenarios |
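The constraints in the table can be checked client-side before submitting. The sketch below builds the request body and enforces the 200-character prompt limit. It assumes the limit is counted in characters as Python's `len` counts them, which this page does not specify; `build_payload` is a hypothetical helper name.

```python
MAX_PROMPT_CHARS = 200  # limit stated for both prompts in the table above


def build_payload(video, sound_effect_prompt=None, bgm_prompt=None, asmr_mode=False):
    """Build the JSON body for a task submission, omitting unset prompts."""
    if not isinstance(video, str) or not video.strip():
        raise ValueError("video must be a non-empty URL string")
    payload = {"video": video, "asmr_mode": bool(asmr_mode)}
    prompts = (
        ("sound_effect_prompt", sound_effect_prompt),
        ("bgm_prompt", bgm_prompt),
    )
    for name, prompt in prompts:
        if prompt is None:
            continue  # optional parameter left out of the request
        if len(prompt) > MAX_PROMPT_CHARS:
            raise ValueError(
                f"{name} is {len(prompt)} characters; the limit is {MAX_PROMPT_CHARS}"
            )
        payload[name] = prompt
    return payload


# The video URL here is a placeholder, not a real asset.
body = build_payload(
    "https://example.com/clip.mp4",
    sound_effect_prompt="Footsteps on wet concrete, elevator ding, neon hum.",
    bgm_prompt="Brooding orchestral score, low strings, slow build.",
)
```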
Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data.id | string | Unique identifier for the prediction (the task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
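A response with these fields might be handled as in the sketch below, which returns the output URLs only once the task has completed. The sample response is made up for illustration and is not captured from the API, and `extract_outputs` is a hypothetical helper name.

```python
def extract_outputs(response):
    """Return the list of output URLs, or None while the task is unfinished.

    Raises RuntimeError if the request itself failed or the task failed.
    """
    if response.get("code") != 200:
        raise RuntimeError(
            f"request failed: {response.get('code')} {response.get('message')}"
        )
    data = response["data"]
    status = data["status"]
    if status == "failed":
        raise RuntimeError(data.get("error") or "task failed without an error message")
    if status != "completed":
        return None  # created or processing: outputs is still empty
    return list(data["outputs"])


# Illustrative response only; all values are placeholders.
sample = {
    "code": 200,
    "message": "success",
    "data": {
        "id": "example-task-id",
        "status": "completed",
        "outputs": ["https://example.com/audio-output"],
        "error": "",
    },
}
print(extract_outputs(sample))  # ['https://example.com/audio-output']
```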