Kwaivgi Kling Video To Audio

Kling Video-to-Audio generates sound effects and audio tracks that match a video, using KlingAI’s audio generation model. It is offered as a ready-to-use REST API with no cold starts and affordable pricing.

Features

Kwaivgi Kling Video-to-Audio

Kling Video-to-Audio adds a complete soundtrack to a silent video using two short prompts: one for sound effects (SFX) and one for background music (BGM). It generates synchronized foley, ambience, and score cues that match on-screen action. Great for trailers, shorts, product shots, and mood pieces.

Highlights

Prompt-based SFX and BGM that follow scene energy and timing
Optional ASMR mode for hyper-detailed, close-mic textures
Works with cinematic, documentary, gameplay, and product footage
Fast iteration: tweak prompts, re-render, and compare

Parameters

video (required): URL or upload of the silent clip to be sonified.
sound_effect_prompt: Describe on-screen events and textures to hear. Example: “Thunderstorm, heavy rain, distant thunder rolls, glass rattling, wind gusts, ocean waves slamming rocks.”
bgm_prompt: Describe musical mood, instrumentation, and pacing. Example: “Brooding orchestral score, low strings, sparse piano hits, slow build with sub-bass swells.”
asmr_mode (checkbox): Enhances micro-details and proximity effect for immersive listening (ear-tingles, crisp foley).

How to Use

Upload or paste the video URL.
Write a concise sound_effect_prompt for foley/ambience.
Add a bgm_prompt for the musical bed.
Toggle asmr_mode if you want ultra-detailed textures.
Click Run and download the generated audio track aligned to your clip.
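
Through the API, the Run step above amounts to submitting a task and then polling until it finishes. The following is a minimal Python sketch of the polling part only, assuming the status values listed in the response tables on this page. `fetch_result` is a hypothetical caller-supplied function that performs the result request and returns the parsed JSON body; the interval and timeout defaults are arbitrary choices, not platform recommendations.

```python
import time
from typing import Any, Callable, Dict

# Status values as listed in the response tables on this page.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_result(
    fetch_result: Callable[[str], Dict[str, Any]],
    task_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll a task until it is completed or failed and return its data object.

    fetch_result is a hypothetical helper: it should call
    GET .../predictions/{task_id}/result and return the decoded JSON body.
    """
    waited = 0.0
    while True:
        data = fetch_result(task_id)["data"]
        status = data["status"]
        if status in TERMINAL_STATUSES:
            if status == "failed":
                raise RuntimeError(f"task {task_id} failed: {data.get('error')}")
            return data
        if waited >= timeout_s:
            raise TimeoutError(f"task {task_id} still {status} after {timeout_s}s")
        sleep(interval_s)
        waited += interval_s
```

Passing `sleep` as a parameter keeps the loop easy to exercise without real delays; in normal use the default `time.sleep` applies.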

Prompting Tips

Be concrete: call out specific events, materials, and distances. Example: “Leather jacket rustle, footsteps on wet concrete, elevator ding, neon hum.”
For BGM, specify tempo/structure.
Keep SFX and BGM prompts stylistically consistent to avoid clashes.
If dialogue is needed, add it in post—this model focuses on SFX and score.
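
One way to keep prompts concrete is to assemble them from a list of individual cues. The sketch below is an illustrative helper, not part of the API: `join_cues` is a hypothetical name, and the 200-character limit comes from the request parameter table on this page.

```python
from typing import List

MAX_PROMPT_CHARS = 200  # prompt limit stated in the request parameter table


def join_cues(cues: List[str], limit: int = MAX_PROMPT_CHARS) -> str:
    """Join sound cues with commas, stopping at the first cue that would
    push the prompt past the character limit."""
    prompt = ""
    for cue in cues:
        cue = cue.strip()
        if not cue:
            continue  # ignore empty entries
        candidate = cue if not prompt else f"{prompt}, {cue}"
        if len(candidate) > limit:
            break
        prompt = candidate
    return prompt
```

Listing the most important cues first means that any cue dropped by the limit is the least essential one.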

Output

An audio track designed to sync with the input video’s duration.
Format and delivery follow platform defaults (download URL in the response).

Pricing

Pricing is $0.035 per job.
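
For budgeting, the per-job price multiplies directly by the number of runs. The sketch below assumes every submitted task, including each re-render during prompt iteration, is billed as one job; that billing assumption is not confirmed by this page.

```python
from decimal import Decimal

PRICE_PER_JOB_USD = Decimal("0.035")  # per-job price stated on this page


def estimate_cost(jobs: int) -> Decimal:
    """Return the estimated cost in USD for a number of jobs."""
    if jobs < 0:
        raise ValueError("jobs must be non-negative")
    return PRICE_PER_JOB_USD * jobs
```

`Decimal` is used instead of `float` so that small per-job amounts add up without rounding drift.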

Notes

Start with clean, final-cut footage; large edits after sound design will desync cues.
Loudness is unmastered by design—normalize or master in your editor to your target LUFS.
Ensure you have rights to the video content you upload and follow platform policies for generated audio.
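
Since the output is unmastered, a normalization pass is usually the first post-processing step. The sketch below only builds a command line for ffmpeg's `loudnorm` filter; using ffmpeg at all is an assumption (any editor or mastering tool works), and the -14 LUFS target, -1 dBTP ceiling, and loudness range of 11 are placeholder values to replace with your own delivery spec.

```python
from typing import List


def loudnorm_command(
    src: str,
    dst: str,
    target_lufs: float = -14.0,   # placeholder integrated loudness target
    true_peak_db: float = -1.0,   # placeholder true-peak ceiling
    lra: float = 11.0,            # placeholder loudness range
) -> List[str]:
    """Build an ffmpeg argument list that normalizes src into dst."""
    audio_filter = f"loudnorm=I={target_lufs}:TP={true_peak_db}:LRA={lra}"
    return ["ffmpeg", "-y", "-i", src, "-af", audio_filter, dst]
```

The returned list can be passed to `subprocess.run(..., check=True)` on a machine where ffmpeg is installed.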

Authentication

For authentication details, please refer to the Authentication Guide.

API Endpoints

Submit Task & Query Result


# Submit the task
curl --location --request POST "https://api.wavespeed.ai/api/v3/kwaivgi/kling-video-to-audio" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
    "video": "https://d1q70pf5vjeyhc.cloudfront.net/predictions/d3ef72f0c81049c0a9fa2473421aca69/1.mp4",
    "asmr_mode": false
}'

# Get the result
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
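
The same two calls can be made from Python with only the standard library. This is a sketch that mirrors the curl commands above, reusing their URLs and headers; the function names and the 30-second timeout are illustrative choices.

```python
import json
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"
MODEL_PATH = "kwaivgi/kling-video-to-audio"


def build_submit_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build the POST request that submits a task."""
    return urllib.request.Request(
        f"{API_BASE}/{MODEL_PATH}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def build_result_request(request_id: str, api_key: str) -> urllib.request.Request:
    """Build the GET request that fetches a task's result."""
    return urllib.request.Request(
        f"{API_BASE}/predictions/{request_id}/result",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def send(request: urllib.request.Request) -> dict:
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

A typical flow would read the key from the `WAVESPEED_API_KEY` environment variable, call `send(build_submit_request(...))`, take `data.id` from the response, and pass it to `build_result_request`.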

Parameters

Task Submission Parameters

Request Parameters

Parameter	Type	Required	Default	Range	Description
video	string	Yes	-	-	The input video to generate audio for. Note that the duration cannot exceed 20s.
sound_effect_prompt	string	No	-	-	Text prompt for sound effect generation, maximum 200 characters
bgm_prompt	string	No	-	-	Text prompt for background music generation, maximum 200 characters
asmr_mode	boolean	No	false	-	Enable ASMR mode to enhance detailed sound effects, suitable for immersive content scenarios
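
The constraints in this table can be checked client-side before a task is submitted. The sketch below is a hypothetical helper that enforces the 200-character prompt limits and treats `video` as mandatory; it does not check the 20-second duration limit, since that would require inspecting the media file.

```python
from typing import Any, Dict, Optional

MAX_PROMPT_CHARS = 200  # limit stated for both prompt parameters


def build_payload(
    video: str,
    sound_effect_prompt: Optional[str] = None,
    bgm_prompt: Optional[str] = None,
    asmr_mode: bool = False,
) -> Dict[str, Any]:
    """Build a request body, omitting prompts that were not provided."""
    if not video:
        raise ValueError("video is required")
    payload: Dict[str, Any] = {"video": video, "asmr_mode": asmr_mode}
    prompts = {
        "sound_effect_prompt": sound_effect_prompt,
        "bgm_prompt": bgm_prompt,
    }
    for name, value in prompts.items():
        if value is None:
            continue
        if len(value) > MAX_PROMPT_CHARS:
            raise ValueError(f"{name} exceeds {MAX_PROMPT_CHARS} characters")
        payload[name] = value
    return payload
```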

Response Parameters

Parameter	Type	Description
code	integer	HTTP status code (e.g., 200 for success)
message	string	Status message (e.g., “success”)
data.id	string	Unique identifier for the prediction (the task ID)
data.model	string	Model ID used for the prediction
data.outputs	array	Array of URLs to the generated content (empty when status is not `completed`)
data.urls	object	Object containing related API endpoints
data.urls.get	string	URL to retrieve the prediction result
data.has_nsfw_contents	array	Array of boolean values indicating NSFW detection for each output
data.status	string	Status of the task: `created`, `processing`, `completed`, or `failed`
data.created_at	string	ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”)
data.error	string	Error message (empty if no error occurred)
data.timings	object	Object containing timing details
data.timings.inference	integer	Inference time in milliseconds

Result Request Parameters

Parameter	Type	Required	Default	Description
id	string	Yes	-	Task ID

Result Response Parameters

Parameter	Type	Description
code	integer	HTTP status code (e.g., 200 for success)
message	string	Status message (e.g., “success”)
data	object	The prediction data object containing all details
data.id	string	Unique identifier for the prediction (the task ID that was requested)
data.model	string	Model ID used for the prediction
data.outputs	array	Array of URLs to the generated content (empty when status is not `completed`)
data.urls	object	Object containing related API endpoints
data.urls.get	string	URL to retrieve the prediction result
data.status	string	Status of the task: `created`, `processing`, `completed`, or `failed`
data.created_at	string	ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”)
data.error	string	Error message (empty if no error occurred)
data.timings	object	Object containing timing details
data.timings.inference	integer	Inference time in milliseconds
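
To work with these fields in code, the response body can be mapped onto a small typed structure. The sketch below assumes the JSON shape described in the table above; `TaskResult` and `parse_result` are illustrative names, and missing optional fields fall back to empty values.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class TaskResult:
    id: str
    status: str
    outputs: List[str]
    error: str
    inference_ms: Optional[int]


def parse_result(body: Dict[str, Any]) -> TaskResult:
    """Extract the commonly used fields from a result response body."""
    data = body["data"]
    timings = data.get("timings") or {}
    return TaskResult(
        id=data["id"],
        status=data["status"],
        outputs=list(data.get("outputs") or []),
        error=data.get("error") or "",
        inference_ms=timings.get("inference"),
    )
```

Since `outputs` is empty until the status is `completed`, callers should check `status` before reading `outputs[0]`.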
