AI Vocal Remover

AI Vocal Remover separates vocals from instrumentals in any audio track. Upload an audio file and choose whether to extract the vocals or the instrumental. Ready-to-use REST inference API, no cold starts, affordable pricing.

Features

AI Vocal Remover

AI Vocal Remover separates vocals and instrumentals from any audio track with a single click. Choose what you want to extract — the clean vocal track or the backing instrumental — and get a studio-quality separated file in seconds.

Perfect for karaoke creation, music production, remixing, and content workflows.

Why Choose This?

Clean separation: Advanced source separation technology isolates vocals and instrumentals with minimal bleed or artifacts.
Two extraction modes: Extract exactly what you need, vocals only or instrumental only, with no extra processing steps.
Works on any audio: Songs, podcasts, live recordings, and mixed tracks are all supported; the model handles a wide range of audio sources.
Fast and affordable: Per-second billing means you pay only for the audio you process.

Parameters

Parameter	Required	Description
audio	Yes	Input audio file to process (URL or file upload).
mode	No	What to extract: vocals (default) or instrumental.

How to Use

Upload your audio — provide the track you want to separate via URL or drag-and-drop.
Select mode — choose vocals to extract the vocal track, or instrumental to extract the backing music.
Submit — download your separated audio file.

Pricing

$0.001 per second of input audio.
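
As a quick illustration of the per-second rate, the sketch below estimates the charge for a track of known length. The function name `estimate_cost` is hypothetical, and the page does not say how the billed amount is rounded, so treat the result as an estimate rather than an exact invoice figure.

```python
from decimal import Decimal

# Rate taken from the pricing line above (USD per second of input audio).
PRICE_PER_SECOND = Decimal("0.001")


def estimate_cost(duration_seconds):
    """Estimate the charge in USD for `duration_seconds` of input audio.

    Hypothetical helper: rounding rules on the real bill are not documented here.
    """
    if duration_seconds < 0:
        raise ValueError("duration must be non-negative")
    # Go through str() so float inputs such as 1.5 convert exactly.
    return PRICE_PER_SECOND * Decimal(str(duration_seconds))


if __name__ == "__main__":
    # A 3.5-minute (210 s) song costs roughly $0.21 at this rate.
    print(estimate_cost(210))
```

Using `Decimal` rather than `float` avoids binary rounding noise when multiplying by 0.001.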

Best Use Cases

Karaoke creation — Strip out vocals to produce a clean instrumental backing track.
Music production & remixing — Isolate vocals or instrumentals for sampling, remixing, and mashups.
Content creation — Remove background music from recordings or extract a clean vocal for voiceover work.
Practice & education — Isolate individual elements of a track to study arrangement or performance.

Pro Tips

High-quality, well-mixed audio produces the cleanest separation results.
Tracks with a strong stereo mix and clear frequency separation between vocals and instruments work best.
Use vocals mode to get a clean a cappella track, and instrumental mode to get a karaoke-ready backing.

Notes

audio is the only required field; mode defaults to vocals if not specified.
Ensure audio URLs are publicly accessible if using a link rather than a direct upload.
Pricing is based on the duration of the input audio at $0.001 per second.
Please ensure your content complies with WaveSpeed AI’s usage policies.

Authentication

For authentication details, please refer to the Authentication Guide.

API Endpoints

Submit Task & Query Result


# Submit the task ("audio" is required; the URL below is a placeholder)
curl --location --request POST "https://api.wavespeed.ai/api/v3/wavespeed-ai/ai-vocal-remover" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
    "audio": "https://example.com/track.mp3",
    "mode": "vocals"
}'

# Get the result (requestId is the task ID returned as data.id by the submit call)
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
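
The same submission can be sketched in Python using only the standard library. This is a minimal illustration rather than an official client: the helper names `build_submit_request` and `submit` are hypothetical, the example audio URL is a placeholder, and error handling is left out.

```python
import json
import os
import urllib.request

SUBMIT_URL = "https://api.wavespeed.ai/api/v3/wavespeed-ai/ai-vocal-remover"


def build_submit_request(api_key, audio_url, mode="vocals"):
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({"audio": audio_url, "mode": mode}).encode("utf-8")
    return urllib.request.Request(
        SUBMIT_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def submit(request):
    """Send the request and return the decoded JSON response as a dict."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Placeholder URL; replace with a publicly accessible audio file.
    req = build_submit_request(
        os.environ["WAVESPEED_API_KEY"], "https://example.com/track.mp3"
    )
    print(submit(req))
```

Separating request construction from sending keeps the payload easy to inspect before any network call is made.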

Parameters

Task Submission Parameters

Request Parameters

Parameter	Type	Required	Default	Range	Description
audio	string	Yes	-	-	The URL of the input audio file.
mode	string	No	vocals	vocals, instrumental	Output type: vocals (extract vocals) or instrumental (extract accompaniment).
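
The constraints in the table above can be checked on the client before a request is sent. The sketch below is one way to do that; `make_payload` is a hypothetical helper, and the service may apply additional validation that this page does not describe.

```python
# Allowed values for `mode`, as listed in the request parameters table.
ALLOWED_MODES = ("vocals", "instrumental")


def make_payload(audio, mode=None):
    """Return a request body dict, validating it against the documented rules.

    `audio` is required. `mode` is optional; when omitted it is left out of
    the payload so the service applies its documented default (vocals).
    """
    if not isinstance(audio, str) or not audio.strip():
        raise ValueError("audio is required and must be a non-empty URL string")
    payload = {"audio": audio}
    if mode is not None:
        if mode not in ALLOWED_MODES:
            raise ValueError(f"mode must be one of {ALLOWED_MODES}, got {mode!r}")
        payload["mode"] = mode
    return payload
```

For example, `make_payload("https://example.com/track.mp3", "instrumental")` yields a body requesting the backing track, while an unsupported mode such as `"drums"` raises `ValueError` locally instead of costing a round trip.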

Response Parameters

Parameter	Type	Description
code	integer	HTTP status code (e.g., 200 for success)
message	string	Status message (e.g., “success”)
data.id	string	Unique identifier for the prediction (the task ID)
data.model	string	Model ID used for the prediction
data.outputs	array	Array of URLs to the generated content (empty when status is not `completed`)
data.urls	object	Object containing related API endpoints
data.urls.get	string	URL to retrieve the prediction result
data.has_nsfw_contents	array	Array of boolean values indicating NSFW detection for each output
data.status	string	Status of the task: `created`, `processing`, `completed`, or `failed`
data.created_at	string	ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”)
data.error	string	Error message (empty if no error occurred)
data.timings	object	Object containing timing details
data.timings.inference	integer	Inference time in milliseconds
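
To show how these fields fit together, the sketch below pulls the task ID and polling URL out of a submission response. The helper name `parse_submission` and the sample dictionary are illustrative assumptions built from the field list above, not captured API output.

```python
def parse_submission(response):
    """Return (task_id, result_url) from a decoded submission response.

    Raises RuntimeError when the top-level `code` is not 200.
    """
    if response.get("code") != 200:
        raise RuntimeError(
            f"submission failed: {response.get('code')} {response.get('message')}"
        )
    data = response["data"]
    return data["id"], data["urls"]["get"]


# Hypothetical response shaped like the table above (all values are made up).
sample = {
    "code": 200,
    "message": "success",
    "data": {
        "id": "task-123",
        "status": "created",
        "outputs": [],
        "urls": {"get": "https://api.wavespeed.ai/api/v3/predictions/task-123/result"},
    },
}

task_id, result_url = parse_submission(sample)
```

The returned `result_url` corresponds to `data.urls.get`, so a client can poll it directly instead of rebuilding the path from the task ID.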

Result Request Parameters

Parameter	Type	Required	Default	Description
id	string	Yes	-	Task ID

Result Response Parameters

Parameter	Type	Description
code	integer	HTTP status code (e.g., 200 for success)
message	string	Status message (e.g., “success”)
data	object	The prediction data object containing all details
data.id	string	Unique identifier for the prediction (the task ID that was requested)
data.model	string	Model ID used for the prediction
data.outputs	array	Array of URLs to the generated content (empty when status is not `completed`)
data.urls	object	Object containing related API endpoints
data.urls.get	string	URL to retrieve the prediction result
data.status	string	Status of the task: `created`, `processing`, `completed`, or `failed`
data.created_at	string	ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”)
data.error	string	Error message (empty if no error occurred)
data.timings	object	Object containing timing details
data.timings.inference	integer	Inference time in milliseconds
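
Because `data.outputs` stays empty until `data.status` reaches `completed`, a client usually polls the result endpoint. The loop below is a minimal sketch of that pattern: `wait_for_outputs` is a hypothetical name, `fetch_result` is a caller-supplied function that performs the GET request and returns the decoded JSON, and the interval and timeout defaults are arbitrary choices rather than documented limits.

```python
import time


def wait_for_outputs(fetch_result, task_id, interval=1.0, timeout=300.0,
                     sleep=time.sleep):
    """Poll until the task completes and return its list of output URLs.

    fetch_result(task_id) must return the decoded result response (a dict).
    Raises RuntimeError if the task fails and TimeoutError if it takes too long.
    """
    deadline = time.monotonic() + timeout
    while True:
        data = fetch_result(task_id)["data"]
        status = data["status"]
        if status == "completed":
            return data["outputs"]
        if status == "failed":
            raise RuntimeError(data.get("error") or "task failed")
        # Remaining documented statuses (`created`, `processing`) mean "keep waiting".
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish in {timeout} s")
        sleep(interval)
```

Passing the fetch function in as an argument keeps the loop independent of any particular HTTP library and makes it easy to exercise with canned responses.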
