OpenAI Whisper Turbo
Instant, accurate speech-to-text powered by Whisper large-v3-turbo. Upload audio and receive multilingual transcripts with automatic language detection and punctuation.
Features
OpenAI Whisper Speech-to-Text
WaveSpeed’s Whisper deployment delivers production-ready speech recognition built on the large-v3-turbo checkpoint. Upload audio (MP3, WAV, FLAC) and receive accurate transcripts with automatic language detection.
Highlights
- Multilingual recognition across 50+ languages
- Automatic punctuation and casing
- Robust to background noise and accents
- Runs on GPU-accelerated infrastructure for fast turnaround
Quick Start
- Provide an audio file or HTTPS URL in the “audio” field.
- Submit the request via API or dashboard.
- Receive a JSON response containing the transcribed text.
Example output:
{
  "outputs": {
    "text": "Hello everyone, welcome to the show."
  }
}
Authentication
For authentication details, please refer to the Authentication Guide.
API Endpoints
Submit Task & Query Result
# Submit the task
curl --location --request POST "https://api.wavespeed.ai/api/v3/wavespeed-ai/openai-whisper" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
  "audio": "https://example.com/audio.mp3",
  "language": "Auto",
  "prompt": "",
  "enable_sync_mode": true
}'
# Get the result
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
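The two curl calls above can be approximated in Python with only the standard library. This is a minimal sketch rather than an official client: the function names (`build_submit_request`, `build_result_request`, `send`) are hypothetical, the audio URL is a placeholder, and the `audio` field is included because the parameter table marks it as required.

```python
import json
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"
SUBMIT_URL = f"{API_BASE}/wavespeed-ai/openai-whisper"


def build_submit_request(audio_url, api_key, language="Auto", prompt="", sync=True):
    """Build (but do not send) the POST request that submits a transcription task."""
    payload = {
        "audio": audio_url,
        "language": language,
        "prompt": prompt,
        "enable_sync_mode": sync,
    }
    return urllib.request.Request(
        SUBMIT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def build_result_request(request_id, api_key):
    """Build (but do not send) the GET request that fetches a task result."""
    return urllib.request.Request(
        f"{API_BASE}/predictions/{request_id}/result",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def send(request, timeout=60):
    """Send a prepared request and decode the JSON body of the response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

To run it, pass the built request to `send`, for example `send(build_submit_request("https://example.com/audio.mp3", api_key))`, where `api_key` holds the value of `WAVESPEED_API_KEY`. Keeping request construction separate from sending makes the payload and headers easy to inspect before any network call is made.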
Parameters
Task Submission Parameters
Request Parameters
Parameter | Type | Required | Default | Range | Description |
---|---|---|---|---|---|
audio | string | Yes | - | - | Audio file to transcribe. Provide an HTTPS URL or upload a file (MP3, WAV, FLAC up to 60 minutes). |
language | string | No | Auto | Afrikaans, Amharic, Arabic, Assamese, Azerbaijani, Bashkir, Belarusian, Bulgarian, Bengali, Tibetan, Breton, Bosnian, Catalan, Czech, Welsh, Danish, German, Greek, English, Spanish, Estonian, Basque, Persian, Finnish, Faroese, French, Galician, Gujarati, Hausa, Hawaiian, Hebrew, Hindi, Croatian, Haitian Creole, Hungarian, Armenian, Indonesian, Icelandic, Italian, Japanese, Javanese, Georgian, Kazakh, Khmer, Kannada, Korean, Latin, Luxembourgish, Lingala, Lao, Lithuanian, Latvian, Malagasy, Maori, Macedonian, Malayalam, Mongolian, Marathi, Malay, Maltese, Myanmar, Nepali, Dutch, Nynorsk, Norwegian, Occitan, Punjabi, Polish, Pashto, Portuguese, Romanian, Russian, Sanskrit, Sindhi, Sinhala, Slovak, Slovenian, Shona, Somali, Albanian, Serbian, Sundanese, Swedish, Swahili, Tamil, Telugu, Tajik, Thai, Turkmen, Tagalog, Turkish, Tatar, Ukrainian, Urdu, Uzbek, Vietnamese, Yiddish, Yoruba, Chinese, Cantonese, Auto | Language spoken in the audio. Set to "Auto" (the default) for automatic language detection. |
prompt | string | No | - | - | Optional text that guides the model's style or continues a previous audio segment. The prompt should be in the same language as the audio. |
enable_sync_mode | boolean | No | true | - | If true, the request waits until the result has been generated and uploaded, then returns it directly in the response. This option is only available through the API. |
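The constraints in the request table can be checked on the client before a task is submitted. The following is a sketch under stated assumptions: `build_payload` and `SAMPLE_LANGUAGES` are hypothetical names, the language set is deliberately partial (the Range column holds the full list), and only the HTTPS URL form of `audio` is handled, not file uploads.

```python
from urllib.parse import urlparse

# Deliberately partial sample; the Range column of the table is the full list.
SAMPLE_LANGUAGES = frozenset(
    {"Auto", "English", "Spanish", "French", "German", "Japanese", "Chinese"}
)


def build_payload(audio, language="Auto", prompt="", enable_sync_mode=True,
                  allowed_languages=SAMPLE_LANGUAGES):
    """Validate the request fields and return them as a JSON-ready dict."""
    parsed = urlparse(audio)
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError("audio must be an HTTPS URL")
    if language not in allowed_languages:
        raise ValueError(f"unsupported language: {language!r}")
    if not isinstance(prompt, str):
        raise TypeError("prompt must be a string")
    if not isinstance(enable_sync_mode, bool):
        raise TypeError("enable_sync_mode must be a boolean")
    return {
        "audio": audio,
        "language": language,
        "prompt": prompt,
        "enable_sync_mode": enable_sync_mode,
    }
```

Pass a complete set through `allowed_languages` when every language in the table should be accepted. Failing fast on the client avoids spending a request on a payload the service would reject.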
Response Parameters
Parameter | Type | Description |
---|---|---|
code | integer | HTTP status code (e.g., 200 for success) |
message | string | Status message (e.g., “success”) |
data.id | string | Unique identifier for the prediction (the task ID) |
data.model | string | Model ID used for the prediction |
data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
data.urls | object | Object containing related API endpoints |
data.urls.get | string | URL to retrieve the prediction result |
data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
data.status | string | Status of the task: created, processing, completed, or failed |
data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
data.error | string | Error message (empty if no error occurred) |
data.timings | object | Object containing timing details |
data.timings.inference | integer | Inference time in milliseconds |
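When a task is not returned synchronously, the `data.status` field in the table above drives a simple polling loop. The sketch below assumes a caller-supplied `fetch` function (hypothetical) that returns the decoded response envelope on each call; the interval and attempt limit are arbitrary illustrative defaults, not documented service limits.

```python
import time
from typing import Callable


def poll_until_done(fetch: Callable[[], dict], interval_s: float = 1.0,
                    max_attempts: int = 60) -> dict:
    """Call fetch() until data.status is terminal and return the data object."""
    for _ in range(max_attempts):
        envelope = fetch()
        data = envelope.get("data", {})
        status = data.get("status")
        if status == "completed":
            return data
        if status == "failed":
            # data.error carries the message; fall back if it is empty.
            raise RuntimeError(data.get("error") or "task failed without an error message")
        # "created" and "processing" both mean the task is still running.
        time.sleep(interval_s)
    raise TimeoutError(f"task not finished after {max_attempts} attempts")
```

Injecting `fetch` keeps the loop independent of any HTTP library, so it can wrap whatever client issues the GET request against `data.urls.get`.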