speech-to-text

wavespeed-ai/openai-whisper

Instant, accurate speech-to-text powered by Whisper large-v3-turbo. Upload audio and receive multilingual transcripts with automatic language detection and punctuation.

Enable this option to generate word-level timestamps for the transcription. Note: This may increase processing time.
If set to true, the request waits until the result has been generated and uploaded before returning, so the result is available directly in the response. This property is only available through the API.

{ "text": "Life is not measured by the moments we breathe, but by those that leave us breathless." }

Your request will cost $0.001 per run.

For $1 you can run this model approximately 1000 times.
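
The pricing arithmetic above can be checked with a short sketch. The only value taken from this page is the $0.001 per-run price; the helper names `runs_for_budget` and `cost_of_runs` are hypothetical and exist only for illustration. `Decimal` is used so the division does not pick up floating-point rounding.

```python
from decimal import Decimal

# Per-run price in USD, as stated on this page.
PRICE_PER_RUN = Decimal("0.001")


def runs_for_budget(budget_usd: str) -> int:
    """Return how many whole runs a budget (in USD) covers."""
    return int(Decimal(budget_usd) // PRICE_PER_RUN)


def cost_of_runs(runs: int) -> Decimal:
    """Return the total cost in USD for a given number of runs."""
    return PRICE_PER_RUN * runs
```

For example, `runs_for_budget("1")` gives 1000, matching the estimate above.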

README

OpenAI Whisper Speech-to-Text

WaveSpeed's Whisper deployment delivers production-ready speech recognition built on the large-v3 checkpoint. Upload audio (MP3, WAV, FLAC) and receive accurate transcripts with automatic language detection.

Highlights

  • Multilingual recognition across 50+ languages
  • Automatic punctuation and casing
  • Robust to background noise and accents
  • Runs on GPU-accelerated infrastructure for fast turnaround

Quick Start

  1. Provide an audio file or HTTPS URL in the "audio" field.
  2. Submit the request via API or dashboard.
  3. Receive a JSON response containing the transcribed text.

Example output:

{
  "outputs": {
    "text": "Hello everyone, welcome to the show."
  }
}
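
The Quick Start steps and the example output above could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the official client: the `"audio"` field and the `outputs.text` response shape come from this README, while the endpoint URL, the Bearer-token header, and the function names are hypothetical placeholders to be replaced with the values from the API documentation. Only the HTTPS URL case is shown; file upload is not covered.

```python
import json
import urllib.request

# Hypothetical placeholder: replace with the real endpoint from the API docs.
ENDPOINT = "https://api.example.com/wavespeed-ai/openai-whisper"


def build_payload(audio_url: str) -> dict:
    """Step 1: put an HTTPS audio URL in the "audio" field."""
    if not audio_url.startswith("https://"):
        raise ValueError("audio must be an HTTPS URL")
    return {"audio": audio_url}


def extract_text(response: dict) -> str:
    """Step 3: read the transcript from the README's example response shape."""
    return response["outputs"]["text"]


def transcribe(audio_url: str, api_key: str, endpoint: str = ENDPOINT) -> str:
    """Step 2: submit the request and return the transcribed text."""
    request = urllib.request.Request(
        endpoint,
        data=json.dumps(build_payload(audio_url)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme; check the API docs for the real one.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as reply:
        return extract_text(json.load(reply))
```

With a real endpoint and key, a call such as `transcribe("https://example.com/talk.mp3", api_key)` would return the transcript string. Keeping `build_payload` and `extract_text` separate from the network call makes the request and response handling easy to test offline.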