Streaming API
Stream output chunks in real time as the model generates them. Ideal for text-to-speech, music generation, and LLM responses.
Note: Streaming requests do not create a prediction record. Capture any data you need from the live stream.
Endpoint
POST https://api.wavespeed.ai/api/v3/{model_id}/stream
Add /stream to any supported model endpoint. The request payload is identical to the standard submission.
Request
curl -X POST "https://api.wavespeed.ai/api/v3/minimax/speech-02-turbo/stream" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"text": "Hello world! This is a test of the text-to-speech system.",
"voice_id": "Energetic_Girl",
"emotion": "happy",
"speed": 1,
"volume": 1
}'
Response
The response is an event stream (text/event-stream). Chunks are relayed directly from the upstream model without buffering.
- Stream closes when the provider finishes
- Terminal status payload (if any) is included in the stream
- Format mirrors the provider’s documentation (e.g., MiniMax uses JSON payloads per event)
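As a minimal sketch of consuming the stream, the helper below parses SSE `data:` lines into JSON payloads, assuming the provider emits one JSON body per `data:` line (as MiniMax-style providers do); the `chunk` key in the demo and the exact request fields in the commented-out `requests` call are illustrative, not part of the API contract.

```python
import json

def iter_sse_data(line_iter):
    """Yield decoded JSON payloads from an SSE line stream.

    Assumes each event carries a `data:` line with a JSON body;
    adjust the parsing for providers that use a different format.
    """
    for raw in line_iter:
        if not raw:
            continue  # blank lines separate SSE events
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        if line.startswith("data:"):
            payload = line[len("data:"):].strip()
            if payload and payload != "[DONE]":
                yield json.loads(payload)

# Wiring it to the streaming endpoint (requires the `requests` package):
#
# import requests
# resp = requests.post(
#     "https://api.wavespeed.ai/api/v3/minimax/speech-02-turbo/stream",
#     headers={"Authorization": "Bearer YOUR_API_KEY"},
#     json={"text": "Hello world!", "voice_id": "Energetic_Girl"},
#     stream=True,  # read chunks as they arrive, do not buffer
# )
# for event in iter_sse_data(resp.iter_lines()):
#     ...  # handle each chunk in real time

# Demo on a canned stream:
sample = [b'data: {"chunk": 1}', b"", b'data: {"chunk": 2}', b"", b"data: [DONE]"]
print(list(iter_sse_data(sample)))  # [{'chunk': 1}, {'chunk': 2}]
```

Because streaming requests do not create a prediction record, any payload you need (audio bytes, token text, terminal status) must be captured inside this loop as events arrive.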
Supported Models
| Provider | Models |
|---|---|
| MiniMax Speech | speech-02-hd, speech-02-turbo, speech-2.5-hd-preview, speech-2.5-turbo-preview, speech-2.6-hd, speech-2.6-turbo |
| MiniMax Music | music-02 |
| WaveSpeed LLM | any-llm, any-llm/vision |
Requests to /stream for unsupported models will return an error.
Notes
- Your client must support Server-Sent Events (SSE) or chunked responses
- Authentication, rate limits, and billing are the same as non-streaming endpoints
- Use `stream=True` in Python requests to handle chunked responses