How to Use Streaming
Get real-time progress updates during generation — see results as they happen.
What is Streaming?
Streaming lets you receive data incrementally as it’s generated, instead of waiting for the entire task to complete.
Example: When generating a long text response, streaming shows each word as it’s produced, rather than waiting for the full response.
Why Use Streaming?
| Method | How It Works | Best For |
|---|---|---|
| Polling | Check status every few seconds | Background tasks |
| Webhook | Server notifies you when done | Production backends |
| Streaming | Receive data as it’s generated | Real-time UI updates |
Benefits of streaming:
- Show progress bars during generation
- Display partial results immediately
- Better user experience for long tasks
- Essential for chat/conversational interfaces
Supported Models
Not all models support streaming. Check the model’s README to see if streaming is available.
Models that commonly support streaming:
- LLMs (text generation)
- Some video generation models (progress updates)
Enabling Streaming
Add `stream: true` to your request body:
```bash
curl --location --request POST 'https://api.wavespeed.ai/api/v3/wavespeed-ai/model' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
    "prompt": "Your prompt",
    "stream": true
}'
```
Response Format
Streaming responses use Server-Sent Events (SSE), a standard format in which the server sends a series of `data:` lines, each containing a JSON object:
```
data: {"type": "progress", "percentage": 25}
data: {"type": "progress", "percentage": 50}
data: {"type": "progress", "percentage": 75}
data: {"type": "complete", "outputs": ["https://..."]}
```
Event Types
| Type | Description |
|---|---|
| `progress` | Progress update with percentage |
| `log` | Log message from the model |
| `output` | Partial output (for LLMs) |
| `complete` | Task completed successfully |
| `error` | Task failed |
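As a sketch, a client might branch on these event types like this. The `percentage` and `outputs` fields match the sample events above; the `message`, `output`, and `error` fields for the other event types are assumptions, so verify them against your model's README.

```python
def handle_event(event: dict) -> None:
    """Dispatch a parsed SSE event by its "type" field."""
    event_type = event.get("type")

    if event_type == "progress":
        print(f"Progress: {event.get('percentage')}%")
    elif event_type == "log":
        print(f"Log: {event.get('message')}")  # assumed field name
    elif event_type == "output":
        # Partial LLM output; print it as it arrives
        print(event.get("output", ""), end="", flush=True)  # assumed field name
    elif event_type == "complete":
        print(f"Outputs: {event.get('outputs')}")
    elif event_type == "error":
        raise RuntimeError(event.get("error", "Task failed"))  # assumed field name
```

In the examples below, you could pass each parsed event to `handle_event(data)` instead of calling `print(data)`.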
Example: JavaScript
```javascript
// Read the API key from the environment (Node.js 18+ shown; adjust for your runtime)
const WAVESPEED_API_KEY = process.env.WAVESPEED_API_KEY;

const response = await fetch('https://api.wavespeed.ai/api/v3/model', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${WAVESPEED_API_KEY}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    prompt: 'Your prompt',
    stream: true
  })
});

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  // An SSE line can be split across chunks, so buffer until a newline arrives
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop();

  for (const line of lines) {
    if (line.startsWith('data: ')) {
      const data = JSON.parse(line.slice(6));
      console.log(data);
    }
  }
}
```
Example: Python
```python
import requests
import json
import os

WAVESPEED_API_KEY = os.environ.get('WAVESPEED_API_KEY')

response = requests.post(
    'https://api.wavespeed.ai/api/v3/model',
    headers={
        'Authorization': f'Bearer {WAVESPEED_API_KEY}',
        'Content-Type': 'application/json'
    },
    json={
        'prompt': 'Your prompt',
        'stream': True
    },
    stream=True  # keep the connection open and read the response incrementally
)

for line in response.iter_lines():
    if line:
        line = line.decode('utf-8')
        if line.startswith('data: '):
            data = json.loads(line[6:])
            print(data)
```
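For chat and other LLM use cases, you typically accumulate the partial `output` events into the full response as they arrive. Below is a minimal sketch along those lines, reusing the request from the example above; the field name carrying the partial text is an assumption, so check your model's README for the exact payload.

```python
import requests
import json
import os

WAVESPEED_API_KEY = os.environ.get('WAVESPEED_API_KEY')

response = requests.post(
    'https://api.wavespeed.ai/api/v3/model',
    headers={
        'Authorization': f'Bearer {WAVESPEED_API_KEY}',
        'Content-Type': 'application/json'
    },
    json={'prompt': 'Your prompt', 'stream': True},
    stream=True
)

chunks = []
for line in response.iter_lines():
    if not line:
        continue
    decoded = line.decode('utf-8')
    if not decoded.startswith('data: '):
        continue
    event = json.loads(decoded[6:])

    if event.get('type') == 'output':
        # Assumed field name for the partial text chunk
        text = event.get('output', '')
        chunks.append(text)
        print(text, end='', flush=True)  # render the text as it streams in
    elif event.get('type') == 'error':
        raise RuntimeError(f'Task failed: {event}')

print()
full_response = ''.join(chunks)
```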
When to Use Streaming
| Use Case | Recommendation |
|---|---|
| Progress bars | Use streaming |
| Real-time chat | Use streaming |
| Background processing | Use webhooks |
| Simple requests | Use polling |
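As an illustration of the progress-bar case, here is a small, self-contained sketch that draws a terminal progress bar from the `percentage` values carried by progress events; the bar width and characters are arbitrary choices.

```python
import sys

def render_progress(percentage: int, width: int = 40) -> None:
    """Redraw a single-line progress bar for the given percentage."""
    filled = int(width * percentage / 100)
    bar = '#' * filled + '-' * (width - filled)
    sys.stdout.write(f'\r[{bar}] {percentage}%')
    sys.stdout.flush()

# Call this for each {"type": "progress", "percentage": ...} event, e.g.:
for pct in (25, 50, 75, 100):
    render_progress(pct)
print()
```

In a real client, you would call `render_progress(event['percentage'])` from your streaming loop whenever a progress event arrives.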