qwen/qwen3-vl-8b-instruct
131,072 context · $0.08/M input tokens · $0.50/M output tokens
Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL series, built for high-fidelity understanding and reasoning across text, images, and video. It features improved multimodal fusion with Interleaved-MRoPE for long-horizon...
Pay-per-use
Nessun costo iniziale, paga solo per ciò che usi
Usa i seguenti esempi di codice per integrare la nostra API:
from openai import OpenAI
client = OpenAI(
api_key="YOUR_API_KEY",
base_url="https://llm.wavespeed.ai/v1"
)
response = client.chat.completions.create(
model="qwen/qwen3-vl-8b-instruct",
messages=[
{"role": "user", "content": "Hello!"}
]
)
print(response.choices[0].message.content)**Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL series, built for high-fidelity understanding and reasoning across text, **
Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL series, built for high-fidelity understanding and reasoning across text, images, and video. It features improved multimodal fusion with Interleaved-MRoPE for long-horizon temporal reasoning, DeepStack for fine-grained visual-text alignment, and text-timestamp alignment for precise event localization.
The model supports a native 256K-token context window, extensible to 1M tokens, and handles both static and dynamic media inputs for tasks like document parsing, visual question answering, spatial reasoning, and GUI control. It achieves text understanding comparable to leading LLMs while expanding OCR coverage to 32 languages and enhancing robustness under varied visual conditions.
| Specification | Value |
|---|---|
| Provider | Qwen |
| Model Type | Large Language Model (LLM) |
| Architecture | N/A |
| Context Window | 131072 tokens |
| Max Output | 32768 tokens |
| Input | Text |
| Output | Text |
| Vision | Supported |
| Function Calling | Supported |
| Token Type | Cost per Million Tokens |
|---|---|
| Input | $0.1 |
| Output | $0.5 |
Base URL: https://llm.wavespeed.ai/v1 API Endpoint: chat/completions Model ID: qwen/qwen3-vl-8b-instruct
from openai import OpenAI
client = OpenAI(
api_key="YOUR_API_KEY",
base_url="https://llm.wavespeed.ai/v1"
)
response = client.chat.completions.create(
model="qwen/qwen3-vl-8b-instruct",
messages=[
{"role": "user", "content": "Hello!"}
]
)
print(response.choices[0].message.content)
curl https://llm.wavespeed.ai/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"model": "qwen/qwen3-vl-8b-instruct",
"messages": [{"role": "user", "content": "Hello!"}]
}'
qwen/qwen3-vl-8b-instruct
Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL series, built for high-fidelity understanding and reasoning across text, images, and video. It features improved multimodal fusion with Interleaved-MRoPE for long-horizon...
Input
$0.08 /M
Output
$0.5 /M
Contesto
131K
Output max
33K
Vision
Supportato
Uso strumenti
Supportato
Accedi a Qwen3 Vl 8b Instruct tramite la nostra API unificata — compatibile con OpenAI, senza cold start, prezzi trasparenti.
Prezzi su WaveSpeedAI: $0.08 per milione di token in input e $0.50 per milione di token in output. Prompt caching e batch processing sono fatturati separatamente e riducono il costo effettivo su carichi lunghi e ripetitivi.
Qwen3 Vl 8b Instruct supporta fino a 131K token di contesto e fino a 33K token di output per richiesta.
Sì. WaveSpeedAI espone Qwen3 Vl 8b Instruct tramite un endpoint compatibile con OpenAI all'indirizzo https://llm.wavespeed.ai/v1. Punta l'SDK ufficiale di OpenAI a questa base URL con la tua API key WaveSpeedAI — senza altre modifiche al codice.
Accedi a WaveSpeedAI, crea una API key in Access Keys, poi invia una richiesta a https://llm.wavespeed.ai/v1/chat/completions con il model id mostrato sopra. I nuovi account ricevono crediti gratuiti per testare Qwen3 Vl 8b Instruct.