qwen/qwen3-vl-8b-instruct
131,072 context · $0.08/M input tokens · $0.50/M output tokens
Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL series, built for high-fidelity understanding and reasoning across text, images, and video. It features improved multimodal fusion with Interleaved-MRoPE for long-horizon...
Pay-per-Use
No upfront costs; pay only for what you use.
Use the following code examples to integrate with our API:
from openai import OpenAI

# Point the official OpenAI SDK at the WaveSpeedAI endpoint.
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://llm.wavespeed.ai/v1"
)

response = client.chat.completions.create(
    model="qwen/qwen3-vl-8b-instruct",
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

# Print the text of the first completion choice.
print(response.choices[0].message.content)
Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL series, built for high-fidelity understanding and reasoning across text, images, and video. It features improved multimodal fusion with Interleaved-MRoPE for long-horizon temporal reasoning, DeepStack for fine-grained visual-text alignment, and text-timestamp alignment for precise event localization.
The model supports a native 256K-token context window, extensible to 1M tokens, and handles both static and dynamic media inputs for tasks like document parsing, visual question answering, spatial reasoning, and GUI control. It achieves text understanding comparable to leading LLMs while expanding OCR coverage to 32 languages and enhancing robustness under varied visual conditions.
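A visual question answering request can be sketched as follows. This is a minimal sketch, assuming the endpoint accepts the OpenAI-style `image_url` content part with a base64 data URL; the page itself only shows a text-only request, so treat the payload shape as an assumption. The helper names `build_vision_messages` and `ask_about_image` are hypothetical.

```python
import base64


def build_vision_messages(question: str, image_bytes: bytes,
                          mime_type: str = "image/png") -> list:
    """Build one OpenAI-style user message that pairs a question with an image."""
    # Encode the image as a data URL so it does not need public hosting.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    data_url = f"data:{mime_type};base64,{encoded}"
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                # Assumption: the endpoint understands OpenAI's image_url part.
                {"type": "image_url", "image_url": {"url": data_url}},
            ],
        }
    ]


def ask_about_image(api_key: str, question: str, image_bytes: bytes) -> str:
    """Send the question and image to the endpoint (performs a network call)."""
    # Imported lazily so build_vision_messages has no third-party dependency.
    from openai import OpenAI

    client = OpenAI(api_key=api_key, base_url="https://llm.wavespeed.ai/v1")
    response = client.chat.completions.create(
        model="qwen/qwen3-vl-8b-instruct",
        messages=build_vision_messages(question, image_bytes),
    )
    return response.choices[0].message.content
```

To try it, read an image with `open("page.png", "rb").read()` and pass the bytes to `ask_about_image` together with your API key.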
| Specification | Value |
|---|---|
| Provider | Qwen |
| Model Type | Vision-language model (VLM) |
| Architecture | N/A |
| Context Window | 131,072 tokens |
| Max Output | 32,768 tokens |
| Input | Text, images |
| Output | Text |
| Vision | Supported |
| Function Calling | Supported |
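Since the table lists function calling as supported, a tool-calling round trip might look like the sketch below. It assumes the endpoint follows the OpenAI `tools` request format and returns `tool_calls` on the message, which this page does not spell out. The `get_weather` tool, its schema, and the helper names are hypothetical placeholders.

```python
import json

# Hypothetical tool definition in the OpenAI "tools" format.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def get_weather(city: str) -> dict:
    """Placeholder implementation; a real tool would query a weather service."""
    return {"city": city, "forecast": "unknown"}


# Maps tool names the model may request to local Python callables.
LOCAL_TOOLS = {"get_weather": get_weather}


def run_tool_call(name: str, arguments_json: str) -> str:
    """Execute one tool call requested by the model and return a JSON result."""
    if name not in LOCAL_TOOLS:
        raise KeyError(f"unknown tool: {name}")
    arguments = json.loads(arguments_json)  # the model sends arguments as JSON text
    return json.dumps(LOCAL_TOOLS[name](**arguments))


def request_with_tools(api_key: str, prompt: str) -> list:
    """Ask the model a question and run any tool calls it returns (network call)."""
    from openai import OpenAI  # lazy import keeps the helpers dependency-free

    client = OpenAI(api_key=api_key, base_url="https://llm.wavespeed.ai/v1")
    response = client.chat.completions.create(
        model="qwen/qwen3-vl-8b-instruct",
        messages=[{"role": "user", "content": prompt}],
        tools=[WEATHER_TOOL],
    )
    message = response.choices[0].message
    # tool_calls is None when the model answers directly instead of calling a tool.
    return [
        run_tool_call(call.function.name, call.function.arguments)
        for call in (message.tool_calls or [])
    ]
```

Keeping the dispatcher separate from the network call makes it possible to exercise the tool logic offline.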
| Token Type | Cost per Million Tokens |
|---|---|
| Input | $0.08 |
| Output | $0.50 |
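The per-token pricing translates into a request cost with simple arithmetic. The sketch below uses the $0.08 and $0.50 per-million rates quoted at the top of this page; the function name `estimate_cost_usd` is hypothetical, and separately billed features such as prompt caching or batch processing are not modeled.

```python
# Rates in USD per one million tokens, as quoted on this page.
INPUT_USD_PER_MILLION = 0.08
OUTPUT_USD_PER_MILLION = 0.50


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its input and output token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 10,000-token prompt with a 2,000-token answer.
    print(f"${estimate_cost_usd(10_000, 2_000):.4f}")
```

With the OpenAI SDK, the token counts would typically come from `response.usage.prompt_tokens` and `response.usage.completion_tokens`, assuming the endpoint populates the usage field.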
Base URL: https://llm.wavespeed.ai/v1
API Endpoint: chat/completions
Model ID: qwen/qwen3-vl-8b-instruct
curl https://llm.wavespeed.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "qwen/qwen3-vl-8b-instruct",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
| Metric | Value |
|---|---|
| Input | $0.08/M |
| Output | $0.50/M |
| Context | 131K |
| Max. Output | 33K |
| Vision | Supported |
| Tool Use | Supported |
Access Qwen3-VL-8B-Instruct through our unified API: OpenAI-compatible, no cold starts, transparent pricing.
Pricing on WaveSpeedAI: $0.08 per million input tokens and $0.50 per million output tokens. Prompt caching and batch processing are billed separately and reduce the effective cost of long, repetitive workloads.
Qwen3-VL-8B-Instruct supports up to 131K context tokens and up to 33K output tokens per request.
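These limits can be checked before sending a request. The sketch below uses the exact figures from the specification table (131,072 context tokens, 32,768 output tokens) and assumes that prompt and output tokens share the same context window, which this page does not state explicitly. The function name `clamp_max_tokens` is hypothetical.

```python
# Limits as listed in the specification table on this page.
CONTEXT_WINDOW_TOKENS = 131_072
MAX_OUTPUT_TOKENS = 32_768


def clamp_max_tokens(prompt_tokens: int, requested_output_tokens: int) -> int:
    """Return an output-token limit that fits both the output cap and the context."""
    if prompt_tokens < 0 or requested_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if prompt_tokens >= CONTEXT_WINDOW_TOKENS:
        raise ValueError("prompt alone fills or exceeds the context window")
    # Assumption: the prompt and the generated output share one context window.
    remaining = CONTEXT_WINDOW_TOKENS - prompt_tokens
    return min(requested_output_tokens, MAX_OUTPUT_TOKENS, remaining)
```

The returned value could be passed as `max_tokens` to `client.chat.completions.create`; the prompt token count has to come from a tokenizer or a prior estimate, which is outside this sketch.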
Yes. WaveSpeedAI serves Qwen3-VL-8B-Instruct through an OpenAI-compatible endpoint at https://llm.wavespeed.ai/v1. Point the official OpenAI SDK at this base URL with your WaveSpeedAI API key; no other code changes are required.
Sign up at WaveSpeedAI, create an API key under Access Keys, and send a request to https://llm.wavespeed.ai/v1/chat/completions with the model ID shown above. New accounts receive free credits to try Qwen3-VL-8B-Instruct.