qwen/qwen3-8b
40,960 context · $0.05/M input tokens · $0.40/M output tokens
Qwen3-8B is a dense 8.2B parameter causal language model from the Qwen3 series, designed for both reasoning-heavy tasks and efficient dialogue. It supports seamless switching between "thinking" mode for math,...
Pagamento por uso
Sem custo inicial, pague apenas pelo que usar
Use os exemplos de código abaixo para integrar com nossa API:
from openai import OpenAI
client = OpenAI(
api_key="YOUR_API_KEY",
base_url="https://llm.wavespeed.ai/v1"
)
response = client.chat.completions.create(
model="qwen/qwen3-8b",
messages=[
{"role": "user", "content": "Hello!"}
]
)
print(response.choices[0].message.content)Qwen3-8B is a dense 8
Qwen3-8B is a dense 8.2B parameter causal language model from the Qwen3 series, designed for both reasoning-heavy tasks and efficient dialogue. It supports seamless switching between "thinking" mode for math, coding, and logical inference, and "non-thinking" mode for general conversation. The model is fine-tuned for instruction-following, agent integration, creative writing, and multilingual use across 100+ languages and dialects. It natively supports a 32K token context window and can extend to 131K tokens with YaRN scaling.
| Specification | Value |
|---|---|
| Provider | Qwen |
| Model Type | Large Language Model (LLM) |
| Architecture | N/A |
| Context Window | 32000 tokens |
| Max Output | 8192 tokens |
| Input | Text |
| Output | Text |
| Vision | Supported |
| Function Calling | Supported |
| Token Type | Cost per Million Tokens |
|---|---|
| Input | $0.0 |
| Output | $0.4 |
Base URL: https://llm.wavespeed.ai/v1 API Endpoint: chat/completions Model ID: qwen/qwen3-8b
from openai import OpenAI
client = OpenAI(
api_key="YOUR_API_KEY",
base_url="https://llm.wavespeed.ai/v1"
)
response = client.chat.completions.create(
model="qwen/qwen3-8b",
messages=[
{"role": "user", "content": "Hello!"}
]
)
print(response.choices[0].message.content)
curl https://llm.wavespeed.ai/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"model": "qwen/qwen3-8b",
"messages": [{"role": "user", "content": "Hello!"}]
}'
qwen/qwen3-8b
Qwen3-8B is a dense 8.2B parameter causal language model from the Qwen3 series, designed for both reasoning-heavy tasks and efficient dialogue. It supports seamless switching between "thinking" mode for math,...
Entrada
$0.05 /M
Saída
$0.4 /M
Contexto
41K
Saída máx.
8K
Uso de ferramentas
Suportado
Acesse Qwen3 8b através da nossa API unificada — compatível com OpenAI, sem inicializações a frio, preços transparentes.
Preços no WaveSpeedAI: $0.05 por milhão de tokens de entrada e $0.40 por milhão de tokens de saída. Prompt caching e batch processing são cobrados separadamente e reduzem o custo efetivo em cargas longas e repetitivas.
Qwen3 8b suporta até 41K tokens de contexto e até 8K tokens de saída por requisição.
Sim. O WaveSpeedAI expõe o Qwen3 8b através de um endpoint compatível com OpenAI em https://llm.wavespeed.ai/v1. Aponte o SDK oficial da OpenAI para esta base URL com sua chave API do WaveSpeedAI — sem outras alterações no código.
Entre no WaveSpeedAI, crie uma chave API em Access Keys, então envie uma requisição para https://llm.wavespeed.ai/v1/chat/completions com o model id mostrado acima. Contas novas recebem créditos grátis para avaliar o Qwen3 8b.