qwen/qwen3.7-max
1,000,000 context · $2.50/M input tokens · $7.50/M output tokens
Qwen3.7-Max is Alibaba’s flagship model in the Qwen3.7 series, built for agent-centric text workflows. It is optimized for coding, debugging, office automation, productivity tasks, tool use, and long-horizon autonomous execution. With a 1M-token context window and up to 64K output tokens, it is well suited for large documents, repository-scale coding, multi-step planning, structured generation, and workflows that require sustained reasoning across hundreds or thousands of steps.
Pay-per-use
No upfront costs, pay only for what you use
Use the following code examples to integrate with our API:
from openai import OpenAI

# Point the OpenAI SDK at the WaveSpeedAI endpoint.
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://llm.wavespeed.ai/v1",
)

# Send a single-turn chat request to Qwen3.7-Max.
response = client.chat.completions.create(
    model="qwen/qwen3.7-max",
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
)

# Print the assistant's reply.
print(response.choices[0].message.content)
| Specification | Value |
|---|---|
| Provider | Alibaba |
| Model Type | Chat Completions model |
| Architecture | text->text |
| Context Window | 1,000,000 tokens |
| Max Input | 934,464 tokens |
| Max Output | 65,536 tokens |
| Input | Text |
| Output | Text |
| Vision | Not listed |
| Function Calling | Supported |
| Structured Outputs | Supported |
| Thinking Mode | Supported |
| Primary Use Cases | Coding, office automation, productivity workflows, long-horizon agents, tool use |
| Release | May 2026 |

| Token Type | Cost |
|---|---|
| Input | $2.50 per million tokens |
| Output | $7.50 per million tokens |
| Cache Write | $3.125 per million tokens |
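The rates in the pricing table can be turned into a rough per-request estimate. The sketch below is illustrative only: it assumes each token category is billed independently at its listed rate, and the function name `estimate_cost` and its parameters are hypothetical, not part of any WaveSpeedAI SDK. How the provider classifies tokens as cache writes is not stated in the table, so treat that argument as an assumption.

```python
from decimal import Decimal

# Listed prices in USD per million tokens (taken from the pricing table).
PRICE_PER_MILLION = {
    "input": Decimal("2.50"),
    "output": Decimal("7.50"),
    "cache_write": Decimal("3.125"),
}
MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int,
                  cache_write_tokens: int = 0) -> Decimal:
    """Estimate the USD cost of one request.

    Assumption: each category is billed separately at its listed rate.
    """
    usage = {
        "input": input_tokens,
        "output": output_tokens,
        "cache_write": cache_write_tokens,
    }
    total = Decimal(0)
    for category, tokens in usage.items():
        total += Decimal(tokens) * PRICE_PER_MILLION[category] / MILLION
    return total
```

Under these rates, a request with 200,000 input tokens and 4,000 output tokens comes to $0.50 + $0.03 = $0.53. `Decimal` is used instead of floats so that small per-request amounts add up without rounding drift.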
Base URL: https://llm.wavespeed.ai/v1
API Endpoint: chat/completions
Model ID: qwen/qwen3.7-max
curl https://llm.wavespeed.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "qwen/qwen3.7-max",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
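For environments where the OpenAI SDK is not installed, the same request as the curl call can be sketched with only the Python standard library. The helper names `build_chat_request` and `send` are hypothetical, and the response parsing assumes the endpoint returns the usual Chat Completions shape (`choices[0].message.content`), as the SDK example on this page implies.

```python
import json
import urllib.request

BASE_URL = "https://llm.wavespeed.ai/v1"
MODEL_ID = "qwen/qwen3.7-max"


def build_chat_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to the chat/completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> str:
    """Send the request and return the text of the first choice.

    Assumption: the response body follows the Chat Completions JSON shape.
    """
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Calling `send(build_chat_request("Hello!", "YOUR_API_KEY"))` performs the network call. Keeping request construction separate from sending makes the payload easy to inspect or unit-test offline.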
Input: $2.50/M tokens · Output: $7.50/M tokens · Context: 1,000,000 tokens · Max output: 65,536 tokens · Tool use: supported
Access Qwen3.7 Max through our unified API: OpenAI-compatible, with no cold starts and transparent pricing.
Pricing on WaveSpeedAI: $2.50 per million input tokens and $7.50 per million output tokens. Prompt caching and batch processing are billed separately and reduce effective cost on long, repetitive workloads.
Qwen3.7 Max supports a context window of 1,000,000 tokens, with up to 934,464 input tokens and up to 65,536 output tokens per request.
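The specification table lists a maximum input of 934,464 tokens and a maximum output of 65,536 tokens, which together add up to the 1,000,000-token context window. A pre-flight check along these lines can catch oversized requests before they are sent. This is a sketch under stated assumptions: the function `check_budget` is hypothetical, it assumes the three limits are enforced as independent caps, and `prompt_tokens` must come from a tokenizer count that is not shown here.

```python
# Limits as listed in the specification table.
MAX_CONTEXT_TOKENS = 1_000_000
MAX_INPUT_TOKENS = 934_464
MAX_OUTPUT_TOKENS = 65_536


def check_budget(prompt_tokens: int, max_output_tokens: int) -> list[str]:
    """Return a list of limit violations; an empty list means the request fits."""
    problems = []
    if prompt_tokens > MAX_INPUT_TOKENS:
        problems.append(
            f"prompt uses {prompt_tokens} tokens, limit is {MAX_INPUT_TOKENS}"
        )
    if max_output_tokens > MAX_OUTPUT_TOKENS:
        problems.append(
            f"requested {max_output_tokens} output tokens, "
            f"limit is {MAX_OUTPUT_TOKENS}"
        )
    if prompt_tokens + max_output_tokens > MAX_CONTEXT_TOKENS:
        problems.append(
            f"prompt plus output exceeds the {MAX_CONTEXT_TOKENS}-token context"
        )
    return problems
```

Returning a list of messages rather than a boolean lets the caller report every violated limit at once, for example when deciding how much of a large repository to trim from the prompt.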
Qwen3.7 Max is OpenAI-compatible. WaveSpeedAI exposes it through an endpoint at https://llm.wavespeed.ai/v1. Point the official OpenAI SDK at this base URL with your WaveSpeedAI API key; no other code changes are required.
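Because the endpoint is described as OpenAI-compatible and function calling is listed as supported, a tool-calling round trip presumably follows the standard Chat Completions `tools` schema. The sketch below assumes exactly that and has not been verified against this endpoint. The tool `get_weather`, the registry, and both helper functions are hypothetical names introduced for illustration.

```python
import json


# Hypothetical local tool; the name and behavior are illustrative only.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"


TOOL_REGISTRY = {"get_weather": get_weather}


def build_tool_request(user_message: str) -> dict:
    """Build a Chat Completions payload that advertises one tool.

    Assumption: the endpoint accepts the standard OpenAI `tools` schema.
    """
    return {
        "model": "qwen/qwen3.7-max",
        "messages": [{"role": "user", "content": user_message}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Return the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
    }


def run_tool_calls(assistant_message: dict) -> list[dict]:
    """Execute each requested tool and return `tool` role messages."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        func = TOOL_REGISTRY[call["function"]["name"]]
        # Arguments arrive as a JSON-encoded string in the OpenAI schema.
        args = json.loads(call["function"]["arguments"])
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": func(**args),
        })
    return results
```

In an agent loop, the messages returned by `run_tool_calls` would be appended to the conversation after the assistant message, and the request sent again so the model can use the tool results.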
Sign in to WaveSpeedAI, create an API key in Access Keys, then send a request to https://llm.wavespeed.ai/v1/chat/completions with the model ID set to qwen/qwen3.7-max. New accounts receive free credits to evaluate Qwen3.7 Max before paying per token.