z-ai/glm-5.2
Release date: 2026-06-17
1,048,576 context · $1.40/M input tokens · $4.40/M output tokens
GLM 5.2 is Z.ai’s most advanced reasoning model, built for long-context, agentic, and engineering-intensive workloads. With support for a 1M-token context window and configurable High/XHigh reasoning modes, it delivers state-of-the-art performance in coding, tool use, and complex task execution. From requirements gathering and architecture design to implementation, testing, and multi-platform deployment, GLM 5.2 can maintain project-level context and consistently follow engineering best practices throughout the entire software development lifecycle.
Pay-per-Use
No upfront costs; pay only for what you use.
Use the following code example to integrate with our API:
from openai import OpenAI

# Point the official OpenAI SDK at the WaveSpeed AI endpoint.
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://llm.wavespeed.ai/v1",
)

# Send a single-turn chat request to GLM 5.2.
response = client.chat.completions.create(
    model="z-ai/glm-5.2",
    messages=[
        {"role": "user", "content": "Hello!"},
    ],
)

# Print the text of the first completion choice.
print(response.choices[0].message.content)

GLM 5.2 is Z.ai’s latest large-scale reasoning model, designed for long-context understanding, advanced coding, and complex agent workflows. With support for a 1M-token context window and configurable reasoning levels, it can maintain project-scale context across extended interactions, making it well-suited for software engineering, research, automation, and multi-step problem solving.
The model supports both High and XHigh reasoning modes, with XHigh enabling its maximum reasoning capability. GLM 5.2 excels at code generation, tool use, structured outputs, and long-horizon task execution, allowing developers to build sophisticated AI agents and automation systems that operate reliably over large amounts of context.
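As a rough illustration of the tool-use flow mentioned above, the sketch below defines one tool in the OpenAI Chat Completions `tools` format and runs a tool call locally. The `get_weather` function, its schema, the `REGISTRY` mapping, and the sample tool call are all hypothetical. The page does not show GLM 5.2's actual tool-call output, so this assumes it follows the OpenAI-compatible shape.

```python
import json

# Hypothetical local function that the model may ask the application to call.
def get_weather(city: str) -> dict:
    # Stubbed result; a real implementation would query a weather service.
    return {"city": city, "forecast": "sunny"}

# Tool definition in the OpenAI Chat Completions "tools" format.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Return a short forecast for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

# Maps tool names to local callables (illustrative helper, not part of any SDK).
REGISTRY = {"get_weather": get_weather}

def run_tool_call(tool_call: dict) -> dict:
    """Execute one tool call and return a `tool` role message for the next turn."""
    name = tool_call["function"]["name"]
    # In the OpenAI-compatible format, arguments arrive as a JSON string.
    args = json.loads(tool_call["function"]["arguments"])
    result = REGISTRY[name](**args)
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "content": json.dumps(result),
    }

# Assumed example of a tool call as it might appear in an assistant message.
sample_call = {
    "id": "call_1",
    "type": "function",
    "function": {"name": "get_weather", "arguments": '{"city": "Berlin"}'},
}
tool_message = run_tool_call(sample_call)
```

In a real exchange, `TOOLS` would be passed as the `tools` argument of `client.chat.completions.create(...)`, and `tool_message` would be appended to `messages` after the assistant's tool-call message before the next request.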
This model is available through the WaveSpeed AI OpenAI-compatible API and can be integrated into existing applications with minimal changes.
| Specification | Value |
|---|---|
| Provider | chatglm |
| Model Type | Chat Completions |
| Architecture | Text → Text |
| Context Window | 1,048,576 tokens |
| Max Input | 786,432 tokens |
| Max Output | 262,144 tokens |
| Input | Text |
| Output | Text |
| Function Calling | Supported |
| Structured Outputs | Supported |
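The limits in the table fit together arithmetically: Max Input plus Max Output equals the context window (786,432 + 262,144 = 1,048,576). The sketch below checks a request against those limits before it is sent. The helper name `clamp_max_tokens` is hypothetical, and it assumes the caller already has a token count for the prompt. The page does not say which tokenizer to use.

```python
# Limits copied from the specification table above.
CONTEXT_WINDOW = 1_048_576
MAX_INPUT = 786_432
MAX_OUTPUT = 262_144

# The table's input and output limits add up to the full context window.
assert MAX_INPUT + MAX_OUTPUT == CONTEXT_WINDOW

def clamp_max_tokens(input_tokens: int, requested_output: int) -> int:
    """Return an output budget that respects the table's limits.

    Raises ValueError if the prompt alone exceeds the input limit or if the
    requested output budget is not positive.
    """
    if input_tokens > MAX_INPUT:
        raise ValueError(
            f"prompt has {input_tokens} tokens; the input limit is {MAX_INPUT}"
        )
    if requested_output <= 0:
        raise ValueError("requested_output must be positive")
    # Cap the requested budget at the model's maximum output size.
    return min(requested_output, MAX_OUTPUT)
```

The returned value could be passed as the output-token limit of a chat completion request. Failing early on an oversized prompt avoids paying for a request that the service would presumably reject.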
Base URL
https://llm.wavespeed.ai/v1
Endpoint
POST /chat/completions
Model ID
z-ai/glm-5.2
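For clients that do not use the OpenAI SDK, the base URL, endpoint, and model ID above combine into a plain HTTP request. The following is a minimal sketch using only the Python standard library. It builds the request without sending it. Bearer-token authentication is the OpenAI convention and is assumed to apply here. `build_chat_request` is an illustrative helper name.

```python
import json
import urllib.request

BASE_URL = "https://llm.wavespeed.ai/v1"
MODEL_ID = "z-ai/glm-5.2"

def build_chat_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumption: the API key is sent as a Bearer token, as with OpenAI.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

request = build_chat_request("YOUR_API_KEY", "Hello!")
# To send it: urllib.request.urlopen(request), then parse the JSON response body.
```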
Input
$1.40 /M
Output
$4.40 /M
Context
1049K
Max output
262K
Tool use
Supported
Access GLM 5.2 through our unified API: OpenAI-compatible, no cold starts, transparent pricing.
Pricing on WaveSpeedAI: $1.40 per million input tokens and $4.40 per million output tokens. Prompt caching and batch processing are billed separately and reduce the effective cost for long, repetitive workloads.
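The per-token pricing above can be turned into a quick cost estimate. For example, a 200,000-token prompt with a 10,000-token reply would cost $0.28 + $0.044 = $0.324 at the listed rates. The sketch below does the same arithmetic. `estimate_cost_usd` is an illustrative helper, and it ignores prompt caching and batch pricing because the page gives no rates for them.

```python
from decimal import Decimal

# List prices from the page, in USD per million tokens.
INPUT_USD_PER_M = Decimal("1.40")
OUTPUT_USD_PER_M = Decimal("4.40")

def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the list-price cost of one request, excluding any discounts."""
    million = Decimal(1_000_000)
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / million

example_cost = estimate_cost_usd(200_000, 10_000)
```

`Decimal` is used instead of floats so that small per-request amounts add up without rounding drift when summed over many requests.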
GLM 5.2 supports up to 1049K context tokens and up to 262K output tokens per request.
Yes. WaveSpeedAI serves GLM 5.2 through an OpenAI-compatible endpoint at https://llm.wavespeed.ai/v1. Point the official OpenAI SDK at this base URL with your WaveSpeedAI API key; no other code changes are required.
Sign up for WaveSpeedAI, create an API key under Access Keys, and send a request to https://llm.wavespeed.ai/v1/chat/completions with the model ID shown above. New accounts receive free credits to try GLM 5.2.