z-ai/glm-5.2
Release date: 2026-06-17
1,048,576 context · $1.40/M input tokens · $4.40/M output tokens
GLM 5.2 is Z.ai’s most advanced reasoning model, built for long-context, agentic, and engineering-intensive workloads. With support for a 1M-token context window and configurable High/XHigh reasoning modes, it delivers state-of-the-art performance in coding, tool use, and complex task execution. From requirements gathering and architecture design to implementation, testing, and multi-platform deployment, GLM 5.2 can maintain project-level context and consistently follow engineering best practices throughout the entire software development lifecycle.
Pay as you go
No upfront costs; pay only for what you use
Use the following code example to integrate with our API:
from openai import OpenAI

# Point the official OpenAI SDK at the WaveSpeed AI endpoint.
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://llm.wavespeed.ai/v1"
)

# Send a single-turn chat request to GLM 5.2.
response = client.chat.completions.create(
    model="z-ai/glm-5.2",
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

# Print the text of the first (and only) completion choice.
print(response.choices[0].message.content)

GLM 5.2 is Z.ai’s latest large-scale reasoning model, designed for long-context understanding, advanced coding, and complex agent workflows. With support for a 1M-token context window and configurable reasoning levels, it can maintain project-scale context across extended interactions, making it well-suited for software engineering, research, automation, and multi-step problem solving.
The model supports both High and XHigh reasoning modes, with XHigh enabling its maximum reasoning capability. GLM 5.2 excels at code generation, tool use, structured outputs, and long-horizon task execution, allowing developers to build sophisticated AI agents and automation systems that operate reliably over large amounts of context.
This model is available through the WaveSpeed AI OpenAI-compatible API and can be integrated into existing applications with minimal changes.
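Since the model is described as supporting tool use through an OpenAI-compatible API, the following is a minimal sketch of how a tool definition and its local dispatch might look. It assumes the endpoint accepts the standard OpenAI Chat Completions `tools` schema; the `get_weather` tool, the `REGISTRY` mapping, and the helper names are hypothetical and exist only for illustration. No network call is made here.

```python
import json

# Hypothetical tool for illustration only; it is not part of the WaveSpeed AI docs.
def get_weather(city: str) -> dict:
    """Stand-in for a real lookup; returns canned data."""
    return {"city": city, "forecast": "sunny"}

# Tool schema in the OpenAI Chat Completions "tools" format (assumed to be accepted).
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Return the weather forecast for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

# Maps tool names the model may request to local Python callables.
REGISTRY = {"get_weather": get_weather}

def build_request(user_message: str) -> dict:
    """Build keyword arguments for client.chat.completions.create(...)."""
    return {
        "model": "z-ai/glm-5.2",
        "messages": [{"role": "user", "content": user_message}],
        "tools": TOOLS,
    }

def run_tool_call(call_id: str, name: str, arguments_json: str) -> dict:
    """Execute one requested tool call and wrap the result as a 'tool' message."""
    arguments = json.loads(arguments_json)
    result = REGISTRY[name](**arguments)
    return {"role": "tool", "tool_call_id": call_id, "content": json.dumps(result)}
```

With the OpenAI SDK, one would pass `**build_request("...")` to `client.chat.completions.create`, then call `run_tool_call(tc.id, tc.function.name, tc.function.arguments)` for each entry in `response.choices[0].message.tool_calls` and append the returned message to the conversation before the next request.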
| Specification | Value |
|---|---|
| Provider | chatglm |
| Model Type | Chat Completions |
| Architecture | Text → Text |
| Context Window | 1,048,576 tokens |
| Max Input | 786,432 tokens |
| Max Output | 262,144 tokens |
| Input | Text |
| Output | Text |
| Function Calling | Supported |
| Structured Outputs | Supported |
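The limits in the table are consistent with each other: the maximum input (786,432) plus the maximum output (262,144) equals the context window (1,048,576). A small pre-flight check along these lines can catch oversized requests before they are sent. This is a sketch that assumes the three limits are enforced independently; how the service counts tokens is not stated here, so `input_tokens` must come from your own tokenizer estimate, and `fits_limits` is a hypothetical helper name.

```python
# Limits copied from the specification table.
CONTEXT_WINDOW = 1_048_576
MAX_INPUT = 786_432
MAX_OUTPUT = 262_144

def fits_limits(input_tokens: int, max_output_tokens: int) -> bool:
    """Return True if a request stays within all three documented limits."""
    return (
        0 <= input_tokens <= MAX_INPUT
        and 0 < max_output_tokens <= MAX_OUTPUT
        and input_tokens + max_output_tokens <= CONTEXT_WINDOW
    )
```

For example, `fits_limits(700_000, 100_000)` passes, while `fits_limits(800_000, 1_000)` fails because the input alone exceeds the 786,432-token cap.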
Base URL
https://llm.wavespeed.ai/v1
Endpoint
POST /chat/completions
Model ID
z-ai/glm-5.2
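For callers who do not use the OpenAI SDK, the base URL, endpoint, and model ID above can be combined into a plain HTTP request. The sketch below uses only the Python standard library. It assumes the API accepts the key as an `Authorization: Bearer` header, which is what the OpenAI SDK sends; the `WAVESPEED_API_KEY` environment variable and the helper names are hypothetical.

```python
import json
import os
import urllib.request

BASE_URL = "https://llm.wavespeed.ai/v1"
MODEL_ID = "z-ai/glm-5.2"

def build_chat_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST /chat/completions request."""
    body = json.dumps(
        {"model": MODEL_ID, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            # Assumption: Bearer-token auth, as used by the OpenAI SDK.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)

def api_key_from_env() -> str:
    """Read the key from a hypothetical WAVESPEED_API_KEY environment variable."""
    return os.environ["WAVESPEED_API_KEY"]
```

A call would then look like `send(build_chat_request("Hello!", api_key_from_env()))`; keeping request construction separate from sending makes the payload easy to inspect or log before it goes over the network.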
Input
$1.40 /M
Output
$4.40 /M
Context
1049K
Max output
262K
Tool use
Supported
Access GLM 5.2 through our unified API: OpenAI-compatible, no cold starts, transparent pricing.
Pricing on WaveSpeedAI: $1.40 per million input tokens and $4.40 per million output tokens. Prompt caching and batch processing are billed separately and reduce the effective cost of long, repetitive workloads.
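The per-token rates translate into a simple cost estimate. The sketch below applies only the two list prices stated above and deliberately ignores prompt caching and batch processing, whose rates are not given here; `estimate_cost_usd` is a hypothetical helper name. `Decimal` is used to avoid floating-point rounding in currency arithmetic.

```python
from decimal import Decimal

# List prices in USD per one million tokens.
INPUT_USD_PER_M = Decimal("1.40")
OUTPUT_USD_PER_M = Decimal("4.40")
MILLION = Decimal(1_000_000)

def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate request cost at list price (no caching or batch discounts)."""
    total = (
        Decimal(input_tokens) * INPUT_USD_PER_M
        + Decimal(output_tokens) * OUTPUT_USD_PER_M
    )
    return total / MILLION
```

For instance, a request with 500,000 input tokens and 100,000 output tokens comes to $0.70 + $0.44 = $1.14 at these rates.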
GLM 5.2 supports up to 1,048,576 tokens of context (about 1049K) and up to 262,144 output tokens (about 262K) per request.
Yes. WaveSpeedAI exposes GLM 5.2 through an OpenAI-compatible endpoint at https://llm.wavespeed.ai/v1. Point the official OpenAI SDK at this base URL with your WaveSpeedAI API key; no other code changes are needed.
Sign in to WaveSpeedAI, create an API key under Access Keys, and send a request to https://llm.wavespeed.ai/v1/chat/completions with the model ID shown above. New accounts receive free credits to evaluate GLM 5.2 before paying per token.