z-ai/glm-5.2

Published: 2026-06-17

1,048,576 context · $1.40/M input tokens · $4.40/M output tokens

GLM 5.2 is Z.ai’s most advanced reasoning model, built for long-context, agentic, and engineering-intensive workloads. With support for a 1M-token context window and configurable High/XHigh reasoning modes, it delivers state-of-the-art performance in coding, tool use, and complex task execution. From requirements gathering and architecture design to implementation, testing, and multi-platform deployment, GLM 5.2 can maintain project-level context and consistently follow engineering best practices throughout the entire software development lifecycle.

Pricing

Pay-per-use

No upfront costs; pay only for what you use.

Input: $1.40 / M Tokens
Output: $4.40 / M Tokens
Cache Read: $0.26 / M Tokens
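
As a rough illustration of how these rates combine, the sketch below estimates the cost of a single request. The rates come from the list above; the function name `estimate_cost` is hypothetical, and it assumes that cached prompt tokens are billed at the Cache Read rate instead of the Input rate, which this page does not spell out.

```python
from decimal import Decimal

# Rates in USD per million tokens, copied from the pricing list above.
INPUT_RATE = Decimal("1.40")
OUTPUT_RATE = Decimal("4.40")
CACHE_READ_RATE = Decimal("0.26")
MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> Decimal:
    """Estimate the USD cost of one request.

    Assumption: `cached_tokens` is the part of `input_tokens` served from the
    prompt cache, billed at the cache-read rate instead of the input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    return (
        fresh_tokens * INPUT_RATE
        + cached_tokens * CACHE_READ_RATE
        + output_tokens * OUTPUT_RATE
    ) / MILLION


if __name__ == "__main__":
    # 200,000 prompt tokens (150,000 of them cached) and 4,000 output tokens.
    print(estimate_cost(200_000, 4_000, cached_tokens=150_000))
```

Decimal arithmetic is used so that small per-request amounts do not pick up floating-point rounding noise when summed over many requests.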

API Usage

Use the following Python example, which calls the API through the official OpenAI SDK:

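from openai import OpenAI

# Point the official OpenAI SDK at the WaveSpeedAI base URL.
client = OpenAI(
    api_key="YOUR_API_KEY",  # replace with your WaveSpeedAI API key
    base_url="https://llm.wavespeed.ai/v1"
)

# Send a single-turn chat request to GLM 5.2.
response = client.chat.completions.create(
    model="z-ai/glm-5.2",
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

# Print the text of the first returned choice.
print(response.choices[0].message.content)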

Model Introduction

Z.ai: GLM 5.2

GLM 5.2 is Z.ai’s latest large-scale reasoning model, designed for long-context understanding, advanced coding, and complex agent workflows. With support for a 1M-token context window and configurable reasoning levels, it can maintain project-scale context across extended interactions, making it well-suited for software engineering, research, automation, and multi-step problem solving.

The model supports both High and XHigh reasoning modes, with XHigh enabling its maximum reasoning capability. GLM 5.2 excels at code generation, tool use, structured outputs, and long-horizon task execution, allowing developers to build sophisticated AI agents and automation systems that operate reliably over large amounts of context.

This model is available through the WaveSpeed AI OpenAI-compatible API and can be integrated into existing applications with minimal changes.


Why Choose GLM 5.2

  • Massive 1M-token context window for large documents, repositories, and long-running workflows
  • Strong reasoning performance for coding, planning, and complex multi-step tasks
  • Optimized for agentic applications with function calling and tool-use support
  • Structured output generation for JSON-based workflows and schema-constrained responses
  • Flexible reasoning controls for balancing speed, cost, and reasoning depth
  • Competitive pricing for large-context production workloads
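
To make the function-calling bullet concrete, here is a minimal sketch of a tool-use request body and a helper that reads tool calls out of a response. It assumes the endpoint accepts the standard OpenAI Chat Completions `tools` and `tool_choice` fields, which this page implies but does not document. The `get_weather` tool and both helper names are hypothetical and exist only for illustration.

```python
import json

MODEL_ID = "z-ai/glm-5.2"

# Hypothetical tool, defined here only for illustration.
GET_WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def build_tool_request(user_message: str) -> dict:
    """Build a Chat Completions body that offers one tool to the model."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": user_message}],
        "tools": [GET_WEATHER_TOOL],
        "tool_choice": "auto",  # let the model decide whether to call the tool
    }


def parse_tool_calls(response: dict) -> list:
    """Return (name, arguments) pairs from a Chat Completions response dict."""
    message = response["choices"][0]["message"]
    calls = []
    for call in message.get("tool_calls") or []:
        function = call["function"]
        # In the OpenAI format, arguments arrive as a JSON-encoded string.
        calls.append((function["name"], json.loads(function["arguments"])))
    return calls
```

The body returned by `build_tool_request` can be passed as keyword arguments to `client.chat.completions.create(**body)` in the OpenAI SDK, or posted as JSON to the endpoint directly.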

Key Features

  • Context Window: 1,048,576 tokens
  • Max Input: 786,432 tokens
  • Max Output: 262,144 tokens
  • Architecture: Text → Text
  • Function Calling: Supported
  • Structured Outputs: Supported
  • Reasoning Controls: Supported
  • Vision: Not listed
  • Audio Input: Not listed
  • Image Generation: Not listed
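
The three limits above fit together: 786,432 input tokens plus 262,144 output tokens equals the 1,048,576-token context window. The sketch below checks a planned request against them before it is sent. The helper name `check_budget` is hypothetical, and the token counts are assumed to come from the caller, since this page does not name a tokenizer.

```python
# Limits copied from the Key Features list above.
CONTEXT_WINDOW = 1_048_576
MAX_INPUT = 786_432
MAX_OUTPUT = 262_144

# The input and output limits together account for the whole context window,
# so passing both checks below also keeps the total within the window.
assert MAX_INPUT + MAX_OUTPUT == CONTEXT_WINDOW


def check_budget(input_tokens: int, max_output_tokens: int) -> list:
    """Return a list of human-readable problems; an empty list means the request fits."""
    problems = []
    if input_tokens > MAX_INPUT:
        problems.append(
            f"input is {input_tokens - MAX_INPUT} tokens over the {MAX_INPUT} limit"
        )
    if max_output_tokens > MAX_OUTPUT:
        problems.append(
            f"requested output is {max_output_tokens - MAX_OUTPUT} tokens over the {MAX_OUTPUT} limit"
        )
    return problems


if __name__ == "__main__":
    for problem in check_budget(input_tokens=800_000, max_output_tokens=8_192):
        print(problem)
```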

Specifications

Specification        Value
Provider             chatglm
Model Type           Chat Completions
Architecture         Text → Text
Context Window       1,048,576 tokens
Max Input            786,432 tokens
Max Output           262,144 tokens
Input                Text
Output               Text
Function Calling     Supported
Structured Outputs   Supported
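
As an illustration of the Structured Outputs row, the sketch below builds a request that asks for JSON matching a schema and then parses the reply. It assumes the endpoint accepts the OpenAI-style `response_format` field with `type: "json_schema"`; this page states only that structured outputs are supported, not the exact request shape. The invoice schema and helper names are hypothetical.

```python
import json

MODEL_ID = "z-ai/glm-5.2"

# Hypothetical schema, used only for illustration.
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "vendor": {"type": "string"},
        "total": {"type": "number"},
        "currency": {"type": "string"},
    },
    "required": ["vendor", "total", "currency"],
    "additionalProperties": False,
}


def build_structured_request(document_text: str) -> dict:
    """Build a Chat Completions body that requests schema-constrained JSON."""
    return {
        "model": MODEL_ID,
        "messages": [
            {"role": "system", "content": "Extract the invoice fields as JSON."},
            {"role": "user", "content": document_text},
        ],
        # Assumption: OpenAI-style structured output request shape.
        "response_format": {
            "type": "json_schema",
            "json_schema": {"name": "invoice", "strict": True, "schema": INVOICE_SCHEMA},
        },
    }


def parse_structured_reply(response: dict) -> dict:
    """Decode the reply content and confirm the required keys are present."""
    content = response["choices"][0]["message"]["content"]
    data = json.loads(content)
    missing = [key for key in INVOICE_SCHEMA["required"] if key not in data]
    if missing:
        raise ValueError(f"reply is missing required keys: {missing}")
    return data
```

Checking required keys on the client side is a cheap safeguard even when the server enforces the schema, because it catches truncated replies that hit the output limit.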

API Integration

Base URL

https://llm.wavespeed.ai/v1

Endpoint

POST /chat/completions

Model ID

z-ai/glm-5.2
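
For callers who do not want an SDK dependency, the base URL, endpoint, and model ID above can be combined into a plain HTTP request. This is a minimal sketch using only the Python standard library. It assumes the API key is sent as an `Authorization: Bearer` header, as is conventional for OpenAI-compatible endpoints; the helper names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://llm.wavespeed.ai/v1"
MODEL_ID = "z-ai/glm-5.2"


def build_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build a POST /chat/completions request without sending it."""
    body = json.dumps(
        {"model": MODEL_ID, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            # Assumption: bearer-token auth, as with other OpenAI-compatible APIs.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return the text of the first choice."""
    with urllib.request.urlopen(request, timeout=60) as resp:
        payload = json.load(resp)
    return payload["choices"][0]["message"]["content"]
```

Calling `send(build_request("YOUR_API_KEY", "Hello!"))` performs the network call; keeping construction and sending separate makes the request easy to inspect or log first.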

Common Use Cases

  • AI coding assistants
  • Software engineering agents
  • Large-scale codebase analysis
  • Research and document intelligence
  • Workflow automation
  • Multi-agent systems
  • Structured data extraction
  • Long-context reasoning applications

Notes

  • Model ID: z-ai/glm-5.2
  • Provider: chatglm

Info

Provider: chatglm
Type: llm

Supported Functionality

Input: Text
Output: Text
Context: 1,048,576
Max Output: 262,144
Vision: -
Function Calling: ✓ Supported

API Access Guide

Base URL: https://llm.wavespeed.ai/v1
API Endpoint: chat/completions
Model ID: z-ai/glm-5.2

GLM 5.2 API

z-ai/glm-5.2

Input: $1.40 / M tokens
Output: $4.40 / M tokens
Context: 1049K tokens
Max Output: 262K tokens
Tool Use: Supported

Try GLM 5.2 on WaveSpeedAI

Access GLM 5.2 through our unified API: OpenAI-compatible, no cold starts, transparent pricing.

Frequently Asked Questions about GLM 5.2

How much does GLM 5.2 cost via the API?

Pricing on WaveSpeedAI is $1.40 per million input tokens and $4.40 per million output tokens. Cache reads are billed at $0.26 per million tokens. Prompt caching and batch processing are billed separately and can reduce the effective cost of long, repetitive workloads.

What is the context window of GLM 5.2?

GLM 5.2 supports a context window of 1,048,576 tokens (shown as 1049K), with up to 786,432 input tokens and up to 262,144 output tokens per request.

Is GLM 5.2 OpenAI-compatible?

Yes. WaveSpeedAI exposes GLM 5.2 through an OpenAI-compatible endpoint at https://llm.wavespeed.ai/v1. Point the official OpenAI SDK at this base URL and use your WaveSpeedAI API key; no other code changes are required.

How do I get started with GLM 5.2?

Sign in to WaveSpeedAI, create an API key in Access Keys, then send a request to https://llm.wavespeed.ai/v1/chat/completions with the model ID set to z-ai/glm-5.2. New accounts receive free credits to evaluate GLM 5.2 before paying per token.
