Z.ai

Access 13 Z.ai models through the OpenRouter unified API including GLM 5.2, GLM 5.1, and GLM 5V Turbo. Compare pricing, context windows, benchmarks, and capabilities between different Z.ai models.

Z.ai tokens processed on OpenRouter

Z.ai: GLM 5.2GLM 5.2
34.8B tokens
GLM-5.2 is Z.ai’s flagship model for the era of long-horizon tasks. With a truly usable 1M-token context window, it can handle project-level engineering context, execute long-running tasks more reliably, follow engineering standards more consistently, and complete the full development workflow from requirements to multi-platform deployment in a single task.
by z-aiJun 16, 20261.05M context$1.40/M input tokens$4.40/M output tokens

Z.ai

Access 13 Z.ai models through the OpenRouter unified API including GLM 5.2, GLM 5.1, and GLM 5V Turbo. Compare pricing, context windows, benchmarks, and capabilities between different Z.ai models.

Z.ai tokens processed on OpenRouter

Z.ai: GLM 5.2GLM 5.2
34.8B tokens
GLM-5.2 is Z.ai’s flagship model for the era of long-horizon tasks. With a truly usable 1M-token context window, it can handle project-level engineering context, execute long-running tasks more reliably, follow engineering standards more consistently, and complete the full development workflow from requirements to multi-platform deployment in a single task.
by z-aiJun 16, 20261.05M context$1.40/M input tokens$4.40/M output tokens

Z.ai: GLM 5.1GLM 5.1

722B tokens

GLM-5.1 delivers a major leap in coding capability, with particularly significant gains in handling long-horizon tasks. Unlike previous models built around minute-level interactions, GLM-5.1 can work independently and continuously on a single task for more than 8 hours, autonomously planning, executing, and improving itself throughout the process, ultimately delivering complete, engineering-grade results.

by z-aiApr 7, 2026203K context$0.98/M input tokens$3.08/M output tokens

Z.ai: GLM 5V TurboGLM 5V Turbo

GLM-5V-Turbo is Z.ai’s first native multimodal agent foundation model, built for vision-based coding and agent-driven tasks. It natively handles image, video, and text inputs, excels at long-horizon planning, complex coding, and task execution, and works seamlessly with agents to complete the full loop of “perceive → plan → execute“.

by z-aiApr 1, 2026203K context

Z.ai: GLM 5 TurboGLM 5 Turbo

8.19B tokens

GLM-5 Turbo is a new model from Z.ai designed for fast inference and strong performance in agent-driven environments such as OpenClaw scenarios. It is deeply optimized for real-world agent workflows involving long execution chains, with improved complex instruction decomposition, tool use, scheduled and persistent execution, and overall stability across extended tasks.

by z-aiMar 15, 2026262K context$1.20/M input tokens$4/M output tokens

Z.ai: GLM 5GLM 5

146B tokens

GLM-5 is Z.ai’s flagship open-source foundation model engineered for complex systems design and long-horizon agent workflows. Built for expert developers, it delivers production-grade performance on large-scale programming tasks, rivaling leading closed-source models. With advanced agentic planning, deep backend reasoning, and iterative self-correction, GLM-5 moves beyond code generation to full-system construction and autonomous execution.

by z-aiFeb 11, 2026203K context$0.60/M input tokens$1.92/M output tokens

Z.ai: GLM 4.7 FlashGLM 4.7 Flash

35.1B tokens

As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances performance and efficiency. It is further optimized for agentic coding use cases, strengthening coding capabilities, long-horizon task planning, and tool collaboration, and has achieved leading performance among open-source models of the same size on several current public benchmark leaderboards.

by z-aiJan 19, 2026203K context$0.06/M input tokens$0.40/M output tokens

Z.ai: GLM 4.7GLM 4.7

70.2B tokens

GLM-4.7 is Z.ai’s latest flagship model, featuring upgrades in two key areas: enhanced programming capabilities and more stable multi-step reasoning/execution. It demonstrates significant improvements in executing complex agent tasks while delivering more natural conversational experiences and superior front-end aesthetics.

by z-aiDec 22, 2025203K context$0.40/M input tokens$1.75/M output tokens

Z.ai: GLM 4.6VGLM 4.6V

1.22B tokens

GLM-4.6V is a large multimodal model designed for high-fidelity visual understanding and long-context reasoning across images, documents, and mixed media. It supports up to 128K tokens, processes complex page layouts and charts directly as visual inputs, and integrates native multimodal function calling to connect perception with downstream tool execution. The model also enables interleaved image-text generation and UI reconstruction workflows, including screenshot-to-HTML synthesis and iterative visual editing.

by z-aiDec 8, 2025131K context$0.30/M input tokens$0.90/M output tokens

Z.ai: GLM 4.6GLM 4.6

8.84B tokens

Compared with GLM-4.5, this generation brings several key improvements: Longer context window: The context window has been expanded from 128K to 200K tokens, enabling the model to handle more complex agentic tasks. Superior coding performance: The model achieves higher scores on code benchmarks and demonstrates better real-world performance in applications such as Claude Code、Cline、Roo Code and Kilo Code, including improvements in generating visually polished front-end pages. Advanced reasoning: GLM-4.6 shows a clear improvement in reasoning performance and supports tool use during inference, leading to stronger overall capability. More capable agents: GLM-4.6 exhibits stronger performance in tool using and search-based agents, and integrates more effectively within agent frameworks. Refined writing: Better aligns with human preferences in style and readability, and performs more naturally in role-playing scenarios.

by z-aiSep 30, 2025203K context$0.43/M input tokens$1.74/M output tokens

Z.ai: GLM 4.5VGLM 4.5V

60.1M tokens

GLM-4.5V is a vision-language foundation model for multimodal agent applications. Built on a Mixture-of-Experts (MoE) architecture with 106B parameters and 12B activated parameters, it achieves state-of-the-art results in video understanding, image Q&A, OCR, and document parsing, with strong gains in front-end web coding, grounding, and spatial reasoning. It offers a hybrid inference mode: a "thinking mode" for deep reasoning and a "non-thinking mode" for fast responses. Reasoning behavior can be toggled via the `reasoning` `enabled` boolean. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#enable-reasoning-with-default-config)

by z-aiAug 11, 202566K context$0.60/M input tokens$1.80/M output tokens

Z.ai: GLM 4.5GLM 4.5

1.74B tokens

Going away June 19, 2026

GLM-4.5 is our latest flagship foundation model, purpose-built for agent-based applications. It leverages a Mixture-of-Experts (MoE) architecture and supports a context length of up to 128k tokens. GLM-4.5 delivers significantly enhanced capabilities in reasoning, code generation, and agent alignment. It supports a hybrid inference mode with two options, a "thinking mode" designed for complex reasoning and tool use, and a "non-thinking mode" optimized for instant responses. Users can control the reasoning behaviour with the `reasoning` `enabled` boolean. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#enable-reasoning-with-default-config)

by z-aiJul 25, 2025131K context$0.60/M input tokens$2.20/M output tokens

Z.ai: GLM 4.5 AirGLM 4.5 Air

82B tokens

GLM-4.5-Air is the lightweight variant of our latest flagship model family, also purpose-built for agent-centric applications. Like GLM-4.5, it adopts the Mixture-of-Experts (MoE) architecture but with a more compact parameter size. GLM-4.5-Air also supports hybrid inference modes, offering a "thinking mode" for advanced reasoning and tool use, and a "non-thinking mode" for real-time interaction. Users can control the reasoning behaviour with the `reasoning` `enabled` boolean. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#enable-reasoning-with-default-config)

by z-aiJul 25, 2025131K context$0.13/M input tokens$0.85/M output tokens

Z.ai: GLM 4 32B GLM 4 32B

GLM 4 32B is a cost-effective foundation language model. It can efficiently perform complex tasks and has significantly enhanced capabilities in tool use, online search, and code-related intelligent tasks. It is made by the same lab behind the thudm models.

by z-aiJul 24, 2025128K context