THUDM: GLM 4 9B (free)

thudm/glm-4-9b:free

Created Apr 25, 2025 · 32,000 context
$0/M input tokens · $0/M output tokens

GLM-4-9B-0414 is a 9 billion parameter language model from the GLM-4 series developed by THUDM. Trained using the same reinforcement learning and alignment strategies as its larger 32B counterparts, GLM-4-9B-0414 achieves high performance relative to its size, making it suitable for resource-constrained deployments that still require robust language understanding and generation capabilities.

Providers for GLM 4 9B (free)

OpenRouter routes requests to the best providers that are able to handle your prompt size and parameters, with fallbacks to maximize uptime.

Context: 32K
Max Output: 32K
Input: $0/M tokens
Output: $0/M tokens

Apps using GLM 4 9B (free)

Top public apps this week using this model

1. Cline - Autonomous coding agent right in your IDE - 4.8M tokens
2. Roo Code - A whole dev team of AI agents in your editor - 1.35M tokens
3. SillyTavern - LLM frontend for power users - 925K tokens
4. OpenRouter: Chatroom - Chat with multiple LLMs at once - 385K tokens
5. Open WebUI - Extensible, self-hosted AI interface - 85K tokens
6. Chub AI - GenAI for everyone - 83K tokens
7. Agnaistic - A "bring your own AI" chat service - 56K tokens
8. WooCommerce AI Description Plugin (new) - 41K tokens
9. RisuAI - Browse characters, choose models, and chat - 37K tokens
10. liteLLM - Open-source library to simplify LLM calls - 15K tokens
11. New API (new) - 13K tokens
12. Cherry Studio (new) - 12K tokens
13. Apollo: Open Intelligence (new) - 7K tokens
14. Voices of the Court (new) - 6K tokens
15. PoeServer (new) - 6K tokens
16. Msty (new) - 4K tokens
17. Immersive Translation (new) - 4K tokens
18. Future Fiction Academy (Raptor Write) (new) - 3K tokens
19. Page Assist (new) - 2K tokens
20. Generador Letras AI (new) - 2K tokens

Recent activity on GLM 4 9B (free)

Tokens processed per day, Apr 25 to May 13 (chart scale 0 to 16M tokens).

Uptime stats for GLM 4 9B (free)

Uptime for this model on its only provider.

When an upstream provider returns an error, we can recover by routing to another healthy provider, if your request filters allow it.

Learn more about our load balancing and customization options.
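
As a minimal sketch of those options, routing preferences can be passed in the request body alongside the usual chat parameters. The provider object and its allow_fallbacks field below follow OpenRouter's provider-routing options; treat the exact field names as assumptions to verify against the current API reference.

from openai import OpenAI

client = OpenAI(
  base_url="https://openrouter.ai/api/v1",
  api_key="<OPENROUTER_API_KEY>",
)

# Sketch: allow OpenRouter to retry on another healthy provider if one errors.
# The "provider" object follows the provider-routing options; confirm the
# field names against the current OpenRouter API reference.
completion = client.chat.completions.create(
  model="thudm/glm-4-9b:free",
  messages=[{"role": "user", "content": "Hello"}],
  extra_body={
    "provider": {
      "allow_fallbacks": True
    }
  },
)
print(completion.choices[0].message.content)

For a model served by a single provider there may be nothing to fall back to, but the same request shape applies to any model.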

Sample code and API for GLM 4 9B (free)

OpenRouter normalizes requests and responses across providers for you.

OpenRouter provides an OpenAI-compatible completion API for 300+ models and providers that you can call directly or through the OpenAI SDK. Additionally, some third-party SDKs are available.

In the examples below, the OpenRouter-specific headers are optional. Setting them allows your app to appear on the OpenRouter leaderboards.

from openai import OpenAI

client = OpenAI(
  base_url="https://openrouter.ai/api/v1",
  api_key="<OPENROUTER_API_KEY>",
)

completion = client.chat.completions.create(
  extra_headers={
    "HTTP-Referer": "<YOUR_SITE_URL>", # Optional. Site URL for rankings on openrouter.ai.
    "X-Title": "<YOUR_SITE_NAME>", # Optional. Site title for rankings on openrouter.ai.
  },
  extra_body={},
  model="thudm/glm-4-9b:free",
  messages=[
    {
      "role": "user",
      "content": "What is the meaning of life?"
    }
  ]
)
print(completion.choices[0].message.content)
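
The same endpoint can also be called directly over HTTP without an SDK. Below is a minimal sketch using the requests library against the chat completions route used by the SDK example above.

import requests

response = requests.post(
  "https://openrouter.ai/api/v1/chat/completions",
  headers={
    "Authorization": "Bearer <OPENROUTER_API_KEY>",
    "HTTP-Referer": "<YOUR_SITE_URL>",  # Optional. Site URL for rankings on openrouter.ai.
    "X-Title": "<YOUR_SITE_NAME>",  # Optional. Site title for rankings on openrouter.ai.
  },
  json={
    "model": "thudm/glm-4-9b:free",
    "messages": [
      {"role": "user", "content": "What is the meaning of life?"}
    ]
  },
)
print(response.json()["choices"][0]["message"]["content"])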

Using third-party SDKs

For information about using third-party SDKs and frameworks with OpenRouter, please see our frameworks documentation.

See the Request docs for all possible fields, and Parameters for explanations of specific sampling parameters.
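
As a short illustration of those parameters, values such as temperature, top_p, and max_tokens can be passed on the same create call; the numbers below are arbitrary placeholders rather than recommended settings.

from openai import OpenAI

client = OpenAI(
  base_url="https://openrouter.ai/api/v1",
  api_key="<OPENROUTER_API_KEY>",
)

completion = client.chat.completions.create(
  model="thudm/glm-4-9b:free",
  messages=[{"role": "user", "content": "Summarize GLM-4-9B-0414 in one sentence."}],
  temperature=0.7,  # lower values make output more deterministic
  top_p=0.9,  # nucleus sampling cutoff
  max_tokens=256,  # cap on generated tokens
)
print(completion.choices[0].message.content)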

More models from thudm
