Using GLM-5.2 / GLM-4.7-Flash Models with Claude Code
The Claude Code CLI (claude) isn't limited to Anthropic's own models. By pointing it at the
Anthropic Messages API-compatible endpoint (the "Anthropic Skin") provided by OpenRouter,
you can use any model that OpenRouter serves as the backend. Here I summarize a setup that
assigns z-ai/glm-5.2 as the main conversation model and z-ai/glm-4.7-flash to lightweight
background tasks such as title generation.
Reference: https://openrouter.ai/docs/cookbook/coding-agents/claude-code-integration
Prerequisites
- $env:OPENROUTER_API_KEY must already be set to your OpenRouter API key.
- Even if you're logged into Claude Code with a claude.ai account, authentication via environment variables takes priority when present. In that case a warning (⚠ claude.ai connectors are disabled ...) appears, but it doesn't affect operation.
Required Environment Variables
| Variable | Value | Notes |
|---|---|---|
| ANTHROPIC_BASE_URL | https://openrouter.ai/api | OpenRouter's Anthropic-compatible endpoint |
| ANTHROPIC_AUTH_TOKEN | Your OpenRouter API key | Sent as Authorization: Bearer |
| ANTHROPIC_API_KEY | "" (explicitly empty string) | If unset or non-empty, authentication methods conflict and cause an error |
| ANTHROPIC_MODEL | z-ai/glm-5.2 | The main conversation model. Specify the OpenRouter model ID as-is |
| ANTHROPIC_SMALL_FAST_MODEL | z-ai/glm-4.7-flash | Sub-model for lightweight background tasks such as title generation |
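When the variables are supplied from the shell rather than from a settings file, the rules in the table can be checked before launching claude. The following is a minimal sketch in POSIX sh; check_openrouter_env is a hypothetical helper name, not part of Claude Code, and it only inspects the current shell's variables (it does not verify that they are exported).

```shell
# Preflight check for the variables listed in the table (POSIX sh).
# check_openrouter_env is a hypothetical helper, not a Claude Code command.
check_openrouter_env() {
  rc=0

  # ANTHROPIC_BASE_URL must point at OpenRouter's Anthropic-compatible endpoint.
  if [ "${ANTHROPIC_BASE_URL:-}" != "https://openrouter.ai/api" ]; then
    echo "ANTHROPIC_BASE_URL should be https://openrouter.ai/api" >&2
    rc=1
  fi

  # ANTHROPIC_AUTH_TOKEN carries the OpenRouter key and must be non-empty.
  if [ -z "${ANTHROPIC_AUTH_TOKEN:-}" ]; then
    echo "ANTHROPIC_AUTH_TOKEN is empty or unset" >&2
    rc=1
  fi

  # ANTHROPIC_API_KEY must be set, and set to the empty string.
  # ${VAR+x} expands to "x" only when VAR is set, even if it is empty.
  if [ -z "${ANTHROPIC_API_KEY+x}" ] || [ -n "${ANTHROPIC_API_KEY}" ]; then
    echo 'ANTHROPIC_API_KEY should be set to ""' >&2
    rc=1
  fi

  # Both model variables should hold an OpenRouter model ID.
  for name in ANTHROPIC_MODEL ANTHROPIC_SMALL_FAST_MODEL; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name is empty or unset" >&2
      rc=1
    fi
  done

  return "$rc"
}
```

A typical use would be to gate the launch on the check, for example check_openrouter_env && claude -p "...".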
Distinguishing Main and Sub Models
ANTHROPIC_MODEL only controls the main conversation model. The lightweight, fast
sub-model that Claude Code uses internally for things like title generation and
auto-mode detection is specified via ANTHROPIC_SMALL_FAST_MODEL. Setting this variable
alone reliably switches the model used for lightweight tasks.
Setup Methods
Method A: Enable per project (recommended)
Configure the following in the repository's .claude/settings.local.json.
{
"env": {
"ANTHROPIC_BASE_URL": "https://openrouter.ai/api",
"ANTHROPIC_API_KEY": "",
"ANTHROPIC_MODEL": "z-ai/glm-5.2",
"ANTHROPIC_SMALL_FAST_MODEL": "z-ai/glm-4.7-flash"
}
}
Don't write ANTHROPIC_AUTH_TOKEN (a secret) into the file; supply it from the shell
side via OPENROUTER_API_KEY instead. Add the following line once to your PowerShell profile
($PROFILE) and it will be set automatically in every shell you open afterward.
$env:ANTHROPIC_AUTH_TOKEN = $env:OPENROUTER_API_KEY
For bash/zsh, add this to ~/.bashrc or ~/.zshrc:
export ANTHROPIC_AUTH_TOKEN="$OPENROUTER_API_KEY"
With this in place, running claude -p "..." inside this directory automatically loads
the settings from .claude/settings.local.json, and both the main and sub models are used
via OpenRouter. Running Claude Code in other projects is unaffected.
Method B: Switch temporarily for a specific command only
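If you don't want to modify your shell profile, set the environment variables at the point where you run the command. In bash, the prefix form below applies them to that single command only; in PowerShell, the $env: assignments stay in effect for the rest of the current session.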
PowerShell:
$env:ANTHROPIC_BASE_URL = "https://openrouter.ai/api"
$env:ANTHROPIC_AUTH_TOKEN = $env:OPENROUTER_API_KEY
$env:ANTHROPIC_API_KEY = ""
$env:ANTHROPIC_MODEL = "z-ai/glm-5.2"
$env:ANTHROPIC_SMALL_FAST_MODEL = "z-ai/glm-4.7-flash"
claude -p "What is 1+1?"
bash:
ANTHROPIC_BASE_URL="https://openrouter.ai/api" \
ANTHROPIC_AUTH_TOKEN="$OPENROUTER_API_KEY" \
ANTHROPIC_API_KEY="" \
ANTHROPIC_MODEL="z-ai/glm-5.2" \
ANTHROPIC_SMALL_FAST_MODEL="z-ai/glm-4.7-flash" \
claude -p "What is 1+1?"
Method C: Make it the machine-wide default (not recommended)
Persisting the above variables as Windows user environment variables makes OpenRouter/GLM the default in every terminal and project. However, this also affects other projects where you intend to use regular Claude (Anthropic), so Method A or B is recommended unless you deliberately want to change the global default.
Verification
I confirmed with the following command that z-ai/glm-5.2 is recorded in the response and
in the billing log (modelUsage).
claude -p "Reply with exactly: OK-GLM-TEST" --output-format json
{
"result": "OK-GLM-TEST",
"modelUsage": {
"z-ai/glm-5.2": { "inputTokens": 3272, "outputTokens": 8, "costUSD": 0.028336 }
}
}
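The same check can be scripted so that a run fails loudly when the expected model is missing from the output. This is a rough sketch: expect_model is a hypothetical helper, and it does a plain text match for the quoted model ID rather than parsing the JSON, so it would also match if that quoted string happened to appear elsewhere in the output.

```shell
# expect_model MODEL_ID
# Reads JSON text on stdin and succeeds if the quoted MODEL_ID appears in it.
# Hypothetical helper; a coarse fixed-string match, not a real JSON parse.
expect_model() {
  grep -qF "\"$1\""
}
```

For example, claude -p "Reply with exactly: OK-GLM-TEST" --output-format json | expect_model "z-ai/glm-5.2" exits with a non-zero status if the model ID is absent.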
Also, running a slightly longer prompt with --debug all confirmed that dispatching to
both the main model and the sub model actually occurs.
[API:timing] dispatching to firstParty model=z-ai/glm-4.7-flash
[API:timing] dispatching to firstParty model=z-ai/glm-5.2
For a single prompt, the lightweight task goes to z-ai/glm-4.7-flash and the main
response goes to z-ai/glm-5.2, so each model is used for its intended role.
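For longer sessions it may be easier to tally the dispatch lines per model than to read them one by one. The sketch below assumes the debug output has been captured as text and that the lines keep the exact format shown above; count_dispatches is a hypothetical helper name.

```shell
# count_dispatches
# Reads debug log text on stdin and prints "MODEL COUNT" for each model seen.
# Assumes lines of the form: "... dispatching to firstParty model=<id>".
count_dispatches() {
  sed -n 's/.*dispatching to firstParty model=\([^ ]*\).*/\1/p' \
    | sort \
    | uniq -c \
    | awk '{ print $2, $1 }'
}
```

If the captured log were saved as debug.log (a placeholder file name), it could be run as count_dispatches < debug.log.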
Known Caveat: Request Proliferation in Interactive Sessions
In an interactive claude session (whether opened from VS Code or launched directly in a
TTY), even light input resulted in 13 API calls being recorded within the same minute in
OpenRouter's Activity log. Of those, 12 were small calls to the sub model (GLM 4.7 Flash),
and only the last one was the main response from the main model (GLM 5.2). Total cost stayed
around $0.0094.
By contrast, I confirmed that a single run via claude -p (non-interactive, print
mode) causes almost none of this proliferation. If you want to keep costs down, using -p is safer.
If you regularly use interactive sessions, it's a good idea to set
ANTHROPIC_SMALL_FAST_MODEL to route requests to a cheap model while monitoring usage on
OpenRouter's Activity dashboard.
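To put the figures from that session in proportion, the sub-model's share of calls and the average cost per call can be worked out with a few lines of awk. This is only arithmetic on the numbers quoted above; session_stats is a hypothetical helper, not an OpenRouter or Claude Code command.

```shell
# session_stats CALLS SUB_CALLS TOTAL_USD
# Prints the sub-model share of calls and the average cost per call.
# Hypothetical helper for back-of-the-envelope checks.
session_stats() {
  awk -v calls="$1" -v sub_calls="$2" -v total="$3" 'BEGIN {
    printf "sub-model share: %.1f%%\n", 100 * sub_calls / calls
    printf "average cost per call: $%.6f\n", total / calls
  }'
}

# Figures from the interactive session described above:
# 13 calls in total, 12 of them to the sub model, about $0.0094 overall.
session_stats 13 12 0.0094
```

With those inputs the sub model accounts for 92.3% of the calls, and the average comes to roughly $0.000723 per call.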
Troubleshooting
- The warning ⚠ claude.ai connectors are disabled because ANTHROPIC_API_KEY or another auth source is set... can be ignored (it only disables claude.ai's connector feature; the API calls themselves still succeed).
- Leaving ANTHROPIC_API_KEY unset instead of an empty string can make the authentication method ambiguous and cause an error. Always explicitly set it to "".
- To switch to a different model, just change ANTHROPIC_MODEL to the model ID on OpenRouter (e.g. anthropic/claude-opus-4-8, openai/gpt-5, etc.).
- Tool search optimization (ENABLE_TOOL_SEARCH) is disabled by default on non-Anthropic hosts. Add ENABLE_TOOL_SEARCH=true if you need it.