Using GLM-5.2 / GLM-4.7-Flash Models with Claude Code

The Claude Code CLI (claude) isn't limited to Anthropic's own models. By pointing it at the Anthropic Messages API-compatible endpoint (the "Anthropic Skin") provided by OpenRouter, you can run other models available on OpenRouter as the backend. I tried a setup that assigns z-ai/glm-5.2 as the main conversation model and z-ai/glm-4.7-flash to lightweight background tasks such as title generation, and this post summarizes how to do it.

Reference: https://openrouter.ai/docs/cookbook/coding-agents/claude-code-integration

Prerequisites

  • Your OpenRouter API key must already be set in the OPENROUTER_API_KEY environment variable ($env:OPENROUTER_API_KEY in PowerShell, $OPENROUTER_API_KEY in bash/zsh).
  • Even if you're logged into Claude Code with a claude.ai account, authentication via environment variables takes priority when present. In that case a warning ⚠ claude.ai connectors are disabled ... appears, but it doesn't affect operation.

Required Environment Variables

| Variable | Value | Notes |
| --- | --- | --- |
| ANTHROPIC_BASE_URL | https://openrouter.ai/api | OpenRouter's Anthropic-compatible endpoint |
| ANTHROPIC_AUTH_TOKEN | Your OpenRouter API key | Sent as Authorization: Bearer |
| ANTHROPIC_API_KEY | "" (explicitly empty string) | If unset or non-empty, authentication methods conflict and cause an error |
| ANTHROPIC_MODEL | z-ai/glm-5.2 | The main conversation model. Specify the OpenRouter model ID as-is |
| ANTHROPIC_SMALL_FAST_MODEL | z-ai/glm-4.7-flash | Sub-model for lightweight background tasks such as title generation |
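
The rules in the table can be checked mechanically before launching claude. The following is a minimal Python sketch under stated assumptions: check_env is a hypothetical helper name, not part of Claude Code or OpenRouter, and it only mirrors the rules listed above.

```python
EXPECTED_BASE_URL = "https://openrouter.ai/api"


def check_env(env):
    """Return a list of problems found in a mapping of variable names to values.

    Hypothetical helper: the rules mirror the table above, nothing more.
    """
    problems = []
    if env.get("ANTHROPIC_BASE_URL") != EXPECTED_BASE_URL:
        problems.append("ANTHROPIC_BASE_URL should be " + EXPECTED_BASE_URL)
    if not env.get("ANTHROPIC_AUTH_TOKEN"):
        problems.append("ANTHROPIC_AUTH_TOKEN is missing or empty")
    # The key must be present AND empty; an unset key is reported too.
    if env.get("ANTHROPIC_API_KEY") != "":
        problems.append('ANTHROPIC_API_KEY must be set to ""')
    for name in ("ANTHROPIC_MODEL", "ANTHROPIC_SMALL_FAST_MODEL"):
        if not env.get(name):
            problems.append(name + " is not set")
    return problems
```

The function takes a plain mapping rather than reading the process environment, because the values may be split between the shell and a settings file. Pass it the combined values you expect Claude Code to see, and treat an empty list as "no problems found by these checks".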

Distinguishing Main and Sub Models

ANTHROPIC_MODEL only controls the main conversation model. The lightweight, fast sub-model that Claude Code uses internally for things like title generation and auto-mode detection is specified via ANTHROPIC_SMALL_FAST_MODEL. Setting this variable alone reliably switches the model used for lightweight tasks.

Setup Methods

Method A: Configure per project with a settings file

Configure the following in the repository's .claude/settings.local.json.

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://openrouter.ai/api",
    "ANTHROPIC_API_KEY": "",
    "ANTHROPIC_MODEL": "z-ai/glm-5.2",
    "ANTHROPIC_SMALL_FAST_MODEL": "z-ai/glm-4.7-flash"
  }
}

Don't write ANTHROPIC_AUTH_TOKEN (a secret) into the file. Supply it from the shell via OPENROUTER_API_KEY instead. Add the following line once to your PowerShell profile ($PROFILE), and it will be set automatically in every shell you open afterward.

$env:ANTHROPIC_AUTH_TOKEN = $env:OPENROUTER_API_KEY

For bash/zsh, add this to ~/.bashrc or ~/.zshrc:

export ANTHROPIC_AUTH_TOKEN="$OPENROUTER_API_KEY"

With this in place, running claude -p "..." inside this directory automatically loads the settings from .claude/settings.local.json, and both the main and sub models are used via OpenRouter. Running Claude Code in other projects is unaffected.
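
If .claude/settings.local.json already exists, the env block has to be merged into it rather than pasted over it. The following is a minimal Python sketch of such a merge; merge_env and update_settings_file are hypothetical helper names, and the only assumption about the file is the top-level "env" object shown above.

```python
import json
from pathlib import Path

# Non-secret values from the settings example above.
# ANTHROPIC_AUTH_TOKEN is deliberately left out; it comes from the shell.
GLM_ENV = {
    "ANTHROPIC_BASE_URL": "https://openrouter.ai/api",
    "ANTHROPIC_API_KEY": "",
    "ANTHROPIC_MODEL": "z-ai/glm-5.2",
    "ANTHROPIC_SMALL_FAST_MODEL": "z-ai/glm-4.7-flash",
}


def merge_env(settings, overrides):
    """Return a copy of settings with overrides merged into its "env" block."""
    merged = dict(settings)
    merged["env"] = {**settings.get("env", {}), **overrides}
    return merged


def update_settings_file(path):
    """Merge GLM_ENV into the JSON file at path, creating the file if needed."""
    path = Path(path)
    current = {}
    if path.exists():
        current = json.loads(path.read_text(encoding="utf-8"))
    path.parent.mkdir(parents=True, exist_ok=True)
    merged = merge_env(current, GLM_ENV)
    path.write_text(json.dumps(merged, indent=2) + "\n", encoding="utf-8")
```

Calling update_settings_file(".claude/settings.local.json") from the repository root keeps any other top-level keys and any unrelated entries already in "env", and overwrites only the four variables listed in GLM_ENV.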

Method B: Switch temporarily for a specific command only

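If you don't want to modify your shell profile, set the environment variables only where you run the command. In the bash form, the variables apply to that single claude invocation. In the PowerShell form, they stay set until you close the session.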

PowerShell:

$env:ANTHROPIC_BASE_URL = "https://openrouter.ai/api"
$env:ANTHROPIC_AUTH_TOKEN = $env:OPENROUTER_API_KEY
$env:ANTHROPIC_API_KEY = ""
$env:ANTHROPIC_MODEL = "z-ai/glm-5.2"
$env:ANTHROPIC_SMALL_FAST_MODEL = "z-ai/glm-4.7-flash"
claude -p "What is 1+1?"

bash:

ANTHROPIC_BASE_URL="https://openrouter.ai/api" \
ANTHROPIC_AUTH_TOKEN="$OPENROUTER_API_KEY" \
ANTHROPIC_API_KEY="" \
ANTHROPIC_MODEL="z-ai/glm-5.2" \
ANTHROPIC_SMALL_FAST_MODEL="z-ai/glm-4.7-flash" \
claude -p "What is 1+1?"

Method C: Persist as user environment variables

Persisting the above variables as Windows user environment variables makes OpenRouter/GLM the default in every terminal and project. However, this also affects other projects where you intend to use regular Claude (Anthropic), so Method A or B is recommended unless you actually intend to change the global default.

Verification

I confirmed with the following command that z-ai/glm-5.2 is recorded in the response and in the billing log (modelUsage).

claude -p "Reply with exactly: OK-GLM-TEST" --output-format json

{
  "result": "OK-GLM-TEST",
  "modelUsage": {
    "z-ai/glm-5.2": { "inputTokens": 3272, "outputTokens": 8, "costUSD": 0.028336 }
  }
}
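
The same check can be scripted. The following is a minimal Python sketch that reads the JSON output and lists which model IDs appear in modelUsage; models_used is a hypothetical helper name, and it assumes only the fields visible in the output above (modelUsage and costUSD).

```python
import json


def models_used(output_text):
    """Map each model ID found in modelUsage to its reported costUSD."""
    data = json.loads(output_text)
    usage = data.get("modelUsage", {})
    return {model: stats.get("costUSD") for model, stats in usage.items()}


# The output shown above, copied as-is.
sample = """
{
  "result": "OK-GLM-TEST",
  "modelUsage": {
    "z-ai/glm-5.2": {"inputTokens": 3272, "outputTokens": 8, "costUSD": 0.028336}
  }
}
"""

print(models_used(sample))  # {'z-ai/glm-5.2': 0.028336}
```

If an Anthropic model ID shows up as a key here, the environment variables were probably not picked up.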

Also, running a slightly longer prompt with --debug all confirmed that dispatching to both the main model and the sub model actually occurs.

[API:timing] dispatching to firstParty model=z-ai/glm-4.7-flash
[API:timing] dispatching to firstParty model=z-ai/glm-5.2

For a single prompt, the lightweight task goes to z-ai/glm-4.7-flash and the main response goes to z-ai/glm-5.2, so each model is used for its intended role.
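
To tally the dispatch lines from a longer debug log, a small script helps. The following is a minimal Python sketch; count_dispatches is a hypothetical helper name, and the pattern is inferred only from the two log lines quoted above, so it may need adjusting if the log format differs between Claude Code versions.

```python
import re
from collections import Counter

# Matches the "model=<id>" part of the dispatch lines quoted above.
DISPATCH_RE = re.compile(r"dispatching to \S+ model=(\S+)")


def count_dispatches(log_text):
    """Count how many dispatch lines mention each model ID."""
    counts = Counter()
    for line in log_text.splitlines():
        match = DISPATCH_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


sample_log = """\
[API:timing] dispatching to firstParty model=z-ai/glm-4.7-flash
[API:timing] dispatching to firstParty model=z-ai/glm-5.2
"""

for model, n in count_dispatches(sample_log).items():
    print(model, n)
```

For the two lines above, this reports one dispatch for each model.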

Known Caveat: Request Proliferation in Interactive Sessions

In an interactive claude session (whether opened from VS Code or launched directly in a TTY), even light input resulted in 13 API calls being recorded within the same minute in OpenRouter's Activity log. Of those, 12 were small calls to the sub model (GLM 4.7 Flash), and only the last one was the main response from the main model (GLM 5.2). Total cost stayed around $0.0094.

On the other hand, I confirmed that a single run via claude -p (non-interactive, print mode) barely causes this proliferation. If you want to keep costs down, using -p is safer. If you regularly use interactive sessions, it's a good idea to set ANTHROPIC_SMALL_FAST_MODEL to route requests to a cheap model while monitoring usage on OpenRouter's Activity dashboard.
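
To put the interactive-session figures in proportion, the arithmetic can be worked out directly. The following Python sketch uses only the numbers reported above (13 calls, 12 of them to the sub model, about $0.0094 in total). The per-call average is a derived figure, not something OpenRouter reports.

```python
total_calls = 13
sub_model_calls = 12
total_cost_usd = 0.0094

# Fraction of calls that went to the sub model.
sub_share = sub_model_calls / total_calls
# Average cost across all calls, main and sub model combined.
avg_cost_per_call = total_cost_usd / total_calls

print(f"sub-model share of calls: {sub_share:.0%}")        # 92%
print(f"average cost per call: ${avg_cost_per_call:.5f}")  # $0.00072
```

Because roughly nine out of ten calls in this sample went to the sub model, the price of the model set in ANTHROPIC_SMALL_FAST_MODEL matters a great deal in interactive use.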

Troubleshooting

  • The warning ⚠ claude.ai connectors are disabled because ANTHROPIC_API_KEY or another auth source is set... can be ignored (it only disables claude.ai's connector feature; the API calls themselves still succeed).
  • Leaving ANTHROPIC_API_KEY unset instead of an empty string can make the authentication method ambiguous and cause an error. Always explicitly set it to "".
  • To switch to a different model, just change ANTHROPIC_MODEL to the model ID on OpenRouter (e.g. anthropic/claude-opus-4-8, openai/gpt-5, etc.).
  • Tool search optimization (ENABLE_TOOL_SEARCH) is disabled by default on non-Anthropic hosts. Add ENABLE_TOOL_SEARCH=true if you need it.