I Built “llmglot” an LLM API Translation Proxy That Connects Claude Code, Codex, and Gemini

June 28, 2026

In AI development, people are increasingly struggling more with API specification differences than with the models themselves.

Claude Code uses the Anthropic Messages API. OpenAI-based tools use the Chat Completions API or the Responses API. Gemini has its own proprietary API, and local LLM environments such as Ollama and LM Studio provide OpenAI-compatible APIs.

Even when the model you want exists, you may not be able to connect to it because the API specifications differ. You need to rewrite SDKs, implement translation layers, and separate configurations for each environment.

llmglot is the tool that solves these problems.

GitHub: https://github.com/Himeyama/llmglot

What is llmglot?

llmglot is a proxy server that converts multiple LLM APIs back and forth.

It accepts Anthropic Messages API, OpenAI Responses API, OpenAI Chat Completions API, Gemini API, and more, then appropriately converts and forwards them to the upstream API.

In other words, even if the API specification required by the client does not match the API specification of the model you actually want to use, you can still use it.

Supported Scope

On the client side, it supports:

Claude Code
Anthropic SDK
OpenAI SDK
Responses API clients
Gemini SDK

On the upstream side, it can connect to:

Ollama
LM Studio
OpenAI
Gemini
OpenRouter
Azure OpenAI
vLLM

For example, it enables connections such as:

Claude Code → Ollama
OpenAI SDK → Gemini
Responses API → Chat Completions API

Connecting Claude Code to a Local LLM

One use case for llmglot is connecting Claude Code to a local LLM.

For example, if you are using Ollama, start it with:

CHAT_BASE_URL=http://localhost:11434/v1 llmglot

Then simply change Claude Code’s connection target to llmglot, and you can use a local LLM from Claude Code.

Being able to try a code agent without using an expensive API is extremely appealing.

(Note: Ollama actually supports the Messages API, so a proxy is not really necessary.)

Supports the Responses API Too

Recently, more and more tools have been using the Responses API.

llmglot implements the /v1/responses endpoint and supports not only HTTP but also WebSocket.

That makes it easy to connect with Responses API-based clients such as the Codex CLI.

Useful Logging Features

llmglot includes a log display feature.

The information you can check includes:

Model name
Provider
Input token count
Output token count
Cache usage
Estimated cost
Generation speed

In environments where multiple LLMs are used, the benefit of centrally managing usage information is significant.

Differences from LiteLLM

At this point, the question is: how is this different from LiteLLM?

Both are software that sits between LLMs, but the direction they aim for is very different.

LiteLLM’s Approach

LiteLLM unifies many LLM providers into OpenAI format.

In other words:

App
    ↓
LiteLLM
    ↓
Each LLM provider

You could say it is a mechanism for application developers to use a unified API.

llmglot’s Approach

llmglot, on the other hand, is structured like this:

Claude Code
Codex CLI
Gemini SDK
       ↓
     llmglot
       ↓
Each LLM provider

Its goal is to translate the client-side API specification.

Comparing with Claude Code

Claude Code uses the Anthropic Messages API.

LiteLLM is basically centered on OpenAI-compatible APIs, so connecting Claude Code directly is difficult.

llmglot, on the other hand, accepts the Anthropic Messages API directly.

That means the following configuration is possible:

Claude Code
    ↓
llmglot
    ↓
Ollama

This is a very major feature of llmglot.

Which Should You Choose?

LiteLLM is a good fit for:

People developing AI applications
People who want to use the OpenAI SDK as a unified interface
People who want load balancing or fallback

llmglot is a good fit for:

People who want to run Claude Code with a different model
People who want to use local LLMs from existing clients
People who want to convert the Responses API to another API
People who do not want to rewrite SDKs

Conclusion

In the LLM world, API differences are becoming a bigger problem than the models themselves.

You want to use Claude Code, but with Ollama. You want to use the OpenAI SDK, but with Gemini. Demands like these will likely continue to grow.

llmglot sits in the middle and absorbs the differences in API specifications.

Claude Code → Ollama
OpenAI SDK → Gemini
Responses API → Chat Completions API
Gemini API → OpenAI-compatible API

If LiteLLM is a “unified API for developers,” llmglot could be described as a “client-compatible proxy.”

What is llmglot?​

Supported Scope​

Connecting Claude Code to a Local LLM​

Supports the Responses API Too​

Useful Logging Features​

Differences from LiteLLM

LiteLLM’s Approach​

llmglot’s Approach​

Comparing with Claude Code​

Which Should You Choose?​

Conclusion​