Gemini ↔︎ OpenAI Proxy

July 3, 2025

This program wraps the Gemini CLI and serves Google Gemini 2.5 Pro (or Flash) through an OpenAI-compatible API. It is plug-and-play with clients that already speak the OpenAI API, such as SillyTavern, llama.cpp, LangChain, and the VS Code Cline extension.


Features

| Feature | Notes |
|---|---|
| /v1/chat/completions | Non-stream & stream (SSE). Works with curl, ST, LangChain… |
| Vision support | image_url → Gemini inlineData |
| Function / Tool calling | OpenAI “functions” → Gemini Tool Registry |
| Reasoning / chain-of-thought | Sends enable_thoughts:true, streams <think> chunks. ST shows grey bubbles |
| 1M-token context | Proxy auto-lifts Gemini CLI’s default 200k cap |
| CORS | Enabled (*) by default. Ready for browser apps |
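
The vision and function-calling rows above map onto standard OpenAI request shapes. The sketch below builds such a request body in Python. It is an illustration, not the proxy's own code: the helper names and the get_weather function definition are hypothetical, and it assumes the proxy accepts base64 data URLs in image_url and reads the legacy "functions" field named in the table.

```python
import base64
import json
from typing import Optional


def vision_message(prompt: str, image_bytes: bytes, mime: str = "image/png") -> dict:
    """Build an OpenAI-style user message with text plus one inline image."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            # Assumption: the proxy accepts data URLs here.
            {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{encoded}"}},
        ],
    }


def chat_payload(messages: list, model: str = "gemini-2.5-pro",
                 functions: Optional[list] = None, stream: bool = False) -> dict:
    """Assemble a /v1/chat/completions request body."""
    payload = {"model": model, "messages": messages, "stream": stream}
    if functions:
        # Assumption: the legacy OpenAI "functions" field is what the proxy reads.
        payload["functions"] = functions
    return payload


# Hypothetical function definition, for illustration only.
GET_WEATHER = {
    "name": "get_weather",
    "description": "Look up the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

if __name__ == "__main__":
    message = vision_message("What is in this picture?", b"raw image bytes go here")
    print(json.dumps(chat_payload([message], functions=[GET_WEATHER]), indent=2))
```

The printed JSON can be sent as the request body to /v1/chat/completions, for example with curl -d.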

Quick start

With npm

git clone https://github.com/Brioch/gemini-openai-proxy
cd gemini-openai-proxy
npm i
npm start # launch (runs on port 11434 by default)

With Docker

Alternatively, you can use the provided Dockerfile to build a Docker image.

docker build --tag "gemini-openai-proxy" .
docker run -p 11434:80 -e GEMINI_API_KEY gemini-openai-proxy

Optional env vars

PORT=11434

# One of 'oauth-personal', 'gemini-api-key', or 'vertex-ai'. Use 'oauth-personal' for free access to Gemini 2.5 Pro by logging in with a Google account.
AUTH_TYPE='gemini-api-key'

# API key is only used with AUTH_TYPE='gemini-api-key'
GEMINI_API_KEY=

# Use 'gemini-2.5-flash' or 'gemini-2.5-pro'. Leave empty to let CLI choose its default model.
MODEL=
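
The rules stated in the comments above can be checked before launching the proxy. The following is a minimal pre-flight sketch, not part of the proxy itself: check_env is a hypothetical helper, and it assumes that unset variables are acceptable because the proxy falls back to its defaults.

```python
import os

# Values listed in the env var comments above.
VALID_AUTH_TYPES = ("oauth-personal", "gemini-api-key", "vertex-ai")


def check_env(env) -> list:
    """Return a list of problems found in the given environment mapping."""
    problems = []

    # PORT, when set, must be a usable TCP port number.
    port = env.get("PORT")
    if port:
        try:
            if not 0 < int(port) < 65536:
                problems.append(f"PORT out of range: {port}")
        except ValueError:
            problems.append(f"PORT is not a number: {port}")

    # AUTH_TYPE, when set, must be one of the three documented values.
    auth_type = env.get("AUTH_TYPE")
    if auth_type and auth_type not in VALID_AUTH_TYPES:
        problems.append(f"AUTH_TYPE must be one of {VALID_AUTH_TYPES}, got {auth_type!r}")

    # The API key only matters for the 'gemini-api-key' auth type.
    if auth_type == "gemini-api-key" and not env.get("GEMINI_API_KEY"):
        problems.append("GEMINI_API_KEY is required when AUTH_TYPE='gemini-api-key'")

    return problems


if __name__ == "__main__":
    for problem in check_env(os.environ):
        print("config problem:", problem)
```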

Minimal curl test

curl -X POST http://localhost:11434/v1/chat/completions \
     -H "Content-Type: application/json" \
     -d '{
       "model": "gemini-2.5-pro-latest",
       "messages":[{"role":"user","content":"Hello Gemini!"}]
     }'
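
The same request can be made in streaming mode from Python using only the standard library. This is a sketch under stated assumptions: the helper names are hypothetical, and it assumes the proxy emits OpenAI-style SSE lines ("data: {...}" chunks with choices[].delta.content, terminated by "data: [DONE]").

```python
import json
import urllib.request
from typing import Iterable, Iterator


def iter_sse_content(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text deltas from OpenAI-style SSE lines."""
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


def stream_chat(prompt: str,
                base_url: str = "http://localhost:11434/v1",
                model: str = "gemini-2.5-pro-latest") -> Iterator[str]:
    """POST a streaming chat completion and yield the reply piece by piece."""
    body = json.dumps({
        "model": model,
        "stream": True,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # The HTTP response object is iterable line by line (as bytes).
    with urllib.request.urlopen(request) as response:
        yield from iter_sse_content(response)


if __name__ == "__main__":
    for piece in stream_chat("Hello Gemini!"):
        print(piece, end="", flush=True)
    print()
```

Keeping the parser separate from the network call means it can be exercised on canned lines without a running proxy.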

SillyTavern settings

Set the Chat Completion API base URL to http://127.0.0.1:11434/v1

License

MIT – free for personal & commercial use. Forked from https://huggingface.co/engineofperplexity/gemini-openai-proxy