Gemini CLI Proxy

August 13, 2025

Wrap Gemini CLI as an OpenAI-compatible API service, so you can use the free Gemini 2.5 Pro model through an API.

✨ Features

🔌 OpenAI API Compatible: Implements /v1/chat/completions endpoint
🚀 Quick Setup: Zero-config run with uvx
⚡ High Performance: Built on FastAPI + asyncio with concurrent request support

To learn how this tool works, see my blog post (in Chinese).
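
As a rough illustration of the idea (not the project's actual code), the sketch below shows one way an asyncio service could cap the number of concurrent subprocesses and enforce a per-command timeout, mirroring the --max-concurrency and --timeout options described later. The function names are hypothetical, and the demo commands use the Python interpreter as a stand-in for the gemini command.

```python
import asyncio
import sys


async def run_limited(cmd, semaphore, timeout):
    """Run one command while holding a semaphore slot; return its stdout as text."""
    async with semaphore:  # at most max_concurrency commands run at once
        proc = await asyncio.create_subprocess_exec(
            *cmd,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
        )
        try:
            stdout, _ = await asyncio.wait_for(proc.communicate(), timeout=timeout)
        except asyncio.TimeoutError:
            # Do not leave a stuck child process behind.
            proc.kill()
            await proc.wait()
            raise
        return stdout.decode().strip()


async def run_all(commands, max_concurrency=4, timeout=30.0):
    """Run every command, never more than max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)
    tasks = (run_limited(cmd, semaphore, timeout) for cmd in commands)
    return await asyncio.gather(*tasks)  # results keep the input order


# Stand-in commands for demonstration; a real proxy would invoke gemini here.
DEMO_COMMANDS = [[sys.executable, "-c", f"print({i})"] for i in range(3)]
```

Calling asyncio.run(run_all(DEMO_COMMANDS, max_concurrency=2)) runs the three stand-in commands with no more than two alive at any moment.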

🚀 Quick Start

Network Configuration

Gemini CLI needs to reach Google services, so in some network environments you may need to configure a proxy for your terminal:

# Configure proxy (adjust according to your proxy server)
export https_proxy=http://127.0.0.1:7890
export http_proxy=http://127.0.0.1:7890  
export all_proxy=socks5://127.0.0.1:7890
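
To confirm that these variables are visible to programs started from the same shell, a small check like the following can help. It is only a sketch: the helper name proxy_settings is hypothetical, and checking the uppercase spellings as well is an assumption based on the fact that many tools accept either form.

```python
import os

# The variable names exported above.
PROXY_VARS = ("https_proxy", "http_proxy", "all_proxy")


def proxy_settings(env=None):
    """Return {name: value} for each proxy variable that is set and non-empty."""
    env = os.environ if env is None else env
    found = {}
    for name in PROXY_VARS:
        # Assumption: uppercase variants (e.g. HTTPS_PROXY) are checked as a fallback.
        value = env.get(name) or env.get(name.upper())
        if value:
            found[name] = value
    return found
```

An empty result from proxy_settings() means none of the variables reached the current process.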

Install Gemini CLI

Install Gemini CLI globally with npm:

npm install -g @google/gemini-cli

After installation, run Gemini CLI with the gemini command. Start it once interactively to log in and complete the initial configuration.

Once configuration is complete, confirm that the following command runs successfully:

gemini -p "Hello, Gemini"
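
The same check can be scripted. The sketch below assumes only what is shown above (a gemini executable that accepts -p followed by a prompt); the helper names and the 60-second default timeout are illustrative choices, not part of either tool.

```python
import shutil
import subprocess


def gemini_command(prompt, executable="gemini"):
    """Build the argument list for a one-shot, non-interactive prompt."""
    return [executable, "-p", prompt]


def check_gemini(prompt="Hello, Gemini", executable="gemini", timeout=60):
    """Return (ok, message): whether the CLI ran, plus its output or the failure reason."""
    path = shutil.which(executable)
    if path is None:
        return False, f"{executable} not found on PATH"
    try:
        result = subprocess.run(
            gemini_command(prompt, path),  # use the resolved path to the executable
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout}s"
    if result.returncode != 0:
        return False, result.stderr.strip() or f"exit code {result.returncode}"
    return True, result.stdout.strip()
```

If check_gemini() returns False, the message should indicate whether the CLI is missing, timed out, or exited with an error.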

Start Gemini CLI Proxy

Method 1: Run directly with uvx:

uvx gemini-cli-proxy

Method 2: Clone this repository and run:

uv run gemini-cli-proxy

Gemini CLI Proxy listens on port 8765 by default. You can customize the startup port with the --port parameter.
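
When starting the proxy from a script, it can be useful to wait until the port accepts connections before sending requests. The following is a minimal sketch using only the standard library; wait_for_port is a hypothetical helper, and its default host and port simply repeat the defaults documented in this README.

```python
import socket
import time


def wait_for_port(host="127.0.0.1", port=8765, timeout=10.0, interval=0.2):
    """Poll until a TCP connection to host:port succeeds; return False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(interval)  # not up yet, retry shortly
    return False
```

A successful connection only shows that something is listening on the port; it does not prove that the Gemini CLI login is valid.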

After startup, test the service with curl:

curl http://localhost:8765/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer dummy-key" \
  -d '{
    "model": "gemini-2.5-pro",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
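
The same request can be sent from Python without third-party packages. This sketch mirrors the curl call above; the function names are hypothetical, and the response parsing assumes the usual OpenAI chat completion shape (choices[0].message.content), which is what the OpenAI client example in this README relies on.

```python
import json
import urllib.request


def build_chat_request(prompt, model="gemini-2.5-pro",
                       base_url="http://localhost:8765/v1", api_key="dummy-key"):
    """Build a POST request equivalent to the curl command above."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send_chat(prompt, timeout=60, **kwargs):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the proxy running, print(send_chat("Hello!")) should print the model's reply.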

Usage Examples

OpenAI Client

from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:8765/v1',
    api_key='dummy-key'  # Any string works
)

response = client.chat.completions.create(
    model='gemini-2.5-pro',
    messages=[
        {'role': 'user', 'content': 'Hello!'}
    ],
)

print(response.choices[0].message.content)

Cherry Studio

Add a model provider in the Cherry Studio settings:

Provider Type: OpenAI
API Host: http://localhost:8765
API Key: Any string works
Model Name: gemini-2.5-pro or gemini-2.5-flash

⚙️ Configuration Options

List the command-line options:

gemini-cli-proxy --help

Available options:

--host: Server host address (default: 127.0.0.1)
--port: Server port (default: 8765)
--rate-limit: Max requests per minute (default: 60)
--max-concurrency: Max concurrent subprocesses (default: 4)
--timeout: Gemini CLI command timeout in seconds (default: 30.0)
--debug: Enable debug mode (enables debug logging and file watching)
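
If you launch the proxy from a script, the options above can be assembled into an argument list. This is a sketch: build_proxy_command is a hypothetical helper, its defaults copy the values listed above, and it assumes each option takes its value as the next argument, with --debug as a plain switch.

```python
def build_proxy_command(host="127.0.0.1", port=8765, rate_limit=60,
                        max_concurrency=4, timeout=30.0, debug=False):
    """Return the argv list for gemini-cli-proxy with the given options."""
    command = [
        "gemini-cli-proxy",
        "--host", host,
        "--port", str(port),
        "--rate-limit", str(rate_limit),
        "--max-concurrency", str(max_concurrency),
        "--timeout", str(timeout),
    ]
    if debug:
        command.append("--debug")  # assumed to be a switch with no value
    return command


# Example: subprocess.Popen(build_proxy_command(port=9000, timeout=120.0))
```

Passing a list instead of a shell string avoids quoting problems when a value contains spaces.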

❓ FAQ

Q: Why do requests keep timing out?

A: This is usually a network connectivity issue. Gemini CLI needs to reach Google services, which may require a proxy in some regions:

# Configure proxy (adjust according to your proxy server)
export https_proxy=http://127.0.0.1:7890
export http_proxy=http://127.0.0.1:7890
export all_proxy=socks5://127.0.0.1:7890

# Then start the service
uvx gemini-cli-proxy

📄 License

MIT License

🤝 Contributing

Issues and Pull Requests are welcome!