GPT-Load

May 29, 2026

GPT-Load is a high-performance, enterprise-grade transparent proxy for AI APIs, aimed at enterprises and developers who need to integrate multiple AI services. Built with Go, it provides intelligent key management, load balancing, and comprehensive monitoring for high-concurrency production environments.

For detailed documentation, see the official documentation.

Features

  • Transparent Proxy: Complete preservation of native API formats, supporting OpenAI, Google Gemini, and Anthropic Claude among other formats
  • Intelligent Key Management: High-performance key pool with group-based management, automatic rotation, and failure recovery
  • Load Balancing: Weighted load balancing across multiple upstream endpoints to enhance service availability
  • Smart Failure Handling: Automatic key blacklist management and recovery mechanisms to ensure service continuity
  • Dynamic Configuration: System settings and group configurations support hot-reload without requiring restarts
  • Enterprise Architecture: Distributed leader-follower deployment supporting horizontal scaling and high availability
  • Modern Management: Vue 3-based web management interface that is intuitive and user-friendly
  • Comprehensive Monitoring: Real-time statistics, health checks, and detailed request logging
  • High-Performance Design: Zero-copy streaming, connection pool reuse, and atomic operations
  • Production Ready: Graceful shutdown, error recovery, and comprehensive security mechanisms
  • Dual Authentication: Separate authentication for management and proxy, with proxy authentication supporting global and group-level keys

Supported AI Services

GPT-Load serves as a transparent proxy service, completely preserving the native API formats of various AI service providers:

  • OpenAI Format: Official OpenAI API, Azure OpenAI, and other OpenAI-compatible services
  • Google Gemini Format: Native APIs for Gemini Pro, Gemini Pro Vision, and other models
  • Anthropic Claude Format: Claude series models, supporting high-quality conversations and text generation

Quick Start

System Requirements

  • Go 1.24+ (for source builds)
  • Docker (for containerized deployment)
  • MySQL, PostgreSQL, or SQLite (for database storage)
  • Redis (for caching and distributed coordination, optional)

Method 1: Docker Quick Start

docker run -d --name gpt-load \
    -p 3001:3001 \
    -e AUTH_KEY=your-secure-key-here \
    -v "$(pwd)/data":/app/data \
    ghcr.io/tbphp/gpt-load:latest

Change your-secure-key-here to a strong password (never use the default value). You can then log in to the management interface at http://localhost:3001

Method 2: Docker Compose

Installation Commands:

# Create Directory
mkdir -p gpt-load && cd gpt-load

# Download configuration files
wget https://raw.githubusercontent.com/tbphp/gpt-load/refs/heads/main/docker-compose.yml
wget -O .env https://raw.githubusercontent.com/tbphp/gpt-load/refs/heads/main/.env.example

# Edit the .env file and change AUTH_KEY to a strong password. Never use default or simple keys like sk-123456.

# Start services
docker compose up -d

Before deployment, you must change the default admin key (AUTH_KEY). A recommended format is: sk-prod-[32-character random string].
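
One way to produce a key in the recommended format is sketched below with Python's standard secrets module. The helper name generate_auth_key and the alphanumeric alphabet are illustrative assumptions and are not part of GPT-Load.

```python
import secrets
import string

# Hypothetical helper: builds a key shaped like sk-prod-[32-character random string].
ALPHABET = string.ascii_letters + string.digits


def generate_auth_key(length: int = 32) -> str:
    """Return 'sk-prod-' followed by `length` cryptographically random characters."""
    suffix = "".join(secrets.choice(ALPHABET) for _ in range(length))
    return f"sk-prod-{suffix}"


if __name__ == "__main__":
    # Paste the printed value into AUTH_KEY in your .env file.
    print(generate_auth_key())
```

The secrets module draws from the operating system's secure random source, which makes it a better fit for credentials than the random module.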

The default installation uses the SQLite version, which is suitable for lightweight, single-instance applications.

If you need to install MySQL, PostgreSQL, and Redis, please uncomment the required services in the docker-compose.yml file, configure the corresponding environment variables, and restart.

Other Commands:

# Check service status
docker compose ps

# View logs
docker compose logs -f

# Restart Service
docker compose down && docker compose up -d

# Update to latest version
docker compose pull && docker compose down && docker compose up -d

After deployment:

Use your modified AUTH_KEY to log in to the management interface.

Method 3: Source Build

Source build requires a locally installed database (SQLite, MySQL, or PostgreSQL) and Redis (optional).

# Clone and build
git clone https://github.com/tbphp/gpt-load.git
cd gpt-load
go mod tidy

# Create configuration
cp .env.example .env

# Edit the .env file and change AUTH_KEY to a strong password. Never use default or simple keys like sk-123456.
# Modify DATABASE_DSN and REDIS_DSN configurations in .env
# REDIS_DSN is optional; if not configured, memory storage will be enabled

# Run
make run

After deployment:

Use your modified AUTH_KEY to log in to the management interface.

Method 4: Cluster Deployment

Cluster deployment requires all nodes to connect to the same MySQL (or PostgreSQL) and Redis, with Redis being mandatory. It's recommended to use unified distributed MySQL and Redis clusters.

Deployment Requirements:

  • All nodes must be configured with identical AUTH_KEY, DATABASE_DSN, and REDIS_DSN values
  • The architecture is leader-follower: every follower node must set the environment variable IS_SLAVE=true
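
The requirements above can be checked before rollout with a small script. The following is a minimal sketch, assuming each node's environment is available as a plain dictionary; the function name check_cluster and the message wording are hypothetical.

```python
# Settings that every node in the cluster must share.
SHARED_KEYS = ("AUTH_KEY", "DATABASE_DSN", "REDIS_DSN")


def check_cluster(leader: dict, followers: list) -> list:
    """Return a list of human-readable problems; an empty list means the check passed."""
    problems = []
    # Redis is mandatory for cluster deployment.
    if not leader.get("REDIS_DSN"):
        problems.append("leader: REDIS_DSN is required for cluster deployment")
    for i, node in enumerate(followers):
        # Each shared setting must match the leader exactly.
        for key in SHARED_KEYS:
            if node.get(key) != leader.get(key):
                problems.append(f"follower {i}: {key} differs from leader")
        # Followers must identify themselves with IS_SLAVE=true.
        if node.get("IS_SLAVE", "false").strip().lower() != "true":
            problems.append(f"follower {i}: IS_SLAVE=true is missing")
    return problems
```

Running such a check against the .env files of all nodes catches mismatched keys or DSNs before they surface as authentication or consistency problems.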

For details, please refer to Cluster Deployment Documentation

Configuration System

Configuration Architecture Overview

GPT-Load adopts a dual-layer configuration architecture:

1. Static Configuration (Environment Variables)

  • Characteristics: Read at application startup, immutable during runtime, requires application restart to take effect
  • Purpose: Infrastructure configuration such as database connections, server ports, authentication keys, etc.
  • Management: Set via .env files or system environment variables

2. Dynamic Configuration (Hot-Reload)

  • System Settings: Stored in database, providing unified behavioral standards for the entire application
  • Group Configuration: Behavior parameters customized for specific groups, can override system settings
  • Configuration Priority: Group Configuration > System Settings > Environment Configuration
  • Characteristics: Supports hot-reload, takes effect immediately after modification without application restart
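
The priority order can be pictured as a layered lookup. This is a minimal sketch of that idea, assuming each layer is a plain dictionary; effective_setting is a hypothetical name and does not reflect GPT-Load's internal implementation.

```python
from typing import Any, Mapping, Optional


def effective_setting(
    name: str,
    group_config: Mapping[str, Any],
    system_settings: Mapping[str, Any],
    env_config: Mapping[str, Any],
    default: Optional[Any] = None,
) -> Any:
    """Resolve a setting using Group Configuration > System Settings > Environment Configuration."""
    for layer in (group_config, system_settings, env_config):
        # The first layer that defines a non-empty value wins.
        if layer.get(name) is not None:
            return layer[name]
    return default


if __name__ == "__main__":
    system = {"request_timeout": 600, "max_retries": 3}
    group = {"max_retries": 5}
    print(effective_setting("max_retries", group, system, {}))      # group override
    print(effective_setting("request_timeout", group, system, {}))  # falls back to system
```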

Static Configuration (Environment Variables)

Server Configuration:

| Setting | Environment Variable | Default | Description |
|---|---|---|---|
| Service Port | PORT | 3001 | HTTP server listening port |
| Service Address | HOST | 0.0.0.0 | HTTP server binding address |
| Read Timeout | SERVER_READ_TIMEOUT | 60 | HTTP server read timeout (seconds) |
| Write Timeout | SERVER_WRITE_TIMEOUT | 600 | HTTP server write timeout (seconds) |
| Idle Timeout | SERVER_IDLE_TIMEOUT | 120 | HTTP connection idle timeout (seconds) |
| Graceful Shutdown Timeout | SERVER_GRACEFUL_SHUTDOWN_TIMEOUT | 10 | Service graceful shutdown wait time (seconds) |
| Follower Mode | IS_SLAVE | false | Follower node identifier for cluster deployment |
| Timezone | TZ | Asia/Shanghai | Specify timezone |
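
To illustrate the read-once behavior of static configuration, the sketch below loads the server settings from an environment mapping and applies the defaults listed in the table. It is an illustration in Python only; ServerConfig and load_server_config are hypothetical names, and GPT-Load's own Go implementation may parse values differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)  # frozen: values cannot change after startup
class ServerConfig:
    port: int
    host: str
    read_timeout: int
    write_timeout: int
    idle_timeout: int
    graceful_shutdown_timeout: int
    is_slave: bool
    tz: str


def load_server_config(env: Mapping[str, str] = os.environ) -> ServerConfig:
    """Read server settings once, falling back to the documented defaults."""
    return ServerConfig(
        port=int(env.get("PORT", "3001")),
        host=env.get("HOST", "0.0.0.0"),
        read_timeout=int(env.get("SERVER_READ_TIMEOUT", "60")),
        write_timeout=int(env.get("SERVER_WRITE_TIMEOUT", "600")),
        idle_timeout=int(env.get("SERVER_IDLE_TIMEOUT", "120")),
        graceful_shutdown_timeout=int(env.get("SERVER_GRACEFUL_SHUTDOWN_TIMEOUT", "10")),
        is_slave=env.get("IS_SLAVE", "false").strip().lower() == "true",
        tz=env.get("TZ", "Asia/Shanghai"),
    )
```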

Security Configuration:

| Setting | Environment Variable | Default | Description |
|---|---|---|---|
| Admin Key | AUTH_KEY | - | Access authentication key for the management end, please change it to a strong password |
| Encryption Key | ENCRYPTION_KEY | - | Encrypts API keys at rest. Supports any string or leave empty to disable encryption. See Data Encryption Migration |

Database Configuration:

| Setting | Environment Variable | Default | Description |
|---|---|---|---|
| Database Connection | DATABASE_DSN | ./data/gpt-load.db | Database connection string (DSN) or file path |
| Redis Connection | REDIS_DSN | - | Redis connection string, uses memory storage when empty |

Performance & CORS Configuration:

| Setting | Environment Variable | Default | Description |
|---|---|---|---|
| Max Concurrent Requests | MAX_CONCURRENT_REQUESTS | 100 | Maximum concurrent requests allowed by system |
| Enable CORS | ENABLE_CORS | false | Whether to enable Cross-Origin Resource Sharing |
| Allowed Origins | ALLOWED_ORIGINS | - | Allowed origins, comma-separated |
| Allowed Methods | ALLOWED_METHODS | GET,POST,PUT,DELETE,OPTIONS | Allowed HTTP methods |
| Allowed Headers | ALLOWED_HEADERS | * | Allowed request headers, comma-separated |
| Allow Credentials | ALLOW_CREDENTIALS | false | Whether to allow sending credentials |

Logging Configuration:

| Setting | Environment Variable | Default | Description |
|---|---|---|---|
| Log Level | LOG_LEVEL | info | Log level: debug, info, warn, error |
| Log Format | LOG_FORMAT | text | Log format: text, json |
| Enable File Logging | LOG_ENABLE_FILE | false | Whether to enable file log output |
| Log File Path | LOG_FILE_PATH | ./data/logs/app.log | Log file storage path |

Proxy Configuration:

GPT-Load automatically reads proxy settings from environment variables to make requests to upstream AI providers.

| Setting | Environment Variable | Default | Description |
|---|---|---|---|
| HTTP Proxy | HTTP_PROXY | - | Proxy server address for HTTP requests |
| HTTPS Proxy | HTTPS_PROXY | - | Proxy server address for HTTPS requests |
| No Proxy | NO_PROXY | - | Comma-separated list of hosts or domains to bypass the proxy |

Supported Proxy Protocol Formats:

  • HTTP: http://user:pass@host:port
  • HTTPS: https://user:pass@host:port
  • SOCKS5: socks5://user:pass@host:port
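
Before setting a proxy variable, it can help to confirm that the value matches one of the formats above. The sketch below does this with Python's standard urllib.parse; parse_proxy_url is a hypothetical helper, and treating the port as optional is an assumption.

```python
from urllib.parse import urlsplit

# Schemes listed under "Supported Proxy Protocol Formats".
SUPPORTED_SCHEMES = {"http", "https", "socks5"}


def parse_proxy_url(url: str) -> dict:
    """Split a proxy URL into its parts, rejecting unsupported schemes."""
    parts = urlsplit(url)
    if parts.scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported proxy scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("proxy URL has no host")
    return {
        "scheme": parts.scheme,
        "username": parts.username,  # None when no credentials are given
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,          # None when the port is omitted
    }
```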

Dynamic Configuration (Hot-Reload)

Basic Settings:

| Setting | Field Name | Default | Group Override | Description |
|---|---|---|---|---|
| Project URL | app_url | http://localhost:3001 | | Project base URL |
| Global Proxy Keys | proxy_keys | Initial value from AUTH_KEY | | Globally effective proxy keys, comma-separated |
| Log Retention Days | request_log_retention_days | 7 | | Request log retention days, 0 for no cleanup |
| Log Write Interval | request_log_write_interval_minutes | 1 | | Log write to database cycle (minutes) |
| Enable Request Body Logging | enable_request_body_logging | false | | Whether to log complete request body content in request logs |

Request Settings:

| Setting | Field Name | Default | Group Override | Description |
|---|---|---|---|---|
| Request Timeout | request_timeout | 600 | | Forward request complete lifecycle timeout (seconds) |
| Connection Timeout | connect_timeout | 15 | | Timeout for establishing connection with upstream service (seconds) |
| Idle Connection Timeout | idle_conn_timeout | 120 | | HTTP client idle connection timeout (seconds) |
| Response Header Timeout | response_header_timeout | 600 | | Timeout for waiting upstream response headers (seconds) |
| Max Idle Connections | max_idle_conns | 100 | | Connection pool maximum total idle connections |
| Max Idle Connections Per Host | max_idle_conns_per_host | 50 | | Maximum idle connections per upstream host |
| Proxy URL | proxy_url | - | | HTTP/HTTPS proxy for forwarding requests, uses environment if empty |

Key Configuration:

| Setting | Field Name | Default | Group Override | Description |
|---|---|---|---|---|
| Max Retries | max_retries | 3 | | Maximum number of retries, each with a different key, for a single request |
| Blacklist Threshold | blacklist_threshold | 3 | | Number of cumulative failures after which a key is blacklisted |
| Key Validation Interval | key_validation_interval_minutes | 60 | | Background scheduled key validation cycle (minutes) |
| Key Validation Concurrency | key_validation_concurrency | 10 | | Concurrency for background validation of invalid keys |
| Key Validation Timeout | key_validation_timeout_seconds | 20 | | API request timeout for validating individual keys in background (seconds) |
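
The interaction between max_retries and blacklist_threshold can be sketched as follows. This is a simplified model and not GPT-Load's actual algorithm: KeyPool, UpstreamError, and send_with_retries are hypothetical names, the total number of attempts is assumed to be 1 + max_retries, and the background validation that restores blacklisted keys is left out.

```python
from collections import defaultdict


class UpstreamError(Exception):
    """Raised by the sender when the upstream rejects a key."""


class KeyPool:
    def __init__(self, keys, blacklist_threshold=3):
        self.active = list(keys)
        self.blacklisted = []
        self.failures = defaultdict(int)  # cumulative failures per key
        self.threshold = blacklist_threshold

    def report_failure(self, key):
        self.failures[key] += 1
        # Move the key to the blacklist once it reaches the threshold.
        if self.failures[key] >= self.threshold and key in self.active:
            self.active.remove(key)
            self.blacklisted.append(key)


def send_with_retries(pool, send, max_retries=3):
    """Try up to 1 + max_retries distinct keys; return the first successful result."""
    tried = set()
    for _ in range(max_retries + 1):
        candidates = [k for k in pool.active if k not in tried]
        if not candidates:
            break
        key = candidates[0]
        tried.add(key)
        try:
            return send(key)
        except UpstreamError:
            pool.report_failure(key)
    raise UpstreamError("all attempted keys failed")
```

With blacklist_threshold=3, a key that fails on three separate requests is removed from rotation, while a single request never uses the same key twice.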

Data Encryption Migration

GPT-Load supports encrypted storage of API keys. You can enable, disable, or change the encryption key at any time.

Migration Scenarios

  • Enable Encryption: Encrypt plaintext data for storage - Use --to <new-key>
  • Disable Encryption: Decrypt encrypted data to plaintext - Use --from <current-key>
  • Change Encryption Key: Replace the encryption key - Use --from <current-key> --to <new-key>
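
The mapping from scenario to flags can be expressed as a small helper that builds the migrate-keys argument list. This is a sketch for scripting the step; migrate_keys_args is a hypothetical function, and only the --from and --to flags named above are used.

```python
from typing import List, Optional


def migrate_keys_args(from_key: Optional[str] = None, to_key: Optional[str] = None) -> List[str]:
    """Build the migrate-keys arguments for the enable, disable, or change scenario."""
    if not from_key and not to_key:
        raise ValueError("provide --from, --to, or both")
    args = ["migrate-keys"]
    if from_key:
        # Present when data is currently encrypted (disable or change).
        args += ["--from", from_key]
    if to_key:
        # Present when data should end up encrypted (enable or change).
        args += ["--to", to_key]
    return args
```

For a Docker Compose deployment, the returned list could be appended to ["docker", "compose", "run", "--rm", "gpt-load"] and passed to subprocess.run, which avoids shell quoting issues with keys that contain special characters.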

Operation Steps

Docker Compose Deployment

# 1. Update the image (ensure using the latest version)
docker compose pull

# 2. Stop the service
docker compose down

# 3. Backup the database (strongly recommended)
# Before migration, you must manually backup the database or export your keys to avoid key loss due to operations or exceptions.

# 4. Execute migration command
# Enable encryption (your-32-char-secret-key is your key, recommend using 32+ character random string)
docker compose run --rm gpt-load migrate-keys --to "your-32-char-secret-key"

# Disable encryption
docker compose run --rm gpt-load migrate-keys --from "your-current-key"

# Change encryption key
docker compose run --rm gpt-load migrate-keys --from "old-key" --to "new-32-char-secret-key"

# 5. Update configuration file
# Edit .env file, set ENCRYPTION_KEY to match the --to parameter
# If disabling encryption, remove ENCRYPTION_KEY or set it to empty
vim .env
# Add or modify: ENCRYPTION_KEY=your-32-char-secret-key

# 6. Restart the service
docker compose up -d

Source Build Deployment

# 1. Stop the service
# Stop the running service process (Ctrl+C or kill process)

# 2. Backup the database (strongly recommended)
# Before migration, you must manually backup the database or export your keys to avoid key loss due to operations or exceptions.

# 3. Execute migration command
# Enable encryption
make migrate-keys ARGS="--to your-32-char-secret-key"

# Disable encryption
make migrate-keys ARGS="--from your-current-key"

# Change encryption key
make migrate-keys ARGS="--from old-key --to new-32-char-secret-key"

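# 4. Update configuration file
# Edit .env file, set ENCRYPTION_KEY to match the --to parameter
# If disabling encryption, remove ENCRYPTION_KEY or set it to empty
# If .env already contains an ENCRYPTION_KEY line, edit that line instead of appending a second one
echo "ENCRYPTION_KEY=your-32-char-secret-key" >> .env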

# 5. Restart the service
make run

Important Notes

⚠️ Important Reminders:

  • Once ENCRYPTION_KEY is lost, encrypted data CANNOT be recovered! Please securely backup this key. Consider using a password manager or secure key management system
  • Service must be stopped before migration to avoid data inconsistency
  • Strongly recommended to backup the database in case migration fails and recovery is needed
  • Keys should use 32 characters or longer random strings for security
  • Ensure ENCRYPTION_KEY in .env matches the --to parameter after migration
  • If disabling encryption, remove or clear the ENCRYPTION_KEY configuration

Key Generation Examples

# Generate secure random key (32 characters)
openssl rand -base64 32 | tr -d "=+/" | cut -c1-32

Web Management Interface

Access the management console at: http://localhost:3001 (default address)

Interface Overview

The web management interface provides the following features:

  • Dashboard: Real-time statistics and system status overview
  • Key Management: Create and configure AI service provider groups; add, delete, and monitor API keys
  • Request Logs: Detailed request history and debugging information
  • System Settings: Global configuration management and hot-reload

API Usage Guide

Proxy Interface Invocation

GPT-Load routes requests to different AI services through group names. Usage is as follows:

1. Proxy Endpoint Format

http://localhost:3001/proxy/{group_name}/{original_api_path}

  • {group_name}: Group name created in the management interface
  • {original_api_path}: The original API path of the AI service, kept exactly as the provider defines it
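
The endpoint format amounts to swapping the provider's origin for the proxy base plus the group name. Here is a minimal sketch using Python's standard urllib.parse; to_proxy_url is a hypothetical helper, and the default base assumes the local address used throughout this document.

```python
from urllib.parse import urlsplit, urlunsplit


def to_proxy_url(original_url: str, group_name: str,
                 proxy_base: str = "http://localhost:3001") -> str:
    """Rewrite a provider URL to {proxy_base}/proxy/{group_name}/{original_api_path}."""
    src = urlsplit(original_url)
    base = urlsplit(proxy_base)
    # Keep the original path untouched; only the origin changes.
    path = f"{base.path.rstrip('/')}/proxy/{group_name}{src.path}"
    # The query string is preserved as-is; any key parameter must still be
    # replaced with a proxy key by the caller.
    return urlunsplit((base.scheme, base.netloc, path, src.query, ""))


if __name__ == "__main__":
    print(to_proxy_url("https://api.openai.com/v1/chat/completions", "openai"))
```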

2. Authentication Methods

Configure Proxy Keys in the web management interface, which supports system-level and group-level proxy keys.

  • Authentication Method: Consistent with the native API, but replace the original key with the configured proxy key.
  • Key Scope: Global Proxy Keys configured in system settings can be used in all groups. Group Proxy Keys configured in a group are only valid for the current group.
  • Format: Multiple keys are separated by commas.
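
The key scope rules can be modeled as a check against two comma-separated lists. The sketch below is an illustration of those rules only; parse_keys and is_authorized are hypothetical names, and the constant-time comparison is a general hardening choice, not a documented GPT-Load behavior.

```python
import hmac
from typing import List


def parse_keys(raw: str) -> List[str]:
    """Split a comma-separated key list, ignoring blanks and surrounding spaces."""
    return [k.strip() for k in raw.split(",") if k.strip()]


def is_authorized(presented: str, global_keys: str, group_keys: str) -> bool:
    """A key is valid if it is a global proxy key or a proxy key of the requested group."""
    allowed = parse_keys(global_keys) + parse_keys(group_keys)
    # compare_digest avoids leaking information through comparison timing.
    return any(hmac.compare_digest(presented.encode(), k.encode()) for k in allowed)
```

A group key checked against a different group's list returns False, which mirrors the rule that group proxy keys are valid only for their own group.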

3. OpenAI Interface Example

GPT-Load currently supports two OpenAI-compatible group types:

  • openai (OpenAI Chat Completions format)
  • openai-response (OpenAI Responses format)

Assuming a group named openai was created:

Original invocation:

curl -X POST https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer sk-your-openai-key" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4.1-mini", "messages": [{"role": "user", "content": "Hello"}]}'

Proxy invocation:

curl -X POST http://localhost:3001/proxy/openai/v1/chat/completions \
  -H "Authorization: Bearer your-proxy-key" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4.1-mini", "messages": [{"role": "user", "content": "Hello"}]}'

Changes required:

  • Replace https://api.openai.com with http://localhost:3001/proxy/openai
  • Replace original API Key with the Proxy Key

OpenAI Responses format example (openai-response group):

curl -X POST http://localhost:3001/proxy/openai-response/v1/responses \
  -H "Authorization: Bearer your-proxy-key" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4.1-mini", "input": "Hello"}'

4. Gemini Interface Example

Assuming a group named gemini was created:

Original invocation:

curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-pro:generateContent?key=your-gemini-key" \
  -H "Content-Type: application/json" \
  -d '{"contents": [{"parts": [{"text": "Hello"}]}]}'

Proxy invocation:

curl -X POST "http://localhost:3001/proxy/gemini/v1beta/models/gemini-2.5-pro:generateContent?key=your-proxy-key" \
  -H "Content-Type: application/json" \
  -d '{"contents": [{"parts": [{"text": "Hello"}]}]}'

Changes required:

  • Replace https://generativelanguage.googleapis.com with http://localhost:3001/proxy/gemini
  • Replace key=your-gemini-key in URL parameter with the Proxy Key

5. Anthropic Interface Example

Assuming a group named anthropic was created:

Original invocation:

curl -X POST https://api.anthropic.com/v1/messages \
  -H "x-api-key: sk-ant-api03-your-anthropic-key" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{"model": "claude-sonnet-4-20250514", "max_tokens": 1024, "messages": [{"role": "user", "content": "Hello"}]}'

Proxy invocation:

curl -X POST http://localhost:3001/proxy/anthropic/v1/messages \
  -H "x-api-key: your-proxy-key" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{"model": "claude-sonnet-4-20250514", "max_tokens": 1024, "messages": [{"role": "user", "content": "Hello"}]}'

Changes required:

  • Replace https://api.anthropic.com with http://localhost:3001/proxy/anthropic
  • Replace the original API Key in x-api-key header with the Proxy Key
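
The three examples above differ only in where the proxy key is placed. A client that talks to several groups could centralize that choice as sketched below; proxy_auth is a hypothetical helper, and the anthropic-version value is copied from the example in this section.

```python
from typing import Dict, Tuple


def proxy_auth(channel_type: str, proxy_key: str) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (headers, query_params) that carry the proxy key in the format's native location."""
    if channel_type in ("openai", "openai-response"):
        # OpenAI formats use a Bearer token in the Authorization header.
        return {"Authorization": f"Bearer {proxy_key}"}, {}
    if channel_type == "gemini":
        # Gemini passes the key as a URL query parameter.
        return {}, {"key": proxy_key}
    if channel_type == "anthropic":
        # Anthropic uses the x-api-key header plus a version header.
        return {"x-api-key": proxy_key, "anthropic-version": "2023-06-01"}, {}
    raise ValueError(f"unknown channel type: {channel_type!r}")
```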

6. Supported Interfaces

OpenAI Chat Completions Format (openai):

  • /v1/chat/completions - Chat conversations
  • /v1/completions - Text completion
  • /v1/embeddings - Text embeddings
  • /v1/models - Model list
  • And all other OpenAI-compatible interfaces

OpenAI Responses Format (openai-response):

  • /v1/responses - Unified response generation
  • /v1/models - Model list
  • And all other OpenAI Responses-compatible interfaces

Gemini Format:

  • /v1beta/models/*/generateContent - Content generation
  • /v1beta/models - Model list
  • And all other Gemini native interfaces

Anthropic Format:

  • /v1/messages - Message conversations
  • /v1/models - Model list (if available)
  • And all other Anthropic native interfaces

7. Client SDK Configuration

OpenAI Python SDK:

from openai import OpenAI

client = OpenAI(
    api_key="your-proxy-key",  # Use the proxy key
    base_url="http://localhost:3001/proxy/openai"  # Use proxy endpoint
)

response = client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=[{"role": "user", "content": "Hello"}]
)

Google Gemini SDK (Python):

import google.generativeai as genai

# Configure API key and base URL
genai.configure(
    api_key="your-proxy-key",  # Use the proxy key
    client_options={"api_endpoint": "http://localhost:3001/proxy/gemini"}
)

model = genai.GenerativeModel('gemini-2.5-pro')
response = model.generate_content("Hello")

Anthropic SDK (Python):

from anthropic import Anthropic

client = Anthropic(
    api_key="your-proxy-key",  # Use the proxy key
    base_url="http://localhost:3001/proxy/anthropic"  # Use proxy endpoint
)

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,  # Required by the Anthropic Messages API
    messages=[{"role": "user", "content": "Hello"}]
)

Important Note: As a transparent proxy service, GPT-Load completely preserves the native API formats and authentication methods of various AI services. You only need to replace the endpoint address and use the Proxy Key configured in the management interface for seamless migration.

Related Projects

  • New API - Excellent AI model aggregation management and distribution system

Contributing

Thanks to all the developers who have contributed to GPT-Load!

Supporters

  • Thank you very much for the support from the LINUX DO community!

  • This project is supported by DigitalOcean.

License

MIT License - see LICENSE file for details.
