Envoy AI Gateway
December 18, 2025
Envoy AI Gateway is an open source project for using Envoy Gateway to handle request traffic from application clients to Generative AI services.
Usage
Envoy AI Gateway is designed around what we call a two-tier gateway pattern. The Tier One Gateway functions as a centralized entry point, and the Tier Two Gateway handles ingress traffic to a self-hosted model serving cluster.
- The Tier One Gateway handles authentication, top-level routing, and global rate limiting.
- The Tier Two Gateway provides fine-grained control over self-hosted model access, with endpoint picker support for LLM inference optimization.
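
The division of responsibilities between the two tiers can be sketched as a toy model in Python. This is only a conceptual illustration, not how Envoy AI Gateway is implemented or configured: the class names, the API-key check, the request counter, and the least-loaded endpoint choice are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class TierTwoGateway:
    """Toy stand-in for the gateway in front of a self-hosted model cluster."""

    # Hypothetical map of endpoint name -> number of in-flight requests.
    endpoints: dict[str, int]

    def pick_endpoint(self) -> str:
        # Simplistic "endpoint picker": choose the least-loaded endpoint.
        return min(self.endpoints, key=self.endpoints.get)

    def handle(self, model: str) -> str:
        endpoint = self.pick_endpoint()
        self.endpoints[endpoint] += 1
        return f"{model} served by {endpoint}"


@dataclass
class TierOneGateway:
    """Toy stand-in for the centralized entry point."""

    api_keys: set[str]                 # hypothetical credential store
    routes: dict[str, TierTwoGateway]  # model name -> Tier Two Gateway
    global_limit: int                  # max requests admitted in total
    seen: int = 0

    def handle(self, api_key: str, model: str) -> str:
        # 1. Authentication.
        if api_key not in self.api_keys:
            return "401 unauthorized"
        # 2. Global rate limiting (a plain counter, for illustration only).
        if self.seen >= self.global_limit:
            return "429 rate limited"
        self.seen += 1
        # 3. Top-level routing to the matching Tier Two Gateway.
        backend = self.routes.get(model)
        if backend is None:
            return "404 no route"
        return backend.handle(model)


if __name__ == "__main__":
    tier_two = TierTwoGateway(endpoints={"pod-a": 2, "pod-b": 0})
    tier_one = TierOneGateway(
        api_keys={"demo-key"}, routes={"llama": tier_two}, global_limit=2
    )
    for _ in range(3):
        print(tier_one.handle("demo-key", "llama"))
```

In this sketch the first tier rejects a request before it reaches any model cluster, while the second tier only decides which serving endpoint receives an admitted request.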

Supported AI Providers
Envoy AI Gateway supports a wide range of AI providers, making it easy to integrate with your preferred LLM services:
- OpenAI
- Azure OpenAI
- Google Gemini
- Vertex AI
- AWS Bedrock
- Mistral
- Cohere
- Groq
- Together AI
- DeepInfra
- DeepSeek
- Hunyuan
- SambaNova
- Grok
- Tetrate Agent Router Service
- Anthropic
Documentation
- Blog introducing Envoy AI Gateway.
- Documentation for Envoy AI Gateway.
- Quickstart to use Envoy AI Gateway in a few simple steps.
- Concepts to understand the architecture and resources of Envoy AI Gateway.
- Talks and Presentations about Envoy AI Gateway.
Contact
- Slack: Join the Envoy Slack workspace if you're not already a member, then use the Envoy AI Gateway channel to start collaborating with the community.
Get Involved
We adhere to the CNCF Code of Conduct.
The Envoy AI Gateway team and community members meet every Monday. Please register for the meeting, add agenda points, and get involved. The meeting details are available in the public document.
To contribute to the project via pull requests, please read the CONTRIBUTING.md file, which includes information on how to build and test the project.
Background
This project was inspired by the proposal to use Envoy Gateway as a Cloud Native LLM Gateway.