๐ OmniRoute
July 2, 2026 ยท View on GitHub
๐ OmniRoute โ The Free AI Gateway
Never stop coding. Connect every AI tool to 237 providers โ 90+ free โ through one endpoint.
Plug Claude Code, Codex, Cursor, Cline, Copilot & Antigravity into FREE Claude / GPT / Gemini. Auto-fallback.
RTK + Caveman compression saves 15โ95% tokens. Never hit limits.
~1.6B documented free tokens/month โ up to ~2.1B in your first month with signup credits โ aggregated across the free tiers, plus a long tail of permanently-free, no-cap providers, and the compression above stretches every one further. (how we count โ)
โญ Star the repo if OMNIROUTE helped you save money and make your work easier.
๐ฌ Join the community
Questions, provider tips, roadmap & support โ Discord ยท Telegram ยท WhatsApp ๐ Global / ๐ง๐ท Brasil
๐งฉ Available
๐ Quick Start โข ๐ฏ Combos โข ๐ Providers โข ๐ CLI & MCP โข ๐๏ธ Compression โข ๐ Website
๐ฅ The Promise โข ๐ค Why โข ๐ What Sets Apart โข ๐ค Compatible CLIs โข ๐ฅ๏ธ Where It Runs โข ๐ Private โข ๐ฌ In Action โข ๐ Explore More โข ๐ง Support
Stacking free tiers by hand is painful โ dozens of SDKs, dozens of rate limits, and no idea how much you actually have. OmniRoute aggregates the documented free tiers of 40+ provider pools / 500+ models into one honest number and shows it live on the dashboard (
/dashboard/free-tiers).
- ~1.6B free tokens / month (steady) โ and up to ~2.1B in your first month with signup credits.
- Pool-deduped, honest โ we count each shared free pool once, so the headline isn't inflated by rate-limit ceilings the way multi-billion competitor claims are. (Counting every rate limit 24/7 would read ~10B; we don't publish that.)
- Plus the un-countable โ permanently-free, no-token-cap providers (SiliconFlow, Z.AI GLM-Flash, Kilo, OpenCode Zenโฆ) and a $10 OpenRouter top-up that unlocks +24M/mo, both surfaced separately so they never inflate the headline.
- Per-model breakdown, live used / remaining for the current month, and a transparent terms flag per provider.
Preview mockup โ a real screenshot lands once the
/dashboard/free-tierspage is validated. Full methodology (pool dedupe, credit tiers, provider terms): docs/reference/FREE_TIERS.md.
One endpoint. 237 providers. Never stop building โ and let OmniRoute pick the cheapest one that works.
| ๐ซ Never hit limits Auto-fallback across 237 providers in milliseconds. Quota out? Next provider takes over โ zero downtime. |
๐ธ Save up to 95% tokens RTK + Caveman stacked compression cuts 15โ95% of eligible tokens (~89% avg on tool-heavy sessions). |
๐ \$0 to start 90+ providers with a free tier, 11 free forever (Kiro, Qoder, Pollinations, LongCatโฆ). No card needed. |
| ๐ Every tool works 24+ coding agents โ Claude Code, Codex, Cursor, Cline, Copilot, Antigravity โ through one config. |
๐งฉ One endpoint OpenAI โ Claude โ Gemini โ Responses API translation. Point any tool at /v1 and it just works. |
๐ก๏ธ Production-grade Circuit breakers, TLS stealth, MCP (95 tools), A2A, memory, guardrails, evals. 21,000+ tests. |
Stop juggling 10 dashboards, dead API keys, and surprise bills.
| โ The daily pain | โ How OmniRoute fixes it |
|---|---|
| ๐ Subscription quota expires unused every month | Maximize subscriptions โ track quota, use every token before reset |
| ๐ Rate limits stop you mid-coding | 4-tier auto-fallback โ Subscription โ API โ Cheap โ Free, in milliseconds |
๐ฅ Tool outputs (git diff, grep, logs) burn tokens | RTK + Caveman compression โ save 15โ95% eligible tokens per request |
| ๐ธ Expensive APIs ($20โ50/mo per provider) | Cost-optimized routing โ auto-route to the cheapest viable model |
| ๐งฐ Each AI tool wants its own setup | One endpoint, every tool, one dashboard |
| ๐ AI blocked in your country | 3-level proxy + TLS fingerprint stealth โ use AI from anywhere |
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ Your IDE / CLI (Claude Code, Cursor, Clineโฆ) โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโฌโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ http://localhost:20128/v1
โผ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ OmniRoute โ Smart Router โ
โ RTK + Caveman compression ยท 17 routing strategies โ
โ Circuit breakers ยท TLS stealth ยท MCP ยท A2A ยท Guardrails โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโฌโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โโโโโโโโโโโโโโโฌโโโโโดโโโโโโโโโฌโโโโโโโโโโโโโโ
โผ Tier 1 โผ Tier 2 โผ Tier 3 โผ Tier 4
SUBSCRIPTION API KEY CHEAP FREE
Claude Code, DeepSeek, GLM \$0.5, Kiro, Qoder,
Codex, Copilot Groq, xAI MiniMax \$0.2 Pollinations
quota out? โโโโถ budget hit? โโถ budget hit? โโถ always on
A combo is a chain of models OmniRoute routes across automatically. Quota runs out, a provider fails, or costs spike โ the combo silently slides to the next model. This is what makes OmniRoute unbreakable. ๐ก๏ธ
โก Zero-config โ just use auto
No combo to create. Set your model to auto (or a variant) and OmniRoute builds a virtual combo from your connected providers, scored live:
| Model ID | What it optimizes for |
|---|---|
auto | ๐ฏ Balanced default (LKGP โ sticks to your last good provider) |
auto/coding | ๐งโ๐ป Quality-first weights for code generation |
auto/fast | โก Lowest latency first |
auto/cheap | ๐ฐ Cheapest per token first |
auto/offline | ๐ Most quota / rate-limit headroom first |
auto/smart | ๐ญ Quality-first + 10% exploration to discover better models |
๐ Or build your own โ 17 routing strategies
All 17 strategies โ mix & match per combo step:
| # | Strategy | What it does |
|---|---|---|
| 1 | priority | First-target ordered list โ drain each before the next ๐ฅ |
| 2 | fill-first | Fill each target's quota fully before moving on |
| 3 | weighted | Weighted random by per-target weight |
| 4 | round-robin | Cycle through targets in order |
| 5 | p2c | Power-of-two-choices random load balancing |
| 6 | least-used | Pick the target with the lowest current load |
| 7 | random | Uniform random pick (deduplicated) |
| 8 | strict-random | Random without de-duplicating repeats ๐ฒ |
| 9 | cost-optimized | Minimize $ per request from live catalog pricing ๐ธ |
| 10 | headroom | Pick the target with the most remaining quota |
| 11 | reset-window | Prefer the target whose quota window resets soonest |
| 12 | reset-aware | Rank by quota reset time โ short windows first ๐ |
| 13 | context-relay | Hand off context across targets for long conversations ๐ง |
| 14 | context-optimized | Pick the best fit for the current context size |
| 15 | lkgp | Last-Known-Good Path โ sticky to the last successful target |
| 16 | auto | 9-factor live scoring across every connection ๐ค |
| 17 | fusion | Fan out to a panel of models + a judge synthesizes one answer ๐งฌ |
The Auto-Combo engine scores every candidate on 9 factors (health, quota, cost, latency, success rate, freshnessโฆ) โ see docs/routing/AUTO-COMBO.md.
โ๏ธ Quota-Share โ split one subscription across a team โจ NEW
Running several keys against the same upstream account (one Codex Pro plan, one Kimi key, one GLM Coding seat)? A burst on one key can burn the whole 5-hour / hourly quota and lock everyone else out. Quota-Share distributes a provider's time-based quota fairly across the keys in a pool โ and it's work-conserving, so an idle member's slice is lent out instead of wasted.
| Knob | What it controls |
|---|---|
| โ๏ธ Allocation weight | each key's slice of the pool โ e.g. 50 / 30 / 20 |
| ๐ Dimensions | track % ยท requests ยท tokens ยท $, per 5h / 7d / per-model window |
| ๐ฆ Policy | hard (block over share) ยท soft (deprioritize) ยท burst (use idle headroom) |
| ๐งฑ Cap | absolute ceiling per key, independent of mode |
Pool "team-codex" ยท 1 Codex Pro account ยท 3 keys ยท 5-hour window
โโ alice weight 50 โโโโโโโโโโโโโโโโโโโโ โค 50% of the shared 5h quota
โโ bob weight 30 โโโโโโโโโโโโโโโโโโโโ โค 30%
โโ ci-bot weight 20 โโโโโโโโโโโโโโโโโโโโ โค 20%
Generous mode (<50% pool used) โ idle shares are lent out
Strict mode (โฅ50% pool used) โ each key held to its fair share
Enforced in the hot path before the request leaves OmniRoute, with per-(key, model) caps + session stickiness for prompt-cache integrity. ๐ Quota Sharing Engine
๐งฑ Resilience is built in (3 independent layers)
| Layer | Scope | What it does |
|---|---|---|
| ๐ Circuit breaker | whole provider | Stops hammering a provider that's failing upstream; auto-probes to recover |
| ๐ค Connection cooldown | one account / key | Skips a rate-limited key while other keys keep serving |
| ๐ฏ Model lockout | provider + model | Quarantines just one quota-limited model, not the whole connection |
Combo: "always-on" Strategy: priority
1. cc/claude-opus-4-7 โ subscription (use it fully)
2. cx/gpt-5.5 โ second subscription
3. glm/glm-5.1 โ cheap backup (\$0.5/1M)
4. kr/claude-sonnet-4.5 โ FREE, unlimited (never fails)
Result: 4 layers of fallback = zero downtime
๐ Auto-Combo Engine ยท Resilience Guide
| Feature | OmniRoute | Other routers |
|---|---|---|
| ๐ Providers | 237 | 20โ100 |
| ๐ Free providers | 90+ (11 free forever) | 1โ5 |
| ๐ Routing strategies | 17 (priority, weighted, cost-optimized, context-relay, fusionโฆ) | 1โ3 |
| ๐๏ธ Token compression | RTK + Caveman stacked (15โ95%) | None / 20โ40% |
| ๐งฐ Built-in MCP server | 95 tools, 3 transports, 30 scopes | Rare |
| ๐ค A2A agent protocol | 6 skills, JSON-RPC 2.0 | None |
| ๐ง Memory (FTS5 + vector) | Yes | Rare |
| ๐ก๏ธ Guardrails (PII, injection, vision) | Yes | Rare |
| โ๏ธ Cloud agents | Codex, Cursor, Devin, Jules | None |
| ๐ฅท TLS fingerprint stealth | JA3/JA4 via wreq-js | None |
| ๐ฅ๏ธ Multi-platform | Web ยท Desktop ยท Termux ยท PWA | Web only |
| ๐ i18n | 42 locales | 0โ4 |
๐ Detailed comparison vs LiteLLM, OpenRouter & Portkey โ docs/comparison/OMNIROUTE_VS_ALTERNATIVES.md
Recent highlights from v3.8.20 โ v3.8.43. Full history in
CHANGELOG.md.
- ๐๏ธ Compression hardening โ a default-on inflation guard (discard the stacked result and send the verbatim original whenever compression would grow the prompt), completed Caveman rule packs for German / French / Japanese (dedup + ultra) plus a new Chinese (ๆ่จ / wรฉnyรกn) input pack with zh-vs-ja auto-detection, and RTK filters for Gradle & .NET (
dotnet) build output. โ Compression - ๐ธ Honest flat-rate cost โ subscription / coding-plan providers (ChatGPT Web, grok-web, the Minimax / Kimi / GLM / Alibaba Coding plans, Xiaomi MiMoโฆ) now read $0 in cost analytics instead of an inflated per-token estimate, while budget / quota / routing keep estimating unchanged. โ API Reference
- โ๏ธ Quota-Share routing โ a dedicated combo strategy that spreads load across accounts by available quota: Deficit-Round-Robin scheduling, per-connection
max_concurrentwith cooldown-wait queueing, multi-window usage buckets (5h / 7d / per-model), per-(key, model) caps, session stickiness for prompt-cache integrity, and proactive saturation from upstream token-usage headers. โ Resilience Guide - ๐ค One-command CLI/agent setup โ a dedicated
setup-*command configures each coding tool to route through OmniRoute (Claude Code, Codex, Cline, Continue, Cursor, Roo Code, Kilo Code, Crush, Goose, Qwen Code, Aider, OpenCode);omniroute launch/omniroute launch-codexare zero-config launchers. โ CLI Integrations - ๐ฐ๏ธ Remote mode โ drive a remote OmniRoute from any machine with scoped access tokens (
omniroute connect/omniroute contexts/omniroute tokens), plus anomniroute login antigravityhelper that runs Google "native/desktop" OAuth on your own machine and pastes a credential blob into a remote/VPS install (where the loopback redirect is unreachable). โ Remote Mode - ๐งญ Smarter auto-routing โ OpenRouter-style
auto/<category>:<tier>combos (e.g.auto/coding:fast,auto/reasoning:pro), a Fusion strategy (fan out to a panel of models in parallel, then synthesize via a judge), task-aware routing (best-fit connection per task type), per-requestX-Route-Modeloverride, live Arena-ELO + models.dev model intelligence, per-step account allowlists, provider-wildcard combo steps, nested combo-ref execution, sticky weighted selection, andweb_search-aware routing. โ Auto-Combo - ๐๏ธ Pluggable compression โ an async pipeline of 10 composable engines with Compression Studios, an LLMLingua-2 ONNX engine and a heuristic/SLM two-tier Ultra, RTK, delegated Anthropic Context Editing, Output Styles (output-axis steering: terse-prose / less-code / terse-CJK), an adaptive context-budget dial (escalate only as far as needed to fit the context window), per-request
x-omniroute-compressioncontrol, an opt-in offline eval harness, one-click Headroom proxy lifecycle management from the dashboard (Docker sidecar supported), a synthetic compression playground (Play lanes + A/B Compare with USD-capped fidelity verdicts), an opt-in per-step fidelity gate that rejects a lossy engine before it degrades the prompt, a best-of-N candidate encoder (GCF vs TOON โ keep whichever is shorter, with an A/B bytes/token table in the studio), CCR ranged/grep/stats retrieval (pull an exact byte/line slice or summary of a stored block instead of re-expanding it), a unified panel with named profiles + an active-profile selector, an opt-in per-engine pipeline circuit-breaker, an opt-in LLM-tier engine (a model pass for higher-ratio semantic compression), a read-lifecycle engine that collapses superseded file reads, usage-observed prefix freeze, a graduated CCR retrieval-feedback ramp, apreserveSystemPromptmode enum, and a drag-reorder pipeline editor in the studio. โ Compression - ๐ต๏ธ Transparent MITM decrypt (TPROXY) โ capture & translate traffic from CLIs that ignore proxy env vars, with a per-SNI certificate authority and a trust-store installer. โ MITM/TPROXY
- ๐ธ Cost telemetry everywhere โ
X-OmniRoute-*cost/usage headers on every endpoint (including media), a non-token cost engine, a cache-HITX-OmniRoute-Cost-Savedheader, and per-key USD spend quotas. โ API Reference - ๐ง Memory you control โ opt-in int8 vector quantization (Qdrant + sqlite-vec), opt-in typed memory decay (aged low-value memories fade on a per-type schedule), memory off by default, and a per-request
x-omniroute-no-memoryheader. โ Memory - ๐ก๏ธ Security โ a prompt-injection guard across every LLM route (backed by a red-team suite), plus a free DuckDuckGo last-resort web search. โ Guardrails
- ๐ค More providers & agents โ Cursor Cloud Agent (a 4th cloud agent), CodeBuddy CN (
copilot.tencent.com), a Google Flow video-generation provider, new gateways DGrid and Pioneer AI (Fastino Labs), inbound xAI Grok translators plus Grok Build (xAI) with an OAuth import-token flow, GPT-4 / GPT-4o-mini on the GitHub Copilot provider, multi-model Factory Droid, ZenMux Free (session-cookie free tier), Alibaba DashScope text-to-video (wan2.7-t2v), a refreshed 237-provider catalog (OrcaRouter, Wafer AI, OpenAdapter, dit.ai, TokenRouter, โฆ), Vertex AI media generation (speech/transcription/music/video), a first-class Ollama local-provider card, the SenseNova free Token Plan (chat + text-to-image), and one-click account import from CLIProxyAPI (~/.cli-proxy-api/). โ Providers - โก Local performance & infra โ a one-click local Redis launcher (
omniroute redis up, plus a dashboard Redis panel), one-click Cloudflare Workers and Deno Deploy relay deployers wired into the proxy pool, and an optional Bifrost Go sidecar that offloads the hottest relay path (BIFROST_BASE_URL, with automatic fallback to the TypeScript path on timeout) โ now with a relay-backend selector (OMNIROUTE_RELAY_BACKEND=ts|bifrost|auto) so the/v1/relayendpoint stays the stable surface while choosing the fastest backend internally. โ Environment
๐ค Compatible CLIs & Coding Agents
One config โ
http://localhost:20128/v1โ and every AI IDE or CLI runs on free & low-cost models.
Claude Code |
Codex CLI |
![]() Cursor |
![]() Copilot |
![]() Continue |
|
OpenCode |
Kilo Code |
Droid |
![]() OpenClaw |
Kiro |
Command |
๐ Per-tool setup for all 24+ tools โ docs/reference/CLI-TOOLS.md ยท ๐งฉ OpenCode plugin โ @omniroute/opencode-provider
The most complete catalog of any open-source router: 237 providers, 90+ with a free tier, 11 free forever.
๐ข Every major lab โ through one endpoint
OpenAI |
Anthropic |
Gemini |
xAI Grok |
DeepSeek |
Mistral |
Qwen |
Meta Llama |
Groq |
NVIDIA |
MiniMax |
Cohere |
Perplexity |
HuggingFace |
Together |
Fireworks |
Cloudflare |
Baidu |
โฆand 220+ more โ every icon resolves live from the dashboard's provider catalog. ๐ Provider Reference
๐ Free Forever โ $0, no card
AgentRouter GPT-5, Claude, Gemini \$100 free credits |
Qoder AI Kimi-K2, DeepSeek-R1 Unlimited FREE |
Pollinations GPT-5, Claude, Llama 4 No key needed |
LongCat LongCat-2.0 10M tokens one-time (KYC) ๐ |
Cloudflare AI 50+ models 10K neurons/day |
NVIDIA NIM 129 models ~40 RPM free |
Cerebras Qwen3 235B 1M tokens/day |
๐ Full machine-readable catalog โ docs/reference/PROVIDER_REFERENCE.md
Same app, your machine, your rules. From a global npm install to your phone via Termux.
| Platform | Install | Highlights |
|---|---|---|
| ๐ฆ npm (global) | npm install -g omniroute | One command, any OS |
| ๐ณ Docker | docker run โฆ diegosouzapw/omniroute | Multi-arch AMD64 + ARM64 |
| ๐ฅ๏ธ Desktop (Electron) | npm run electron:build | Native window + system tray โ Windows / macOS / Linux |
| ๐ช ARM | native arm64 | Raspberry Pi, ARM servers, Apple Silicon |
| ๐ฑ Android (Termux) | pkg install nodejs && npx -y omniroute | Runs on your phone, 24/7, no root |
| ๐ฒ PWA | "Add to Home Screen" | Fullscreen, offline, installable from browser |
| ๐งฉ OpenCode plugin | @omniroute/opencode-provider | Native OpenCode integration |
| ๐ ๏ธ From source | npm install && npm run dev | Hack on it, contribute |
๐ Docker Guide ยท Desktop ยท Termux ยท PWA ยท OpenCode
Your keys, your machine, your data. OmniRoute is a local proxy โ it never phones home.
- ๐ Runs 100% on your hardware โ npm, Docker, desktop, or your phone. No OmniRoute cloud sits in the request path.
- ๐ Credentials encrypted at rest โ API keys & OAuth tokens sealed with AES-256-GCM.
- ๐ซ Zero telemetry by default โ your prompts go only to the providers you choose, nowhere else.
- ๐ก๏ธ Hardened gateway โ API-key scoping, IP filtering, rate limits, prompt-injection guard, loopback-only process routes.
- ๐ MIT licensed & fully open-source โ audit every line, self-host forever.
๐ Authorization ยท Guardrails ยท Compliance
OmniRoute isn't just a server โ it's a full command-line cockpit with 80+ commands, plus open agent protocols so an AI agent can drive OmniRoute by itself.
โจ๏ธ A real CLI (not just start)
omniroute # serve gateway + dashboard (port 20128)
omniroute chat # interactive TUI chat client (slash: /model /combo /skill /memory)
omniroute setup # guided first-run wizard
omniroute doctor # diagnose providers, ports, native deps
๐ฐ๏ธ Remote mode โ run the CLI here, OmniRoute on a VPS
OmniRoute on a server? Drive it from your laptop with the same CLI. Log in once with a scoped access token; every command then targets the remote.
omniroute connect 192.168.0.15 # password โ scoped token, saved as a context
omniroute models list # โ runs against the REMOTE server
omniroute configure codex # โ picks a remote model, writes a local Codex profile
omniroute tokens create --name ci --scope read # mint narrower tokens for other machines
omniroute contexts use default # โ switch back to the local server
Tokens are scoped read / write / admin; process-spawning routes stay loopback-only.
๐ Remote Mode
providers ยท oauth ยท keys ยท combo ยท nodes ยท models ยท cache ยท compression ยท cost ยท usage ยท quota ยท health ยท resilience ยท telemetry ยท logs ยท audit ยท mcp ยท a2a ยท cloud ยท memory ยท skills ยท eval ยท tunnel ยท backup ยท sync ยท webhooks ยท policy ยท pricing ยท translator ยท simulate โฆ
๐ค Connect an agent โ and it controls OmniRoute itself
Expose OmniRoute over MCP or A2A and any capable agent gets the keys to the whole gateway โ routing, providers, combos, cache, compression, memory โ autonomously.
| Protocol | Endpoint | Use it for |
|---|---|---|
| ๐งฐ MCP (stdio) | omniroute --mcp | Plug into Claude Desktop, Cursor, any MCP client |
| ๐ MCP (HTTP) | http://localhost:20128/api/mcp/stream | Remote MCP โ 95 tools, 30 scopes, full audit trail |
| ๐ก MCP (SSE) | http://localhost:20128/api/mcp/sse | Streaming MCP transport |
| ๐ค A2A | http://localhost:20128/.well-known/agent.json | Agent-to-agent, JSON-RPC 2.0 + SSE, 6 skills |
# Give Claude Code the full OmniRoute toolset over MCP:
claude mcp add-server omniroute --type http --url http://localhost:20128/api/mcp/stream
๐ MCP Server ยท A2A Server ยท Agent Protocols
Why use many tokens when few tokens do the trick? Every request passes through OmniRoute's compression pipeline transparently โ no client changes. It's now a stack of 10 composable engines that run in order and mix & match per routing combo โ building on ideas from RTK, Caveman (โญ 78K+), LLMLingua-2, and Troglodita (PT-BR).
๐งฑ The 10-engine stack
Engines run in pipeline order; each is independently toggleable and configurable per combo:
| # | Engine | What it does |
|---|---|---|
| 1 | Session-Dedup | Drops content repeated across turns (content-addressed, cross-turn) |
| 2 | CCR | Archives large blocks behind retrieve markers, fetched on demand |
| 3 | RTK | Smart tool-result filtering, dedup & truncation (command-aware) |
| 4 | Headroom | Lossless tabular compaction of homogeneous JSON arrays (~30%+) |
| 5 | Relevance | Extractive sentence scoring against the last user query |
| 6 | Caveman | Rule-based prose compression (~65โ75% on output) |
| 7 | LLMLingua-2 | ML semantic pruning via MobileBERT ONNX โ code-safe, async |
| 8 | Lite | Whitespace + image-URL trimming (latency-light baseline) |
| 9 | Aggressive | Summarization + progressive aging of old turns |
| 10 | Ultra | Heuristic token pruning with an optional small-model (SLM) tier |
Code blocks, URLs and structured data are always preserved byte-perfect. One-click presets combine the engines:
| Mode | Savings | Best for |
|---|---|---|
| ๐ชถ Lite | ~15% | Always-on safe default |
| ๐ชจ Standard (Caveman) | ~30% | Daily coding |
| โก Aggressive | ~50% | Long tool-heavy sessions |
| ๐ฅ Ultra | ~75% | Maximum savings |
| ๐งฐ RTK | 60โ90% | Shell/test/build/git output |
| ๐ Stacked (RTK โ Caveman) | 78โ95% | Mixed prompts + tool logs |
Real example โ Standard mode:
Before (69 tokens): "The reason your React component is re-rendering is likely because you're creating a new object reference on each render cycle. When you pass an inline object as a prop, React's shallow comparison sees it as a different object every time, which triggers a re-render. I would recommend using useMemo to memoize the object."
After (19 tokens): "New object ref each render. Inline object prop = new ref = re-render. Wrap in useMemo."
Same answer. 72% fewer tokens. Zero accuracy loss. โ
PT-BR example โ Troglodita mode:
Antes (42 tokens): "O problema รฉ que o componente estรก re-renderizando porque uma nova referรชncia de objeto estรก sendo criada em cada ciclo de renderizaรงรฃo. Eu recomendaria usar useMemo."
Depois (12 tokens): "Re-render: ref nova cada ciclo (objeto inline recriado). Usar
useMemo."Mesma resposta. ~70% menos tokens. Precisรฃo tรฉcnica intacta. โ
๐ How it works โ pipeline, architecture & savings math
Client (10,000 tok) โโโถ OmniRoute Compression (10 engines) โโโถ Provider (~1,080 tok, up to 95% saved)
Default stacked combo runs RTK โ Caveman. When both act on the same tool/context payload, savings compound:
combined = 1 โ (1 โ RTK) ร (1 โ Caveman_input)
average = 1 โ (1 โ 0.80) ร (1 โ 0.46) = 89.2%
range = 78.4 โ 94.6%
Code blocks, URLs, JSON and structured data are always protected by the preservation engine.
๐๏ธ Beyond the engines โ output styles, the adaptive dial & per-request control
The 10 engines above shrink what goes in. Three more layers shape how, when, and what comes out:
- ๐ช Output Styles (output-axis steering) โ inject deterministic, cache-safe response-shaping instructions; combinable, each at
lite/full/ultraintensity. Adding a style is a one-line registry entry:- Terse prose โ drop filler / articles / hedging; keep technical substance exact.
- Less code โ "lazy senior dev" YAGNI: smallest working change, no unrequested scaffolding.
- Terse CJK (ๆ่จ) โ classical-Chinese ultra-terse style (locale-gated to
zh).
- ๐ฏ Adaptive context-budget (the dial) โ instead of one on/off token threshold, escalate the cheapest, most-lossless engines only as far as needed to fit the model's context window. Policy:
reserve-output(default, model-aware) ยทpercentageยทabsolute. Mode:floor(guarantee fit) ยทreplace-autotrigger(your explicit choice wins) ยทoff(legacy threshold). - ๐๏ธ Where compression is decided (precedence, high โ low) โ per-request
x-omniroute-compressionheader โบ routing-combo override โบ active named profile โบ adaptive / auto-trigger โบ panel default โบ off. The applied plan echoes back in theX-OmniRoute-Compression: <mode>; source=<source>response header.
Auto-trigger by token threshold, flip on the adaptive dial, pin a named profile, set a one-off per request, or assign a pipeline per routing combo โ whichever fits the workload. An opt-in offline eval harness (npm run eval:compression) scores fidelity vs. savings on a pinned corpus before you promote a change.
๐ COMPRESSION_GUIDE.md ยท RTK_COMPRESSION.md ยท COMPRESSION_ENGINES.md
1) Install & run
npm install -g omniroute
omniroute
Dashboard at http://localhost:20128 ยท API at http://localhost:20128/v1.
2) Connect a FREE provider (no signup)
Dashboard โ Providers โ connect Kiro AI (free Claude, ~50 credits/month per account) or OpenCode Free (no auth) โ done.
3) Point your coding tool
Base URL: http://localhost:20128/v1
API Key: [copy from Dashboard โ Endpoints]
Model: auto (zero-config smart routing โ or any provider/model)
4) Verify it's working
curl http://localhost:20128/v1/models -H "Authorization: Bearer YOUR_KEY"
You should see your connected models listed. ๐ That's it โ start coding, and OmniRoute auto-routes & falls back for you.
If your client cannot send custom headers, OmniRoute also exposes tokenized compatibility aliases:
OpenAI catalog: http://localhost:20128/vscode/YOUR_KEY/
OpenAI models: http://localhost:20128/vscode/YOUR_KEY/models
OpenAI chat: http://localhost:20128/vscode/YOUR_KEY/chat/completions
OpenAI responses: http://localhost:20128/vscode/YOUR_KEY/responses
Ollama chat: http://localhost:20128/vscode/YOUR_KEY/api/chat
Ollama tags: http://localhost:20128/vscode/YOUR_KEY/api/tags
Use these only for clients that cannot attach Authorization: Bearer .... Header auth remains the preferred mode.
๐ฆ More install methods โ Docker, source, pnpm, Arch
๐ณ Docker
docker run -d --name omniroute --restart unless-stopped --stop-timeout 40 \
-p 20128:20128 -v omniroute-data:/app/data diegosouzapw/omniroute:latest
๐ ๏ธ From source
cp .env.example .env && npm install
PORT=20128 npm run dev
๐ฆ pnpm
pnpm add -g omniroute@latest --allow-build=better-sqlite3 --allow-build=@swc/core && omniroute
๐ง Arch Linux (AUR)
yay -S omniroute-bin && systemctl --user enable --now omniroute.service
๐ง Nix (Flake)
# Using Nix flakes
nix develop
npm run dev
# Or using devbox
devbox run npm run dev
๐ Docker Guide โ Compose profiles, Caddy HTTPS, Cloudflare tunnels.
๐ฆญ Podman
# 1. Build the image
podman build --target runner-base -t omniroute:base .
# 2. Fix data directory permissions for rootless Podman
mkdir -p data && podman unshare chown 1000:1000 ./data
# 3. Set runtime in .env, then run (see contrib/podman/ for Quadlet)
echo "CONTAINER_HOST=podman" >> .env
podman compose --profile base up -d
๐ Podman Guide โ Quadlet setup, podman-compose, Quadlet.
โก Faster / leaner install (skip the native build)
The native SQLite engine (better-sqlite3) is an optional dependency, so a global
install never blocks on compiling from source: it uses a prebuilt binary when one matches
your platform/Node, and otherwise falls back transparently to a pure-JS engine
(node:sqlite on Node 22+, else the bundled sql.js WASM) โ no build tools required.
To skip the post-install native warm-up entirely (CI, headless, or slow machines):
OMNIROUTE_SKIP_POSTINSTALL=1 npm install -g omniroute # CI=1 also skips it
For the fastest installs prefer pnpm (content-addressed store + hard links โ see above).
For a dashboard-free, headless runtime use the Docker base profile (above) or the
Termux guide. The CLI and the web dashboard are served by the
same process on one port, so there is no separate CLI-only package today.
![]() ๐ง๐ท Portuguรชs Guia completo |
![]() ๐บ๐ธ English Complete walkthrough |
![]() ๐ท๐บ ะ ัััะบะธะน ะะพะปะฝะพะต ััะบะพะฒะพะดััะฒะพ |
๐ฌ Made a video about OmniRoute? Open an issue or discussion with the link โ we'll feature it here.
๐ฐ Pricing at a glance & the \$0 Free Stack (11 providers)
| Tier | Example | Cost |
|---|---|---|
| ๐ณ Subscription | Claude Code Pro / Codex / Copilot | $10โ200/mo |
| ๐ API Key (free tiers) | NVIDIA NIM, Cerebras, Groq | FREE |
| ๐ฐ Cheap | GLM-5 $0.5/1M ยท MiniMax M2.5 $0.3/1M | pennies |
| ๐ Free Forever | Kiro, Qoder, Qwen, Pollinations, LongCat | $0 |
The $0 Free Stack โ combine into one unbreakable combo:
| Provider | Prefix | Free models | Quota |
|---|---|---|---|
| Kiro | kr/ | Claude Sonnet 4.5, Haiku 4.5, Opus 4.6 | 50 credits/mo |
| Qoder | if/ | kimi-k2-thinking, qwen3-coder-plus, deepseek-r1 | โพ๏ธ Unlimited |
| Qwen | qw/ | qwen3-coder-plus/flash/next | โพ๏ธ Unlimited |
| Pollinations | pol/ | GPT-5, Claude, Gemini, DeepSeek, Llama 4 | No key needed |
| LongCat | lc/ | LongCat-2.0 | 10M one-time (KYC) |
| Cloudflare AI | cf/ | 50+ models | 10K neurons/day |
| NVIDIA NIM | nvidia/ | 129 models | ~40 RPM |
| Cerebras | cerebras/ | Qwen3 235B, GPT-OSS 120B | 1M tok/day |
๐ก The dashboard "cost" is a savings tracker, not a bill โ OmniRoute never charges you. A "$290 total cost" using free models means $290 saved.
๐ Complete free directory โ docs/reference/FREE_TIERS.md โ 25+ providers, quotas, base URLs.
๐ฏ Use Cases โ ready-made combo playbooks
$0 forever:
1. kr/claude-sonnet-4.5 (Kiro โ ~50 credits/mo per acct)
2. if/kimi-k2-thinking (Qoder โ unlimited)
3. pol/gpt-5 (Pollinations โ no key)
4. lc/LongCat-2.0 (10M one-time backup, KYC)
Compression: aggressive (~50%) โ double your free quota ยท Cost: \$0/mo
24/7 no interruptions: chain 2 subscriptions โ cheap โ free for 5 layers of fallback.
Blocked region: free providers + global/per-provider proxy โ access AI from any country.
Max savings: subscription + cheap backup + ultra compression (~75%) โ ~$150โ300/mo saved for heavy users.
๐ Bypass geo-blocks โ 3-level proxy + stealth
๐ท๐บ ๐จ๐ณ ๐ฎ๐ท ๐จ๐บ ๐น๐ท In a blocked region? OmniRoute's 3-level proxy (Global / Per-Provider / Per-Connection) proxies API requests, OAuth flows, connection tests, token refresh & model sync.
- Protocols: HTTP/HTTPS, SOCKS5, authenticated proxies
- ๐ 1proxy marketplace โ hundreds of free validated proxies, quality scores, auto-rotation
- Anti-detection โ TLS fingerprint spoofing (
wreq-js), CLI fingerprint matching, proxy IP preservation
โจ Full feature list โ 30+ capabilities (memory, evals, observability)
Routing: 17 strategies ยท task-aware smart routing ยท thinking budget controls ยท wildcard routing ยท system prompt injection.
Compatibility: OpenAI โ Claude โ Gemini โ Responses API ยท auto OAuth refresh (PKCE, 8 providers) ยท multi-account round-robin ยท Batch + Files API ยท live OpenAPI 3.0.
Protocols: MCP (95 tools, 3 transports, 30 scopes) ยท A2A (JSON-RPC 2.0, SSE, 6 skills) ยท ACP ยท cloud agents (Codex, Cursor, Devin, Jules).
Plugins: custom plugin marketplace (system-configured registry URL with SSRF-guarded fetch) ยท install / enable / disable ยท Notion + Obsidian knowledge-base integrations (WebDAV file server, vault search, note CRUD).
Embedded services: one-click install & lifecycle management of local sidecar services (CLIProxy, NineRouter).
Quality & Ops: built-in Evals (golden-set: exact/contains/regex/custom) ยท guardrails (PII, injection, vision) ยท health dashboard ยท p50/p95/p99 telemetry ยท webhooks ยท compliance audit.
AI Agent Skills: drop-in markdown manifests โ point any agent at a skills/*/SKILL.md manifest. 43 skills available.
๐ MCP Server ยท A2A Server ยท Resilience Guide ยท Features Gallery
๐ Setup, env vars & FAQ
| Env var | Default | Purpose |
|---|---|---|
PORT | 20128 | API + dashboard port |
REQUIRE_API_KEY | false | Require API key for all requests |
DATA_DIR | ~/.omniroute | Database & config storage |
Will I be charged by OmniRoute? No โ it's free, open-source software on your machine. You only pay paid providers directly. OmniRoute has no billing system. Are FREE providers really unlimited? Mostly โ Qoder, Pollinations, LongCat, and Cloudflare are free with no per-account credit cap. Kiro is free too but capped at ~50 credits/month per account. Stack multiple free providers in a combo and auto-fallback keeps you serving for $0. Will compression hurt quality? No โ it only compresses the input; code, URLs, JSON are always protected. Does it work where AI is blocked? Yes โ 3-level proxy + 1proxy marketplace reach all 237 providers.
๐ User Guide ยท API Reference ยท Environment Config
๐ Troubleshooting
| Problem | Quick fix |
|---|---|
| "Language model did not provide messages" | Provider quota exhausted โ use a combo fallback |
| Rate limiting (429) | Add fallback: cc/claude โ glm/glm-4.7 โ if/kimi-k2-thinking |
| OAuth token expired | Auto-refreshed; if stuck, delete + re-auth in Providers |
unsupported_country_region_territory | Configure proxy in Settings โ Proxy |
| Docker SQLite locks | Use --stop-timeout 40 for clean WAL checkpoint |
| Node runtime errors | Use Node >=22.0.0 <23 or >=24.0.0 <27 |
๐ Reporting a bug? Run npm run system-info and attach system-info.txt. ๐ docs/guides/TROUBLESHOOTING.md
๐ธ Dashboard screenshots
| Page | Screenshot | Page | Screenshot |
|---|---|---|---|
| Providers | ![]() | Combos | ![]() |
| Analytics | ![]() | Health | ![]() |
| Translator | ![]() | Settings | ![]() |
| CLI Tools | ![]() | Usage Logs | ![]() |
๐ง Support & Community
๐ฌ Chat with the community โ Discord, Telegram & WhatsApp (๐ / ๐ง๐ท) links are at the top of this README.
- ๐ Website: omniroute.online
- ๐ GitHub: github.com/diegosouzapw/OmniRoute
- ๐ Issues: report a bug (attach
npm run system-infooutput) - ๐ค Contributing: see CONTRIBUTING.md or pick a
good first issue
- Runtime: Node.js 22.x or 24.x LTS (24 LTS recommended) โ
>=22.0.0 <23 || >=24.0.0 <27 - Language: TypeScript 6.0 โ 100% TypeScript across
src/andopen-sse/(zeroanyin core modules since v2.0) - Framework: Next.js 16 + React 19 + Tailwind CSS 4
- Database: better-sqlite3 (SQLite) + LowDB (JSON legacy) โ domain state, proxy logs, MCP audit, routing decisions, memory, skills
- Schemas: Zod (MCP tool I/O validation, API contracts)
- Protocols: MCP (stdio/HTTP) + A2A v0.3 (JSON-RPC 2.0 + SSE)
- Streaming: Server-Sent Events (SSE) + WebSocket bridge (
/v1/ws) - Auth: OAuth 2.0 (PKCE) + JWT + API Keys + MCP Scoped Authorization
- Testing: Node.js test runner + Vitest (21,000+ test cases across 2,586 files โ unit, integration, E2E, security, ecosystem)
- Platforms: Desktop (Electron), Android (Termux), PWA (any browser)
- CI/CD: GitHub Actions (auto npm publish + Docker Hub on release)
- Website: omniroute.online
- Package: npmjs.com/package/omniroute
- Docker: hub.docker.com/r/diegosouzapw/omniroute
- Resilience: Circuit breaker, exponential backoff, anti-thundering herd, TLS spoofing, auto-combo self-healing
๐ Getting Started
| Document | Description |
|---|---|
| User Guide | Providers, combos, CLI integration, deployment |
| Setup Guide | Full install methods, CLI tool configs, protocol setup, timeout tuning |
| CLI Tools Guide | Per-tool setup for Claude Code, Codex, Cursor, Cline, OpenClaw, Kilo, Copilot |
| Remote Mode | Drive a remote OmniRoute (VPS) from your laptop CLI via scoped access tokens |
| Claude Code Config | Point Claude Code at OmniRoute (local/remote) with launch + per-model profiles |
| Quick Start | 3-step install โ connect โ configure |
๐ง Operations & Deployment
| Document | Description |
|---|---|
| Docker Guide | Docker run, Compose profiles, Caddy HTTPS, tunnels, image tags |
| Podman Guide | Quadlet systemd integration, podman-compose, SELinux |
| VM Deployment | Complete guide: VM + nginx + Cloudflare setup |
| Fly.io Deployment | Deploy to Fly.io with persistent storage |
| Termux Guide | Run OmniRoute on Android via Termux |
| PWA Guide | Progressive Web App install, caching, architecture |
| Uninstall Guide | Clean removal for all install methods |
| Environment Config | Complete .env variables and references |
๐ง Features & Architecture
| Document | Description |
|---|---|
| Architecture | System architecture, data flow, and internals |
| Compression Guide | 7-option pipeline: off / lite / standard / aggressive / ultra / RTK / stacked |
| RTK Compression | Command-output compression, filters, trust, verify, raw-output recovery |
| Compression Engines | Caveman, RTK, stacked pipelines, dashboard/API/MCP surfaces |
| Compression Rules Format | JSON rule-pack schemas for Caveman and RTK filters |
| Compression Language Packs | Language detection and Caveman rule-pack authoring |
| Resilience Guide | Circuit breakers, cooldowns, queue, anti-thundering herd, TLS spoofing |
| Auto-Combo Engine | 9-factor scoring, mode packs, self-healing |
| Proxy Guide | 3-level proxy system, 1proxy marketplace, registry CRUD |
| Free Tiers | 25+ free API providers consolidated directory |
| Features Gallery | Visual dashboard tour with screenshots |
| Codebase Documentation | Beginner-friendly codebase walkthrough |
๐ค Protocols & APIs
| Document | Description |
|---|---|
| API Reference | All endpoints with examples |
| OpenAPI Spec | OpenAPI 3.0 specification |
| MCP Server | 95 MCP tools, IDE configs, Python/TS/Go clients |
| MCP Server Guide | MCP installation, transports, and tool reference |
| A2A Server | JSON-RPC 2.0 protocol, skills, streaming, task mgmt |
| A2A Server Guide | A2A agent card, tasks, skills, and streaming |
๐ Project & Quality
| Document | Description |
|---|---|
| Contributing | Development setup and guidelines |
| Changelog | Full per-version release history |
| Security Policy | Vulnerability reporting and security practices |
| i18n Guide | 40+ language support, translation workflow, RTL |
| Release Checklist | Pre-release validation steps |
| Coverage Plan | Test coverage strategy and 21,000+ test suite |
โญ Top Contributors
OmniRoute is shaped by a passionate open-source community. These individuals have made exceptional contributions that directly impact the quality, stability, and reach of the project. Thank you.
|
oyi77 ๐ฅ 189 commits โข +155K lines Analytics engine, SQL aggregations, proxy marketplace, test coverage |
Chris Staley ๐ฅ 70 commits โข +5.7K lines SSE stream hardening, Responses API, Gemini pagination, test regression fixes |
zenobit ๐ฅ 62 commits โข +24K lines CI/CD pipeline, i18n for 33 languages, Void Linux package, platform fixes |
R.D. & Randi ๐ 108 commits โข +30K lines Endpoints page, tunnel integrations, Docker workflows, A2A status, compression UI |
benzntech ๐ 22 commits โข +7.5K lines Electron desktop app, auto-updater, release build workflows, cross-platform CI |
herjarsa ๐ 21 commits โข +6K lines Zero-latency combos, vision-bridge auto-routing, catalog context-length, resilience 429 hints |
๐ These contributors' features, bug fixes, and infrastructure improvements are a core part of what makes OmniRoute reliable and feature-rich. Every pull request, every test case, and every i18n translation file matters. Open source is built by people like them.
How to Contribute
- Fork the repository
- Create your feature branch (
git checkout -b feature/amazing-feature) - Commit your changes (
git commit -m 'Add amazing feature') - Push to the branch (
git push origin feature/amazing-feature) - Open a Pull Request
See CONTRIBUTING.md for detailed guidelines.
Releasing a New Version
# Create a release โ npm publish happens automatically
gh release create v3.8.2 --title "v3.8.2" --generate-notes
OmniRoute stands on the shoulders of giants. It started as a fork of 9router and a TypeScript port of the Go project CLIProxyAPI โ and from there, every subsystem below was inspired by an open-source project that got there first. Each one shaped a concrete piece of OmniRoute. This is our thank-you to all of them. ๐
โญ star counts as of June 2026 โ go give these projects a star.
๐งฌ Lineage & gateway
| Project | โญ | How it inspired OmniRoute |
|---|---|---|
| 9router ยท decolua | 19.0k | The original project this fork is built on โ extended here with multi-modal APIs and a full TypeScript rewrite. |
| CLIProxyAPI ยท router-for-me | 38.8k | The Go implementation that inspired this JavaScript / TypeScript port. |
| LiteLLM ยท BerriAI | 52.1k | The AI gateway whose public pricing dataset feeds our cost-tracking sync and whose provider-normalization model informed our routing. |
๐๏ธ Context & token compression โ engines
| Project | โญ | How it inspired OmniRoute |
|---|---|---|
| Caveman ยท JuliusBrussee | 78.2k | The viral "why use many token when few token do trick" project โ its caveman-speak philosophy powers our standard compression mode and 30+ filler/condensation rules. |
| RTK โ Rust Token Killer ยท rtk-ai | 67.3k | High-performance command-output compression โ inspired our RTK engine, JSON filter DSL, raw-output recovery and the stacked RTK โ Caveman pipeline. |
| headroom ยท headroomlabs-ai | 54.5k | Reversible context-compression (SmartCrusher) โ inspired our headroom engine and the ccr retrieve-marker pattern. |
| LLMLingua ยท Microsoft | 6.4k | Prompt-compression research (LLMLingua / LLMLingua-2) โ inspired our async, code-safe, fail-open llmlingua engine. |
| llmlingua-2-js ยท atjsh | 28 | The JS/ONNX port (MobileBERT / XLM-RoBERTa) used as the worker-thread backend for our LLMLingua engine. |
| Troglodita ยท Lenine Jรบnior | 16 | PT-BR token compression โ powers our pt-BR language pack: pleonasm reduction and filler removal tuned for Brazilian-Portuguese grammar. |
| ponytail ยท DietrichGebert | 68.8k | The viral "lazy senior dev" YAGNI-coder skill โ inspired our less-code Output Style: smallest-working-change steering that cuts generated code (the output-axis sibling to Caveman's terse prose). |
๐งฉ Compact formats, token research & code-aware tooling
| Project | โญ | How it inspired OmniRoute |
|---|---|---|
| TOON ยท toon-format | 24.7k | Token-Oriented Object Notation โ its columnar, header-plus-rows model shaped our tabular compaction stage. |
| GCF โ Graph Compact Format ยท Blackwell Systems | 14 | Schema-aware "JSON for LLMs" notation โ co-inspired our lossless homogeneous-array compaction with [N rows] markers. |
| token-optimizer-mcp ยท ooples | 421 | Brotli/SQLite cache + per-session context-delta โ inspired our session-dedup engine. |
| token-savior ยท Mibayy | 1.0k | Bash-output compaction + MCP profiles โ inspired our compression bail-out discipline and MCP tool-manifest reduction. |
| token-saver ยท ppgranger | 110 | Content-aware, per-file-type output compression with failure-aware bail-out โ validated our per-type dispatch and minimum-gain skip. |
| token-optimizer ยท alexgreensh | 1.5k | "Find the ghost tokens" โ its offload + recoverable-handle pattern informed our CCR offload thinking. |
| TokenMizer ยท Shweta-Mishra-ai | 2 | A session-graph + cross-turn line-dedup blueprint that informed our session-dedup design. |
| OmniCompress ยท jessefreitas | 2 | Rust columnar-JSON + content-addressed retrieve + cross-message dedup โ validated our headroom/ccr/session-dedup engine design and the cache-stable "compressed form is position-independent" invariant. |
| mcp-compressor ยท Atlassian Labs | 89 | MCP tool-schema/description compression โ informed our MCP tool-manifest cardinality reduction. |
| RepoMapper ยท pdavis68 | 181 | Aider-style repo-map ranking โ informed our repo-map / retrieval-ranking exploration. |
| quiet-shell-mcp ยท mrsimpson | 4 | Declarative shell-output reduction over MCP โ validated our declarative bash-output compaction. |
| ts-morph ยท David Sherret | 6.1k | TypeScript Compiler API toolkit โ inspired our parser-based comment removal that preserves string, template and regex literals. |
๐ง Memory & RAG
| Project | โญ | How it inspired OmniRoute |
|---|---|---|
| Mem0 ยท mem0ai | 59.8k | Universal memory layer โ its proxy-as-write/read-boundary model shaped our memory architecture. |
| Letta (MemGPT) ยท letta-ai | 23.6k | Stateful agents with tiered memory โ inspired our Context Control & Recovery (CCR) tiered model. |
| WFGY ยท onestardao | 1.8k | The ProblemMap taxonomy of 16 recurring RAG/LLM failure modes โ the shared vocabulary in our troubleshooting guide. |
๐ฐ๏ธ Traffic inspection, MITM & transparent proxy
| Project | โญ | How it inspired OmniRoute |
|---|---|---|
| llm-interceptor ยท chouzz | 48 | MITM interception/analysis of coding-assistant โ LLM traffic โ our Traffic Inspector ports its SSE merge, conversation normalization, host passthrough and secret masking (MIT). |
| ProxyBridge ยท InterceptSuite | 5.3k | Transparent per-process proxy routing โ inspired our crash-safe MITM teardown, socket idle-timeouts, /proc process attribution and TPROXY capture. |
๐ Model data, observability & UI
| Project | โญ | How it inspired OmniRoute |
|---|---|---|
| models.dev ยท SST / OpenCode | 5.6k | Open database of AI model specs, pricing and capabilities โ synced natively into our model catalog. |
| React Flow / xyflow ยท xyflow | 37.4k | The node-based graph library powering our real-time Compression Studio and Combo/Routing Studio. |
| LangGraph ยท LangChain | 36.1k | LangGraph Studio's live workflow-graph visualization inspired our Studios' real-time cascade view. |
| Langfuse ยท Langfuse | 30.1k | Its trace โ span โ generation observability model shaped our Compression Studio waterfall. |
| Kiali ยท Kiali | 3.6k | Istio service-mesh observability โ inspired our circuit-breaker badges and error-edge visuals in the Routing/Combo Studio. |
| lobe-icons ยท LobeHub | 2.2k | AI/LLM brand logos that render the provider icons across our dashboard. |
๐ก๏ธ Security
| Project | โญ | How it inspired OmniRoute |
|---|---|---|
| awesome-secure-defaults ยท tldrsec | 708 | A curated list of secure-by-default libraries that guides our security choices (Helmet.js, DOMPurify, ssrf-req-filter, safe-regex, Google Tink). |
โค๏ธ Support
OmniRoute is free and open source, built and maintained in the open. If it saves you time or money, consider supporting development:
- โญ Star the repo โ it genuinely helps visibility
- ๐ GitHub Sponsors โ fund ongoing maintenance and new providers
- ๐ Report bugs and share feedback in Discussions
๐ License
MIT License - see LICENSE for details.
โฌ Back to top ยท Built with โค๏ธ for the open-source AI community.
OmniRoute v3.8.43 ยท Node โฅ22.0.0 ยท MIT License ยท omniroute.online














