Changelog
June 1, 2026 · View on GitHub
All notable changes to SarmaLink-AI are documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
[Unreleased]
Added (v2 ten-feature drop)
- Intent auto-routing (
lib/v2/auto-route.ts). Regex pre-filter plus Groq Llama 3.3 fallback. Gated behindENABLE_AUTO_ROUTE=1. - Multi-step agent runner (
POST /api/v1/agent,lib/v2/agent-runner.ts). Planner, workers, synthesiser, SSE event stream. Caps: 5 steps, 60 s per worker. - MCP-shaped tool catalog (
POST /api/v1/mcp/catalog,lib/v2/mcp-catalog.ts). Bearer-protected viaMCP_INTERNAL_KEY. Three demo tools (current_time,random_uuid,echo) plus an extensibleToolDefregistry. - TTS cascade (
POST /api/v1/tts,lib/v2/tts.ts). Cloudflare MeloTTS first, Gemini TTS as paid fallback. Supports EN, ES, FR, ZH, JP, KR. - STT route (
POST /api/v1/stt,lib/v2/stt.ts). Groq Whisper first, Cloudflare Whisper fallback. - Live-data tool functions (
lib/tools/live.ts).getWeather(Open-Meteo),getExchangeRates(Frankfurter),getNews(Hacker News Algolia). All free, no key. - Image generation with key rotation (
POST /api/v1/images/generate,lib/v2/image-gen.ts). FLUX across up to 4 Cloudflare account/token pairs, optional R2 upload. - Quota tracker (
GET /api/v1/quota,lib/v2/quota.ts,supabase/migrations/20260601_ai_quota.sql). Per-user-today and company-wide totals by tier. - Smart suggestions (
POST /api/v1/suggestions,lib/v2/suggestions.ts). Three follow-up prompts via Groq Llama 3.3, low temperature, 120 token cap. - Reasoning-leak stripper, PDF export, XLSX export (
lib/sanitize/reasoning.ts,POST /api/v1/export/pdf,POST /api/v1/export/xlsx). Stateful stream stripper for<think>blocks; pdfkit for PDF; exceljs for XLSX (single- or multi-sheet).
Added
- Frontier provider adapters. A new
anthropicprovider type wired through Anthropic's OpenAI-compatible endpoint (with theanthropic-versionheader), so Opus 4.7 streams through the same pipeline as every other engine. Opus 4.7, GPT-5.5 (via GitHub Models) and Gemini 3.5 Pro are promoted to the head of the Smart, Reasoner, Coder and Live failover chains. Each step is skipped at runtime when its provider has no configured key, so free-only deployments fall straight through to the existing free-tier engines.ANTHROPIC_API_KEY(and_2and so on) plus optionalANTHROPIC_VERSIONare collected inlib/env/validate.ts. - Cross-provider prompt caching (
lib/providers/cache.ts). Normalises Anthropic ephemeralcache_controlbreakpoints, the OpenAI-compatibleprompt_cache_keyprefix, and Gemini implicit caching behind one pure call. The failover runner applies it to the stable system prefix, keyed on the selected mode. On by default; setENABLE_PROMPT_CACHE=falseto disable. A no-op for prompts below 1 KB. - Structured streaming protocol (
lib/streaming/events.ts). A typed discriminated union for every SSE frame (token,thinking,backend,auto_routed,image,sources,usage,done,error) with a serialiser, a validating parser, and a usage reader that normalises OpenAI and Anthropic cache-hit accounting. The runner now emits ausageframe carrying prompt-cache hits before thebackendframe. - MCP tool-call passthrough (
lib/plugins/mcp.ts,app/api/v1/mcp/). JSON-RPC 2.0tools/listandtools/callagainst any Model Context Protocol server over the Streamable HTTP transport, parsing both plain-JSON and SSE response bodies.GET /api/v1/mcp?plugin=<id>lists tools,POST /api/v1/mcpinvokes one; upstream auth is taken from the plugin's configured token env var. - Per-model cost dashboards (
lib/providers/cost.ts). A May-2026 list-price table, a pure per-turn cost function that bills cached prompt tokens at the cached rate, and an aggregator that rolls theai_eventslog into a per-model USD breakdown with a paid/free split. The admin health endpoint returns this under acostblock. - End-to-end test (
__tests__/e2e-frontier-flow.test.ts) with fixtures exercising the full failover, prompt-caching, usage and cost path. Test count is now 151, up from 110.
Changed
- README rewritten as a product page with a Mermaid architecture diagram, a one-line tagline, a three-sentence overview, the updated six-mode table, and sections for the new features.
- Reconciled the provider count across code and docs. The gateway has ten chat providers (Anthropic, GitHub Models, Gemini, Groq, SambaNova, Cerebras, OpenRouter, Cohere, Mistral, Ollama); Cloudflare and Tavily power tools rather than chat. README and wiki no longer claim "seven providers" or "36 engines".
- CI now runs on Node 24.
package.jsonversion bumped to 1.3.0 and the description updated to describe the gateway feature set.
Security
- Security disclosure address corrected to
security@sarmalinux.comand an explicit supported-versions table added toSECURITY.md. - MCP passthrough never accepts an upstream secret from the client; it reads the bearer token from the plugin's configured env var only.
[1.2.0] — 2026-05-03
Added
- Cross-repo plugin system —
lib/plugins/index.tsregisters the 10 sibling open-source repos as routable tools (voice-agent-starter, agent-orchestrator, ai-eval-runner, local-llm-router, mcp-server-toolkit, rag-over-pdf, receipt-scanner, webhook-to-email, k8s-ops-toolkit, terraform-stack). Each plugin is enabled when its endpoint env var is set; no dynamic discovery. Seedocs/PLUGINS.md. - Intent-based plugin auto-routing —
lib/services/plugin-autorouter.tsdetects task intent (research / voice / eval / workflow / rag / ocr) from user input pre-LLM and dispatches to the matching plugin endpoint when one is enabled. Gated byENABLE_PLUGIN_AUTOROUTEenv (default off). Falls through to the normal LLM path when no plugin matches. - Manus integration — full client + webhook for Manus. Submit long-running browser/research tasks from chat; results stream back in. Real HMAC-SHA256 signature verification on the webhook.
- Manus webhook persistence —
supabase/migrations/002_manus_tasks.sqlschema +lib/repositories/manus-tasks.tsrepository. Webhook upserts task state intomanus_taskstable; newGET /api/v1/manus/tasks/[id]route serves the persisted record. Forkers apply the migration in their own Supabase project and setNEXT_PUBLIC_SUPABASE_URL+SUPABASE_SERVICE_ROLE_KEYin their Vercel env. /docspage — server-rendered list of all 10 plugins with badges, env-var status, repo links, and a Manus referral CTA. The CTA is on by default and overridable per-deployment viaNEXT_PUBLIC_MANUS_INVITE_CODE.docs/MANUS.md— Manus integration guide (env vars, webhook setup, task lifecycle).docs/MAKE-IT-YOURS.md— full white-label guide. Copy-paste v0 prompt that generates a complete branded front end (home, pricing, docs, login, signup, dashboard with usage charts and API key CRUD), instructions for swapping logo/colours/copy, and the full Supabase + Vercel deploy path. Pairs with terraform-stack for one-command reproducibility.
Changed
- README updated with Manus invite link (
AIRTDVWVEWKCK4R), cross-repo plugin section, and white-label guide pointer. Forkers are explicitly told to swap the invite code for their own — seedocs/MAKE-IT-YOURS.md. - Stale self-TODOs replaced with fork-swap guidance in
MANUS.mdand the white-label guide.
[1.1.0] — 2026-04-19
Added
- GitHub Actions CI (lint, typecheck, test, build) on every push and PR
- CodeQL security scanning
- Dependabot for weekly dependency updates
- Vitest test suite with unit tests for auto-router and failover ordering
- Zod-based environment variable validation at startup
- Prompt sanitization layer — untrusted content (file extracts, search results, user memories) is wrapped in explicit boundary markers
SECURITY.md— vulnerability disclosure policyCONTRIBUTING.md— how to add providers, tools, and submit PRs- Pull request template enforcing test/typecheck/lint checks
docs/ARCHITECTURE.md,docs/DB-SCHEMA.md,docs/ENV-MATRIX.md,docs/FAILURE-MODES.md,docs/DEPLOY.md— engineering documentation- Unified intent router (
lib/router/index.ts) consolidating all detection logic into one entrypoint
Changed
- Extracted provider registry (
lib/providers/registry.ts) from monolithic route - Extracted failover runner (
lib/providers/failover.ts) from monolithic route - Extracted system prompt builder (
lib/prompts/system.ts) from monolithic route - Replaced all
(x as any)casts with typed Supabase repositories - Shortened R2 image URL expiry from 30 days to 7 days
- Image gen endpoint now validates MIME types server-side
- README Quick Start split into "Minimum Setup" (Groq + Supabase, 3 env vars) and "Full Setup" (all 7 providers) to lower onboarding friction
- Extracted chat route into focused service modules (
lib/services/) — route handler is now a thin coordinator
Fixed
.env.examplelistedTAVILY_API_KEY_1butlib/env/validate.tsexpectedTAVILY_API_KEY(no_1suffix) as the first key — aligned.env.exampleto match
Security
- SSE stream chunk validator (
sanitizeStreamChunk) — rejects non-string chunks, enforces 32KB cap, strips SSE protocol injection - Invisible character stripping (
stripInvisibleChars) — removes zero-width, bidi overrides, BOM, and soft-hyphen characters before database persistence - All user-supplied content (files, search results, memories) now wrapped in structured markers before injection into the model context
supabaseAdminusage minimised — read paths moved to anon client with RLS where possible- Environment variable validation prevents application boot with missing critical keys
[1.0.0] — 2026-04-17
Initial public release.
Added
- Multi-provider failover architecture across Groq, SambaNova, Cerebras, Google Gemini, OpenRouter, Cloudflare, and Tavily
- 6 specialised modes (Smart, Reasoner, Live, Fast, Coder, Vision) with 7–14 engine failover depth
- Auto-router detects intent from message text and routes to the right mode
- Live tools — exchange rates (ECB), weather (Open-Meteo), container tracking (9 shipping lines)
- Image generation and instruction-following editing via FLUX.2 klein
- Persistent memory — extracts user facts after conversations and injects them into future chats
- Document analysis for PDF, Excel, and Word files (up to 10 per conversation)
- Per-user daily quota enforcement
- Real-time SSE streaming with markdown rendering
- 50 saved conversations per user with oldest auto-deletion
- Supabase Auth integration
- Cloudflare R2 file storage
- MIT License