Architecture Overview

July 3, 2026

Lemon is an AI coding assistant built as a distributed system of concurrent processes running on the BEAM (Erlang VM). This document covers the system architecture, key design decisions, and component responsibilities.

For system diagrams see docs/diagrams/. For per-app details see each apps/*/README.md.


Core Philosophy

  1. Agents as Processes — each AI agent is a GenServer with isolated state, a mailbox, and an independent lifecycle. Multiple sessions never share state.

  2. Streaming as Events — LLM responses are modeled as event streams, enabling reactive UIs, parallel processing, and backpressure handling.

  3. Fault Tolerance — OTP supervision trees isolate failures. A crashing tool does not kill the agent session; a network error during streaming is recoverable.

  4. Live Steering — users can inject messages mid-execution because the BEAM can send a message to any process at any time.

  5. Multi-Provider Abstraction — unified interface for 26 LLM providers with automatic model configuration and cost tracking.

  6. Multi-Engine Architecture — pluggable execution engines: native Lemon plus Codex CLI, Claude CLI, OpenCode CLI, and Pi CLI backends.
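
The first and fourth principles (isolated per-agent state, and steering through a mailbox) can be sketched in a few lines. This is a minimal illustration in Python, not the Elixir GenServer Lemon actually uses; the `AgentSession` class, its `steer` method, and the `"stop"` message are hypothetical names chosen for the example.

```python
import queue

class AgentSession:
    """Hypothetical stand-in for one agent process: private state plus a mailbox."""

    def __init__(self, steps):
        self.mailbox = queue.Queue()  # thread-safe; the only channel into the agent
        self.pending = list(steps)    # private state, never shared between sessions
        self.log = []

    def steer(self, message):
        """Inject a user message; safe to call while run() is in progress."""
        self.mailbox.put(message)

    def run(self):
        while self.pending:
            # Drain the mailbox between steps so steering takes effect mid-run.
            while True:
                try:
                    msg = self.mailbox.get_nowait()
                except queue.Empty:
                    break
                self.log.append(f"steered: {msg}")
                if msg == "stop":
                    self.pending.clear()
            if self.pending:
                self.log.append(f"ran: {self.pending.pop(0)}")
        return self.log

a = AgentSession(["plan", "edit"])
b = AgentSession(["plan", "edit"])
a.steer("stop")        # only session `a` is affected
log_a = a.run()        # ['steered: stop']
log_b = b.run()        # ['ran: plan', 'ran: edit']
```

On the BEAM the mailbox and the isolation come from the runtime itself, so none of this bookkeeping has to be written by hand; the sketch only shows the shape of the idea.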


System Architecture

┌─────────────────────────────────────────────────────────────┐
│ Clients                                                      │
│  TUI (TypeScript)  ·  Web (React)  ·  Browser (Playwright)  │
└───────────────────────┬─────────────────────────────────────┘
                        │ JSON-RPC / WebSocket
┌───────────────────────▼────────────────────┐
│ LemonControlPlane  (112+ RPC methods)       │
└───────────────────────┬────────────────────┘

┌───────────────────────▼────────────────────┐
│ LemonRouter            RunOrchestrator      │
│  · model selection     · policy enforcement │
│  · routing feedback    · approval gating    │
└────────┬──────────────────────┬────────────┘
         │                      │
┌────────▼───────┐   ┌──────────▼──────────┐
│ LemonGateway   │   │ LemonChannels        │
│  (engines)     │   │  Telegram, Discord,  │
└────────┬───────┘   │  X/Twitter           │
         │           └──────────────────────┘
┌────────▼───────────────────────────────────┐
│ CodingAgent.Session                         │
│  · 23 built-in tools                        │
│  · context compaction                       │
│  · extension system                         │
└────────┬───────────────────────────────────┘

┌────────▼──────────────┬──────────────────┐
│ LemonCore             │ LemonSkills       │
│  · EventBus           │  · skill catalog  │
│  · MemoryStore        │  · audit engine   │
│  · RoutingFeedback    │  · synthesis      │
│  · TaskFingerprint    │  · installer      │
└───────────────────────┴──────────────────┘

┌────────▼──────────────────────────────────┐
│ Ai  (provider abstraction layer)           │
│  26 providers: Anthropic, OpenAI, Google,  │
│  Azure, AWS Bedrock, xAI, Mistral, …       │
└───────────────────────────────────────────┘

See docs/diagrams/architecture.svg for the full visual diagram.


Application Map

The project is an Elixir umbrella with 18+ applications:

| App | Role |
| --- | --- |
| ai | Provider abstraction, streaming, cost tracking |
| agent_core | Core agent loop, tool execution, model runtime credential glue, abort/subagent semantics |
| coding_agent | Session management, compaction, JSONL persistence, tools |
| coding_agent_ui | Debug RPC interface, TUI/Web bridge |
| lemon_core | EventBus, MemoryStore, TaskFingerprint, config |
| lemon_router | RunOrchestrator, ModelSelection, RoutingFeedbackStore, lane queues, policy engine |
| lemon_gateway | Engine dispatch (native + CLI backends), execution lifecycle |
| lemon_channels | Transport adapters: Telegram, Discord, X/Twitter |
| lemon_automation | CronManager, HeartbeatManager, scheduled jobs |
| lemon_control_plane | HTTP/WebSocket server, 112+ RPC methods |
| lemon_skills | Skill catalog, manifest v2 parser, installer, audit, synthesis |
| lemon_mcp | MCP protocol server |
| lemon_sim | Simulation harness for development/testing |
| lemon_sim_ui | Phoenix LiveView UI for simulation spectator/admin |
| lemon_web | React web frontend bridge |

Data Flow

Four main paths through the system:

  1. Direct (TUI/Web): JSON-RPC → debug_agent_rpc / coding_agent_ui → Session → AgentCore → Tools/Ai

  2. Control Plane: WebSocket → ControlPlane → Router → Orchestrator → Gateway → Engine

  3. Channel (Telegram etc.): Message → LemonChannels → Router → StreamCoalescer → Outbox

  4. Automation: CronManager tick → Due jobs → Router → HeartbeatManager → EventBus

See docs/diagrams/data-flow.svg for the full diagram.


Run Lifecycle

User message
  → Session routing (canonical session key)
    → RunOrchestrator.start_run/1
      → ModelSelection.resolve/1  (explicit → meta → session → profile → history → default)
      → Lane selection (main/subagent/background)
      → Engine dispatch
        → Tool execution (isolated Task processes)
        → LLM streaming (event stream per response)
      → Outcome recording (RunOutcome → MemoryDocument)
      → Routing feedback entry
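
The "isolated Task processes" step is what keeps a crashing tool from taking the session down. A rough Python analogue, assuming a hypothetical `run_tool_isolated` helper and two made-up tools, converts any tool failure into an error result that the caller can handle. Threads give far weaker isolation than BEAM processes, so this only illustrates the control flow.

```python
from concurrent.futures import ThreadPoolExecutor

def run_tool_isolated(executor, tool, *args, timeout_s=5.0):
    """Run a tool in a worker and turn any crash or timeout into an error result."""
    future = executor.submit(tool, *args)
    try:
        return {"ok": True, "result": future.result(timeout=timeout_s)}
    except Exception as exc:  # the calling session keeps running
        return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}

def read_file_tool(path):  # hypothetical tool that always fails
    raise FileNotFoundError(path)

def echo_tool(text):       # hypothetical tool that succeeds
    return text.upper()

with ThreadPoolExecutor(max_workers=2) as pool:
    results = [
        run_tool_isolated(pool, read_file_tool, "missing.txt"),
        run_tool_isolated(pool, echo_tool, "still alive"),
    ]
# results[0] -> {'ok': False, 'error': 'FileNotFoundError: missing.txt'}
# results[1] -> {'ok': True, 'result': 'STILL ALIVE'}
```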

Lane scheduling

| Lane | Default cap | Purpose |
| --- | --- | --- |
| main | 4 | User-initiated runs |
| subagent | 8 | Agent-spawned subagents |
| background | 2 | Cron jobs, automations |
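
One way to picture per-lane caps is a scheduler that starts a run when its lane has a free slot and queues it otherwise. The Python sketch below uses the default caps listed above; the `LaneScheduler` class, its FIFO promotion rule, and the run IDs are assumptions made for illustration and may not match how lemon_router implements its lane queues.

```python
from collections import deque

DEFAULT_CAPS = {"main": 4, "subagent": 8, "background": 2}

class LaneScheduler:
    """Hypothetical per-lane concurrency limiter with FIFO waiting queues."""

    def __init__(self, caps=DEFAULT_CAPS):
        self.caps = dict(caps)
        self.running = {lane: set() for lane in caps}
        self.waiting = {lane: deque() for lane in caps}

    def submit(self, lane, run_id):
        """Start the run if the lane has capacity, else queue it. True if started."""
        if len(self.running[lane]) < self.caps[lane]:
            self.running[lane].add(run_id)
            return True
        self.waiting[lane].append(run_id)
        return False

    def finish(self, lane, run_id):
        """Free a slot and promote the oldest waiting run. Returns its id or None."""
        self.running[lane].discard(run_id)
        if self.waiting[lane] and len(self.running[lane]) < self.caps[lane]:
            promoted = self.waiting[lane].popleft()
            self.running[lane].add(promoted)
            return promoted
        return None

sched = LaneScheduler()
started = [sched.submit("background", f"cron-{i}") for i in range(3)]
# started -> [True, True, False]: the third job waits because the cap is 2
promoted = sched.finish("background", "cron-0")
# promoted -> 'cron-2'
```

Separate caps per lane mean a burst of cron jobs can only saturate the background lane, leaving slots in main available for user-initiated runs.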

Model selection precedence

explicit_model        # per-message /model override
  → meta_model        # metadata field in request
    → session_model   # /model set for this session
      → profile_model # config [profiles.X] model field
        → history_model  # best model for this task fingerprint (routing_feedback)
          → default_model  # config [defaults] model
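
The precedence chain amounts to "first source that has a value wins". A minimal Python sketch follows, assuming the candidates arrive as a dictionary keyed by the source names above. The `resolve_model` function and the model names are hypothetical and are not the signature of ModelSelection.resolve/1.

```python
PRECEDENCE = (
    "explicit_model",   # per-message /model override
    "meta_model",       # metadata field in request
    "session_model",    # /model set for this session
    "profile_model",    # config [profiles.X] model field
    "history_model",    # best model for this task fingerprint
    "default_model",    # config [defaults] model
)

def resolve_model(candidates):
    """Return (source, model) for the highest-precedence source that is set."""
    for source in PRECEDENCE:
        model = candidates.get(source)
        if model:
            return source, model
    raise LookupError("no model configured")

choice = resolve_model({
    "session_model": "model-b",   # hypothetical model names
    "history_model": "model-c",
    "default_model": "model-d",
})
# choice -> ('session_model', 'model-b')
```

Returning the winning source alongside the model makes it easy to log why a given model was picked for a run.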

Key Abstractions

TaskFingerprint

Classifies every run into a canonical key used for routing feedback and skill synthesis:

<task_family>|<toolset>|<workspace>|<provider>|<model>

Task families: :code, :query, :file_ops, :chat, :unknown

Context key (for history lookup): <task_family>|<toolset>|<workspace>
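
The two key formats can be sketched as follows. This Python example assumes the toolset is serialized as a sorted, comma-separated list of tool names and that unrecognized families fall back to `unknown`. Both are guesses for illustration, as are the `make_fingerprint` helper and the sample values.

```python
from dataclasses import dataclass

# Task families listed in this document.
TASK_FAMILIES = ("code", "query", "file_ops", "chat", "unknown")

@dataclass(frozen=True)
class TaskFingerprint:
    task_family: str
    toolset: str     # assumed encoding: sorted tool names joined by commas
    workspace: str
    provider: str
    model: str

    def key(self) -> str:
        """Full key: <task_family>|<toolset>|<workspace>|<provider>|<model>."""
        return "|".join((self.task_family, self.toolset, self.workspace,
                         self.provider, self.model))

    def context_key(self) -> str:
        """History-lookup key: <task_family>|<toolset>|<workspace>."""
        return "|".join((self.task_family, self.toolset, self.workspace))

def make_fingerprint(task_family, tools, workspace, provider, model):
    """Hypothetical constructor that normalizes the family and the toolset."""
    family = task_family if task_family in TASK_FAMILIES else "unknown"
    toolset = ",".join(sorted(set(tools)))
    return TaskFingerprint(family, toolset, workspace, provider, model)

fp = make_fingerprint("code", ["edit", "bash"], "my-repo",
                      "example-provider", "example-model")
print(fp.key())          # code|bash,edit|my-repo|example-provider|example-model
print(fp.context_key())  # code|bash,edit|my-repo
```

Because the context key omits provider and model, runs of the same kind of task made with different models share one key, which is what lets history lookup compare models for that task.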

MemoryDocument

Durable record of a completed run:

doc_id, run_id, session_key, agent_id, workspace_key, scope,
started_at_ms, ingested_at_ms,
prompt_summary, answer_summary, tools_used,
provider, model, outcome, meta
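
As a rough picture of the record's shape, the fields above map naturally onto a flat structure. The Python dataclass below is a sketch only: the field names come from this document, but every type, default, and sample value is an assumption.

```python
from dataclasses import dataclass, field, asdict

@dataclass
class MemoryDocument:
    # Field names are from the list above; types and defaults are assumed.
    doc_id: str
    run_id: str
    session_key: str
    agent_id: str
    workspace_key: str
    scope: str
    started_at_ms: int
    ingested_at_ms: int
    prompt_summary: str
    answer_summary: str
    tools_used: list = field(default_factory=list)
    provider: str = ""
    model: str = ""
    outcome: str = "unknown"
    meta: dict = field(default_factory=dict)

doc = MemoryDocument(
    doc_id="doc-1", run_id="run-1", session_key="session-1",
    agent_id="agent-1", workspace_key="my-repo", scope="workspace",
    started_at_ms=1_000, ingested_at_ms=1_250,
    prompt_summary="Fix failing test", answer_summary="Patched assertion",
    tools_used=["bash", "edit"], provider="example-provider",
    model="example-model", outcome="success",
)
record = asdict(doc)  # plain dict, ready to serialize
```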

Feature Flags

All non-trivial features are gated behind flags in the [features] TOML section. Code reads flags via LemonCore.Config.Features.enabled?(features, :flag_name).

Current flags: product_runtime, skills_hub_v2, skill_manifest_v2, progressive_skill_loading_v2, session_search, routing_feedback, skill_synthesis_drafts.


Why BEAM?

| Concern | BEAM advantage |
| --- | --- |
| Millions of concurrent agents | Lightweight processes (microseconds to start, ~2KB memory) |
| Live steering mid-run | Message to any process at any time |
| Tool crash isolation | OTP supervision; supervisor restarts failed child |
| Streaming responses | Process-per-stream with backpressure |
| Session persistence across restarts | Durable state in ETS + SQLite |
| Hot code reload | BEAM code upgrade without restart |
| Multi-node future | Native Erlang distribution built in |

Further Reading

| Document | Topic |
| --- | --- |
| docs/architecture_boundaries.md | Dependency policy between apps |
| docs/beam_agents.md | BEAM agent architecture deep-dive |
| docs/context.md | Context management and compaction |
| docs/model-selection-decoupling.md | Model selection design |
| docs/assistant_bootstrap_contract.md | Session bootstrap sequence |
| apps/*/README.md | Per-application documentation |

Last reviewed: 2026-05-16