July 19, 2026

Useful work with AI shouldn't disappear when a conversation ends. Wenlan builds the right pages and keeps them current as sources change, asking only when judgment is needed.

English | 简体中文 | 繁體中文

Get started · What is this? · Capabilities · Daily workflow · Evaluation · Learn more

Wenlan desktop app showing a source-backed wiki page with inspectable citations.

A maintained Page in the desktop app: open any citation to inspect the Source or Memory behind the claim.

The desktop app is the fastest way to see the complete workflow: read pages, inspect their sources, and curate the knowledge system. The current macOS Apple Silicon preview is not yet notarized, so this installer verifies the GitHub release, installs Wenlan, clears quarantine for this app only, and opens it without changing macOS security settings:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/7xuanlu/wenlan/main/scripts/install-macos-app.sh)"

The installer is inspectable. It checks the release archive against GitHub's published SHA-256 before replacing an existing app. Prefer the DMG or want to inspect the app source? See wenlan-app releases and wenlan-app.
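If you prefer to repeat the checksum comparison by hand, the idea can be sketched as follows. This is a minimal illustration, not the installer's actual code: `sha256_of` and `archive_matches` are hypothetical helper names, and the published digest is whatever hex string the release page lists for the archive you downloaded.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so a large archive never loads fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_matches(path: Path, published_hex: str) -> bool:
    """Compare the local digest with the published one (hypothetical helper).

    Surrounding whitespace and letter case in the published value are ignored.
    """
    return hmac.compare_digest(sha256_of(path), published_hex.strip().lower())
```

Only replace an existing app when `archive_matches` returns `True` for the archive you are about to install.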

Set up with your AI

Paste this into Claude Code, Codex, or another tool that can follow a setup guide:

Set up Wenlan for this AI client by following:
https://raw.githubusercontent.com/7xuanlu/wenlan/main/docs/setup-with-ai.md

Install only what this client needs. Then verify the local runtime,
its Wenlan connection, and a capture/recall round trip.

The guide detects which client you are using and keeps client-specific commands out of this README. It does not configure every AI tool unless you ask it to.

Need only the headless runtime on macOS Apple Silicon?

npx -y wenlan setup

This downloads the prebuilt CLI, daemon, and MCP connector, starts the local runtime, and verifies it. No Rust toolchain or Cargo is required. Linux x64/ARM64 with glibc has an automated shell setup path; Windows x64 uses the matching archive from Releases. macOS Intel currently has no supported complete-runtime install.

Manual and client-specific instructions: AI-assisted setup · Claude Code plugin · Codex plugin · CLI and MCP.

What is this?

Wenlan turns documents, notes, and past AI conversations into a source-backed knowledge base that stays current as your work evolves. Sources remain traceable; decisions, lessons, and corrections become durable memories; both can support the same maintained Pages.

Sources and memories independently support a maintained Page. Wenlan can rebuild a stale Page from its current support; optional conflict review can surface protected conflicts, and changes to human writing wait for the user.

Built for work that continues. Wenlan is for researchers, writers, consultants, product teams, and software teams whose knowledge is scattered across documents, notes, and AI conversations. It turns that material into inspectable Pages that can improve across projects and weeks, not another chat history or isolated memory store. It is not a life-management system or a memory SDK embedded inside another product.

One knowledge system, three roles:

Sources keep the material Wenlan reads traceable. Imported conversations remain as captured records; registered files sync their current contents as they change.
Memories preserve what work teaches you. Agents capture atomic decisions, lessons, corrections, and supersession with provenance.
Pages compile current knowledge. Wenlan turns relevant Sources and Memories into source-cited Markdown you can reuse, refresh, and review.

The LLM-wiki foundation, extended:

LLM-wiki v1: Karpathy defined immutable Sources, an AI-maintained Markdown Wiki, and a co-evolving Schema of rules for structuring and maintaining it. Wenlan implements that foundation with typed Memory fields and built-in rules for Page structure, provenance, citations, refresh, ownership, and review.
LLM-wiki v2: Rohitg00 added a memory lifecycle. Wenlan makes that direction concrete with traceable Sources, agent-captured Zettelkasten-style atomic Memories (one complete idea each), and maintained Pages built from both.

Wenlan's distinctive move: Sources and atomic Memories independently support maintained Pages. Memory history preserves how knowledge changed; Page history shows which current evidence supports the synthesis. Machine-maintained Pages can rebuild from current support, while changes to human writing wait as reviewable revisions.

Wenlan feature reel showing source-backed pages, source inspection, graph context, agent capture, and curation.

A knowledge graph that gets more useful over time

The entity-relation graph is one part of Wenlan's wider connected wiki. Knowledge Pages hold maintained synthesis, Entities anchor reusable people, projects, and concepts, Source Pages make imported or synchronized material inspectable, and atomic Memories preserve decisions and changes. They work through separate, explicit links: Page-to-Page wikilinks, Page evidence, Memory-to-Entity links, and directed Entity relations.

Conceptual model of Wenlan's connected knowledge system, with Knowledge Pages, Source Pages, atomic Memories, and Entities connected through Page links, evidence, Memory-to-Entity links, and Entity relations.

Within the entity graph, a configured enrichment model extracts typed Entities, observations, and directed relations from Memories. Entity linking and resolution reuse existing nodes instead of treating every mention as new; each Memory keeps its Source and can link to multiple Entities. How the connected model is stored ->

Meaning and direction: Relations use a seeded vocabulary such as uses, part_of, contradicts, and replaced_by; unknown types fall back to related_to and become reviewable vocabulary proposals.
Strength and provenance: A relation can store confidence, an explanation, and its source Memory, so stronger and weaker claims remain distinguishable and inspectable.
Communities that compound: Label propagation groups Entities by relation density, weighted by the relation count between each pair. These groups can organize optional corpus summaries while Entity links add retrieval context.
Correction without erasure: Related claims, corrections, and explicit supersession stay inspectable together while original Sources and Memory history remain.
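The community grouping described above can be pictured with a small weighted label-propagation sketch. It is an illustration under stated assumptions, not Wenlan's implementation: the function name, the input shape (a dict of entity pairs to relation counts), the fixed visiting order, and the tie-break by smallest label are all choices made here to keep the example deterministic.

```python
from collections import defaultdict


def label_propagation(edges, max_rounds=20):
    """Group entities by weighted neighbor votes.

    edges: dict mapping (entity_a, entity_b) -> relation count between the pair.
    Returns a dict mapping each entity to its community label.
    """
    neighbors = defaultdict(dict)
    for (a, b), count in edges.items():
        neighbors[a][b] = neighbors[a].get(b, 0) + count
        neighbors[b][a] = neighbors[b].get(a, 0) + count

    labels = {node: node for node in neighbors}  # every entity starts alone
    for _ in range(max_rounds):
        changed = False
        for node in sorted(neighbors):  # fixed order keeps runs reproducible
            votes = defaultdict(int)
            for other, weight in neighbors[node].items():
                votes[labels[other]] += weight
            # Heaviest label wins; ties go to the alphabetically smallest label.
            best = min(votes, key=lambda label: (-votes[label], label))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels
```

Two tightly related triangles joined by a single weak relation end up in two separate communities, because each entity's heaviest votes come from inside its own triangle.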

During retrieval, dense entity matching finds query-relevant entities. When eligible graph links exist, the default graph-memory stream boosts linked Memories as a third RRF signal. The path is data- and scope-dependent, and Space boundaries still apply. How the graph path works ->

Retrieval across words, meaning, and connections

Wenlan's core search is a local hybrid pipeline, not a single vector lookup. Each stage has a different job:

Exact wording (SQLite FTS5): a full-text index finds literal terms, identifiers, and phrases.
Similar meaning (FastEmbed with Qdrant/bge-base-en-v1.5-onnx-Q): a quantized English embedding model creates 768-dimensional vectors; a libSQL DiskANN index with cosine distance serves approximate nearest-neighbor retrieval over them.
Combined ranking (weighted RRF, k = 60): lexical and semantic rank lists are fused by rank position, so their raw scores never need to share a scale; cosine similarity additionally weights the vector contribution.
Connected context (graph-memory stream): eligible entity links add a third RRF signal, while the active read scope still filters the returned Memories.
Optional precision (cross-encoder reranking): unlike an embedding model, jinaai/jina-reranker-v1-turbo-en or BAAI/bge-reranker-base reads each query-candidate pair together and reorders the smaller candidate pool; reranking is off by default.

Page, episodic, and fact channels are opt-in and degrade to the remaining search signals if unavailable. Space still limits the read scope. Methods, defaults, and limitations ->
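To make the fusion step concrete, here is a minimal sketch of weighted reciprocal rank fusion with k = 60. Treat it as an illustration only: the function name, the stream weights of 1.0, and the choice to multiply each vector hit by its cosine similarity are assumptions made for this example, and the real pipeline may combine these terms differently.

```python
from collections import defaultdict

RRF_K = 60  # the k constant named in this README


def weighted_rrf(streams, k=RRF_K):
    """Fuse ranked streams by rank position rather than raw score.

    streams: list of (stream_weight, hits), where hits is a best-first list of
    (memory_id, item_weight) pairs. Both weights are assumptions of this sketch.
    """
    scores = defaultdict(float)
    for stream_weight, hits in streams:
        for rank, (memory_id, item_weight) in enumerate(hits, start=1):
            scores[memory_id] += stream_weight * item_weight / (k + rank)
    # Highest fused score first; ids break ties so the order is stable.
    return sorted(scores, key=lambda memory_id: (-scores[memory_id], memory_id))


lexical = [("m1", 1.0), ("m2", 1.0), ("m3", 1.0)]   # FTS5 order
semantic = [("m2", 0.9), ("m3", 0.8), ("m4", 0.7)]  # cosine similarity as item weight
graph = [("m3", 1.0)]                               # Memories linked to matched entities

fused = weighted_rrf([(1.0, lexical), (1.0, semantic), (1.0, graph)])
print(fused)
```

In this toy input, `m3` ranks first because it appears in all three streams, even though it is never the top hit in the lexical or semantic list.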

Two lifecycles, one maintained knowledge system

A generated wiki can go stale; a memory store can fragment into disconnected facts. Wenlan links two lifecycles without collapsing them into one layer.

An earlier memory remains linked after an explicit superseding capture. When a Page is stale, Wenlan rebuilds it from current Sources and Memories, records the revision, and stages changes to human writing for review.

Atomic Memory

CAPTURE -> CLASSIFY -> ENRICH -> LINK -> RECONCILE

Capture and explicit supersession are core. Model-backed stages run only when the matching model is configured, and the reconcile pass is off by default.

| Operation | What Wenlan does |
| --- | --- |
| Capture | Agents write one complete, self-contained idea per Memory, following the Zettelkasten atomic-note principle instead of saving the whole conversation. |
| Classify | With the on-device model, Wenlan assigns `identity`, `preference`, `decision`, `lesson`, `gotcha`, or `fact`; a precise type supplied by the caller remains authoritative. |
| Enrich | With the on-device model, adds structured fields, retrieval cues, event dates, quality, importance, and tags when available. |
| Link | Retains provenance and, when enrichment is enabled, connects Memories to entities and relations in the knowledge graph. |
| Reconcile | Explicit replacements preserve a `supersedes` chain. An optional on-device pass can queue protected conflicts for review instead of overwriting history; it is off by default and must be explicitly enabled. |

Advanced configuration: set WENLAN_ENABLE_DUAL_POOL_RESOLVE=1 to enable that reconcile pass.
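The `supersedes` chain can be pictured with a small sketch. The `Memory` shape below is hypothetical and far simpler than Wenlan's stored record; it only shows how an explicit replacement keeps the earlier Memory reachable instead of overwriting it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Memory:
    """Hypothetical, simplified record used only for this illustration."""
    id: str
    text: str
    supersedes: Optional[str] = None  # id of the Memory this one replaces


def current_versions(memories):
    """Return the Memories that no other Memory supersedes."""
    replaced = {m.supersedes for m in memories if m.supersedes}
    return [m for m in memories if m.id not in replaced]


def history(memories, memory_id):
    """Walk the supersedes chain from one Memory back to the original, newest first."""
    by_id = {m.id: m for m in memories}
    chain, seen = [], set()
    while memory_id in by_id and memory_id not in seen:  # seen guards against cycles
        seen.add(memory_id)
        chain.append(by_id[memory_id])
        memory_id = by_id[memory_id].supersedes
    return chain
```

Retrieval would normally surface the result of `current_versions`, while `history` answers how a decision changed over time.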

Maintained Page

DISTILL -> CITE -> TRACK -> REFRESH -> REVIEW

| Operation | What Wenlan does |
| --- | --- |
| Distill | Compiles related Sources and Memories into one Markdown Page. |
| Cite | Retains citation records and verification status; automatic refresh discards a draft when its citation-support check fails. |
| Track | Records which evidence supports the Page, why it became stale, and a bounded changelog. |
| Refresh | When a Page is marked stale, rebuilds the eligible machine-maintained Page from current evidence. |
| Review | Turns changes to a Page you edited into a proposed revision instead of a silent rewrite. |

For example, import a design document and capture a debugging decision in Codex. Wenlan can compile one Page that cites both. When that Page is refreshed, it rebuilds from its current support; if you have edited it, the proposed change waits for review.
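The refresh rules above reduce to a small decision, sketched here under assumptions. The `Page` fields, the `Outcome` names, and the order of the checks are hypothetical; the sketch only mirrors the stated behavior that citation-poor drafts are discarded, machine-maintained Pages rebuild in place, and human-edited Pages receive a proposed revision.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Outcome(Enum):
    SKIPPED = "skipped"      # the Page is not stale
    DISCARDED = "discarded"  # the draft failed the citation-support check
    REBUILT = "rebuilt"      # machine-maintained Page replaced in place
    PROPOSED = "proposed"    # human-edited Page: the change waits for review


@dataclass
class Page:
    """Hypothetical, simplified Page state used only for this illustration."""
    body: str
    stale: bool
    human_edited: bool
    proposed_revision: Optional[str] = None


def refresh(page: Page, draft: str, citations_supported: bool) -> Outcome:
    if not page.stale:
        return Outcome.SKIPPED
    if not citations_supported:
        return Outcome.DISCARDED  # the existing body is left untouched
    if page.human_edited:
        page.proposed_revision = draft  # never overwrite human writing
        return Outcome.PROPOSED
    page.body = draft
    page.stale = False
    return Outcome.REBUILT
```

The important property is that the human-edited branch never touches `page.body`; the draft is parked until someone accepts or rejects it.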

Local Markdown that works with Obsidian

Your durable synthesis remains ordinary files rather than a proprietary editor format:

Plain files: Pages and session notes stay as Markdown under ~/.wenlan/.
Inspectable history: Distill and handoff workflows can commit logical file batches to a local git repository.
Obsidian coexistence: Wenlan reads an existing vault as a source. Symlink ~/.wenlan/pages/ into the vault or export a Page from the desktop app; your edits remain human-owned, and later machine refreshes become reviewable revisions.

The local history is directly inspectable:

$ git -C ~/.wenlan log --oneline
a1b2c3d distill: 4 pages
9f8e7d6 session: embedding-work

Capabilities

Chat import: Bring in ChatGPT or Claude export ZIPs; Wenlan automatically skips conversations already imported.
Document Sources: Ingest one .md, .txt, or text-extractable .pdf file; recurse through a folder of them; or index Markdown from an Obsidian vault.
Incremental sync: Regular file and folder Sources track changes in the background; Obsidian vaults stay read-only and resync on demand.
Atomic Memory: MCP clients save one complete decision, lesson, correction, preference, or fact, with provenance and supersession recording where it came from and what it replaces.
Typed enrichment: A configured model classifies each Memory, then adds the structured fields defined for its type, plus dates, tags, retrieval cues, and graph links.
Source-backed Pages: Distill related Sources and Memories into Markdown Pages with source references and [[wikilinks]]; the daemon can verify and record per-claim citations.
Citation-gated refresh: Automatic refresh rejects citation-poor drafts; machine Pages update while human edits become reviewable revisions.
Hybrid retrieval: FTS5 finds exact words, local BGE embeddings find meaning, and RRF fuses their ranks; graph links can add context.
Retrieval channels: Optional Page, episodic, and per-fact channels widen recall; cross-encoder reranking can improve precision.
Knowledge graph: Typed entities, relations, and observations connect people, projects, claims, and supporting Memories.
Human-in-the-loop review: Routine work stays automatic; protected conflicts, Page revisions, entity merges, and new vocabulary wait for judgment.
Spaces: Keep work, personal, client, and repository knowledge inside an explicit retrieval scope.
Local daemon + MCP: One lightweight Rust daemon remains the local source of truth. The desktop app and CLI call it directly; AI clients use small MCP connectors to reach the same knowledge.
Custom integrations: The localhost HTTP API accepts prepared text, webpage content, and Memories from other capture workflows.
Background maintenance: The daemon keeps working after the desktop app closes, running configured sync, enrichment, citation work, and eligible Page refresh.
Model choice: Base retrieval stays local; enrichment and synthesis can use on-device Qwen, a local endpoint, or a configured cloud model.
Inspectable ownership: Memories and graph data stay in local libSQL; Markdown, citations, revisions, git history, and Obsidian exports remain inspectable.
Read-only health checks: doctor verifies the runtime; lint finds malformed citations, orphan links, broken embeddings, and search-index or graph integrity problems without rewriting knowledge.
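As one example of what a read-only check can look like, the sketch below scans Page bodies for `[[wikilinks]]` and reports links whose target Page does not exist. It is not the `lint` implementation: the function name, the case-insensitive title matching, and the reading of "orphan link" as "link to a missing Page" are assumptions of this example.

```python
import re

# Captures the target of [[Target]], [[Target|alias]], and [[Target#Section]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def orphan_links(pages):
    """Report wikilinks that point at no known Page, without modifying anything.

    pages: dict of Page title -> Markdown body.
    Returns {title: sorted list of missing targets} for Pages with problems.
    """
    known = {title.strip().lower() for title in pages}
    report = {}
    for title, body in pages.items():
        missing = sorted({
            target.strip()
            for target in WIKILINK.findall(body)
            if target.strip().lower() not in known
        })
        if missing:
            report[title] = missing
    return report
```

Because the function only reads its input and returns a report, it matches the spirit of a health check that finds problems without rewriting knowledge.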

Daily workflow

The system above becomes a small daily loop: start with relevant knowledge, capture what matters while you work, close with a handoff, and let Wenlan refine what should return next time. Each pass leaves the same knowledge base sharper instead of creating another disconnected history.

The loop has four steps:

Find current knowledge. Open a relevant Page, search, or use /recall <query>; /brief [topic] can optionally assemble a broader session-start snapshot. Clients without plugin commands use the equivalent page, search, recall, and context tools.
Capture and find knowledge while you work. /capture <thing> saves a decision, lesson, gotcha, or fact with its source. /recall <query> retrieves only what is relevant instead of loading your whole history.
Close the loop. /handoff records what changed, what remains open, and where the next session should continue.
Keep the wiki current. /distill deliberately creates or refreshes pages. Between sessions, optional model-backed passes can enrich captures, connect related entities, and refresh eligible pages. /lint checks knowledge health; /curate brings proposed revisions and any conflict-review items created by the optional reconcile pass to you.

Models and privacy

Local base retrieval: The BGE embedding model runs through FastEmbed on your machine for hybrid search and needs no API key.
Optional on-device synthesis: Enrichment and Page synthesis can use user-selected Qwen3 4B or Qwen3.5 9B through llama.cpp. Wenlan does not download or activate a language model until you choose one.
Other providers: An OpenAI-compatible local endpoint such as Ollama or LM Studio, or a configured cloud provider, can supply model-backed enrichment and synthesis instead.
Cloud disclosure: If the model endpoint you select is remote, Wenlan sends that task's system and user prompts to it. Local retrieval and on-device synthesis stay on your machine.
No telemetry: Wenlan sends no telemetry.

Full workflow reference: plugin/skills. Technical model roles: technical foundations.

Evaluation

This is a retrieval-only snapshot, not a claim about end-to-end answer quality. Method, environment receipts, and the update workflow live in docs/eval.

| Benchmark | Recall@5 | MRR | NDCG@10 |
| --- | --- | --- | --- |
| LME_Oracle (500 Q) | 93.6% | 0.857 | 0.883 |
| LME_S (deep, 90 Q) | 87.7% | 0.815 | 0.822 |
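For readers unfamiliar with the three column headings, here is a minimal sketch of the common textbook definitions with binary relevance. The exact variants used for the numbers above are documented in docs/eval and may differ, for example in how Recall@5 treats questions with several relevant items; the function names are chosen for this example.

```python
import math


def recall_at_k(ranked, relevant, k=5):
    """Fraction of the relevant ids that appear in the top k results."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant result, or 0.0 if none is returned."""
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked, relevant, k=10):
    """Discounted gain of the top k results, normalized by the ideal ordering."""
    relevant = set(relevant)
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked[:k], start=1)
        if doc_id in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0
```

MRR is the mean of `reciprocal_rank` over all questions in a benchmark, and the other two columns are averaged over questions in the same way.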

Learn more

More detailed documentation, concepts, and comparisons:

Docs

Get started: install and verify the first local loop.
Daily workflow: brief, capture, recall, handoff, distill, lint, and curate.
MCP clients: connect Claude Code, Codex, Cursor, Claude Desktop, and other clients.

Concepts

Why a living wiki, not just AI memory: the problem and product model in depth.
MCP memory server: how Wenlan exposes knowledge across AI tools.
Local-first AI memory: data, privacy, and control.
Markdown and local index: storage, retrieval, and ownership.
AI agent handoff loop: carrying work cleanly into the next session.

Comparisons

Contributing

Bug fixes, eval cases, docs, and features are welcome. Installing Wenlan does not require building from source. For local development, run each group from the root of the named repository:

# 7xuanlu/wenlan — runtime, CLI, and MCP
cargo build --workspace
cargo test --workspace

# 7xuanlu/wenlan-app — desktop app
pnpm install
pnpm tauri dev
pnpm build:all

Use pnpm dev:all in the app repository when you want a fresh daemon-plus-app sequence. See this repository's AGENTS.md and CONTRIBUTING.md, plus wenlan-app's AGENTS.md, for the complete development workflow. Security reports: SECURITY.md. Please also read the Code of Conduct.

License

Wenlan is licensed under Apache-2.0. This includes the local runtime, CLI, MCP server, shared types, and Claude Code/Codex plugin files in this repository.

Lineage and peers

Wenlan (文瀾) takes its name from 文瀾閣, an imperial library that held 四庫全書 as part of one of China's largest book collections.

Wenlan's LLM-wiki v2 model is its own product direction, informed by the LLM-wiki and agent-memory lineages:

Karpathy's LLM-wiki note established the raw-source-to-maintained-wiki pattern.
Rohitg00's LLM Wiki v2 proposal extends that pattern with memory lifecycle, confidence, graph, and retrieval mechanisms. agentmemory is its concrete agent-memory implementation.
nashsu/llm_wiki is a full desktop implementation of the document-centered LLM-wiki pattern.
basic-memory, obsidian-mind, mcp-memory-service, Memoria, and OpenMemory explore adjacent local knowledge and agent-memory shapes.