README.md

July 1, 2026

SpineDigest

npm version License: Apache 2.0 Node >=22.12.0

SpineDigest Terminal Demo

SpineDigest is a knowledge-base CLI optimized for AI agents. It imports EPUB, Markdown, and plain text into a .wikg archive, can use LLMs to extract knowledge graphs and summaries, and then exposes the archive as an LLM Wiki that you can search, browse, read, trace back to sources, navigate as a graph, and pack into context.

It is not a one-shot book-to-summary converter. Summaries, EPUB, Markdown, and JSON output are projections of the .wikg knowledge archive. The primary object is .wikg itself: a portable knowledge archive that can be built, maintained, searched, and reused.

There are three main ways to explore a .wikg archive:

  • Search mode: use search to discover URI-addressable source, summary, chunk, entity, and triple objects.
  • Structure mode: use wkg://.../chapter/tree get --json for the table-of-contents hierarchy, then list or scoped search to inspect local object collections.
  • Reading mode: use get on source, chapter, summary, chunk, entity, or triple URIs after selecting the relevant object.

Together, these modes let long documents behave like navigable knowledge bases: start with structure, locate relevant content, then return to source text and knowledge nodes for deeper reading.

Inkora screenshot

Inkora opening a .wikg file

Install

Requirements:

  • Node >=22.12.0
  • For LLM-backed Reading Graph, Reading Summary, or Knowledge Graph jobs: a supported LLM provider plus credentials
  • For .wikg search, reading, navigation, and export: no LLM access required

Try it without a global install:

npx spinedigest --help

Global install:

npm install -g spinedigest

To explore the CLI surface first, start with:

wikigraph --help
wikigraph help overview
wikigraph help ai

Quick Start

SpineDigest's primary object is .wikg: a CLI-managed knowledge-base archive, not a one-off export result.

Create a knowledge base from source material:

wikigraph wkg://book.wikg create ./book.epub
cat ./article.md | wikigraph wkg://article.wikg create --input-format markdown

Inspect and estimate before expensive work:

wikigraph wkg://book.wikg/state get
wikigraph wkg://book.wikg/chapter/tree get --json
wikigraph wkg://book.wikg estimate --stage reading-summary

Build derived knowledge when you intend to spend LLM time:

wikigraph wkg://book.wikg/chapter/12 queue add --task reading-graph --accept-cost
wikigraph wkg-job://<job-id> watch --jsonl

Search, browse, and read through the knowledge-base interface:

wikigraph wkg://book.wikg/chapter/tree get --json
wikigraph wkg://book.wikg/chunk search "RAG"
wikigraph wkg://book.wikg/chapter/12/source search "exact source phrase"
wikigraph wkg://book.wikg/chapter/12 get
wikigraph wkg://book.wikg/chunk/84 get
wikigraph wkg://book.wikg/chunk/84 related
wikigraph wkg://book.wikg/chunk/84 evidence
wikigraph wkg://book.wikg/chunk/84 pack --budget 5000
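
The chunk commands above can be grouped into one helper when you want the text, neighbors, evidence, and context pack for a single chunk in one pass. This is only a sketch: the function name chunk_context and the "##" separator lines are hypothetical, and it uses only the get, related, evidence, and pack --budget invocations shown above. Nothing here assumes a particular output format from wikigraph.

```shell
# Hypothetical helper: print everything the CLI exposes for one chunk.
# Arguments: archive file name, chunk id, pack budget (all chosen by the caller).
chunk_context() {
  cc_archive=$1
  cc_chunk=$2
  cc_budget=$3
  cc_uri="wkg://$cc_archive/chunk/$cc_chunk"

  # Read the chunk, then its related objects, then its source evidence.
  for cc_action in get related evidence; do
    printf '## %s %s\n' "$cc_uri" "$cc_action"
    wikigraph "$cc_uri" "$cc_action" || return 1
  done

  # Finish with a context pack bounded by the requested budget.
  printf '## %s pack --budget %s\n' "$cc_uri" "$cc_budget"
  wikigraph "$cc_uri" pack --budget "$cc_budget"
}
```

For example, chunk_context book.wikg 84 5000 runs the same four chunk commands listed above, in order, and stops at the first one that fails.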

Output a projection only when you need a portable view. For example, read one chapter into Markdown text, or export the full archive as an EPUB:

wikigraph wkg://book.wikg/chapter/12/source get > ./chapter-12.md
wikigraph wkg://book.wikg export --output-format epub --output ./digest.epub

Cost rule:

Create is cheap.
Estimate before queueing Reading Graph, Reading Summary, or Knowledge Graph jobs.
Queue Reading Graph, Reading Summary, or Knowledge Graph jobs only when the cost and wait time are acceptable.
Search, get, related, evidence, pack, and export are cheap after build.
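
One way to enforce the cost rule in a script is to gate every queue call behind an estimate and an explicit confirmation. The sketch below is an assumption-laden example, not part of SpineDigest: the function name estimate_then_queue and the prompt wording are hypothetical, and the stage and task names are passed in by the caller because this README does not say whether every name is valid for both --stage and --task.

```shell
# Hypothetical guard: show the estimate first, queue only after a "y".
# Arguments: archive URI, target URI, stage name, task name.
estimate_then_queue() {
  eq_archive=$1
  eq_target=$2
  eq_stage=$3
  eq_task=$4

  # Cheap step: print the estimate so the cost is visible before queueing.
  wikigraph "$eq_archive" estimate --stage "$eq_stage" || return 1

  printf 'Queue %s for %s? [y/N] ' "$eq_task" "$eq_target" >&2
  read -r eq_answer
  case $eq_answer in
    y|Y)
      # Expensive step: only reached after explicit confirmation.
      wikigraph "$eq_target" queue add --task "$eq_task" --accept-cost
      ;;
    *)
      echo 'Skipped: nothing queued.' >&2
      return 2
      ;;
  esac
}
```

A call such as estimate_then_queue wkg://book.wikg wkg://book.wikg/chapter/12 reading-summary reading-graph mirrors the estimate and queue commands from the Quick Start, with a pause between them.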

Full flag reference: CLI Reference.

Why We Built This

Knowledge bases are useful for long documents because they turn material into a structure you can re-enter: inspect the table of contents, find concepts, and return to evidence instead of stuffing everything into one context window. The problem is that knowledge bases usually require people to define page boundaries, concept relationships, and source references by hand. Books are the most familiar long documents; if we can turn a whole book into a Wiki, then any EPUB, Markdown, or plain-text source can enter the same knowledge-base workflow.

That is why SpineDigest started with the problem of whole books. People often say an LLM cannot really read a whole book because the context window is not long enough. But human short-term memory holds only 7 ± 2 items (Miller's Law), far less than any modern LLM context window. Humans still read whole books, move back and forth with questions, build structures in their heads, and answer from those structures.

The bottleneck is not just window size. It is how working memory is organized.

If you put a whole book directly into context, what you get is a very long text stream. It can be summarized on the fly, searched by keyword, or sliced into excerpts, but it is hard to answer stable structural questions: which concepts belong together, where a claim came from, how two chapters relate, and which source passages support a knowledge point. Longer context does not make those problems disappear. It makes structure more necessary.

SpineDigest's goal is to turn long documents into external working memory.

First, an LLM reads the source text section by section, simulating how human attention is drawn to important ideas. It extracts a set of chunks. A chunk is not the final summary; it is an attention landing point, an independent knowledge unit that can be cited, traced, and recombined later.

Next, a classical algorithm takes over. It builds a knowledge graph with chunks as nodes, connects them by conceptual relevance, then uses graph traversal and community detection to cluster semantically related chunks. Each cluster is serialized in original reading order into what we call a snake: a knowledge chain that moves through the source text and links dispersed but related ideas.

Finally, the LLM returns to work on that structure. The old use case compressed those structures into a summary; the more important use now is to save them into .wikg. Later, you can use it like a Wiki: open chapter and chunk objects, trace source evidence, follow related objects, and pack an evidence-bounded context before answering.
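
To make the clustering step concrete, here is a toy sketch of the "cluster, then serialize in reading order" idea. It is not SpineDigest's algorithm. It swaps community detection for plain connected components (union-find), and it assumes a hypothetical input format: one line per edge holding two integer chunk ids, where a smaller id means earlier in the source, plus single-id lines for isolated chunks.

```shell
# Toy stand-in for the graph step: connected components, not real
# community detection. Reads "idA idB" edge lines (or a lone id) on stdin
# and prints one snake per line with chunk ids in ascending reading order.
snakes_from_edges() {
  awk '
    function find(x) {
      while (parent[x] != x) {
        parent[x] = parent[parent[x]]   # path compression
        x = parent[x]
      }
      return x
    }
    function add(x) { if (!(x in parent)) parent[x] = x }
    NF >= 1 { u = $1 + 0; add(u) }
    NF >= 2 {
      v = $2 + 0; add(v)
      a = find(u); b = find(v)
      # Keep the smallest id as root so each snake is keyed by its first chunk.
      if (a < b) parent[b] = a
      else if (b < a) parent[a] = b
    }
    END { for (x in parent) print find(x), x }
  ' | sort -k1,1n -k2,2n | awk '
    NR == 1 || $1 != root {
      if (NR > 1) printf "\n"
      root = $1
      printf "%s", $2
      next
    }
    { printf " %s", $2 }
    END { if (NR > 0) printf "\n" }
  '
}
```

Feeding it the edges 1-2, 2-5, and 3-4 plus an isolated chunk 6 yields three snakes: "1 2 5", "3 4", and "6". A real community-detection pass would also split loosely connected groups, which connected components cannot do.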

Every professor holds a snake.

Picture a dissertation defense. The respondent stands at the front. The professors sit around the table. Each professor holds one knowledge chain and keeps reminding the respondent: this has evidence, that has a relationship, and this concept should not be mixed with that one. In the old story, the endpoint was a fairer summary. Now, the endpoint is a reference room you can enter again and again. You do not need to remember the whole book at once; you can call the relevant professors back, follow their chains to the evidence, and then compose your answer.

SpineDigest architecture

Your intent still runs through the whole process. During build, the prompt influences which knowledge units receive attention. During retrieval, the task decides whether to inspect structure first, search keywords first, or read source fragments first. The same .wikg can serve different questions: a timeline today, a concept map tomorrow, a writing context pack later. The knowledge base is not a one-shot answer. It is an interface for repeated reading, locating, and reuse.

The .wikg Format

.wikg is the core SpineDigest knowledge-base archive. It holds source-derived chapter pages, graph nodes, evidence pointers, summaries, and metadata, then exposes them through the CLI as an LLM Wiki.

With that archive on hand, you can search and navigate the knowledge structure directly:

wikigraph wkg://book.wikg/chapter/tree get --json
wikigraph wkg://book.wikg/chapter/12/chunk list
wikigraph wkg://book.wikg/chunk search "central argument"
wikigraph wkg://book.wikg/chapter/12 get
wikigraph wkg://book.wikg/chapter/12/source get

Markdown, EPUB, txt, and JSON-style outputs are projections of the archive. They are useful for portability and reading, but they do not replace the .wikg object when graph links and source fragments matter.

To open a .wikg file, use Inkora. It is a free app built specifically for .wikg, with chapter topology and knowledge graph views.

The internal layout and parser guidance live in the format spec.

Direct Transform

If you only need a one-shot digest or format conversion, use transform. It does not leave a reusable .wikg knowledge base unless you explicitly choose --output-format wikg.

cat chapter.txt | wikigraph transform --input-format txt --output-format markdown
wikigraph transform --input book.epub --output digest.md --output-format markdown

This mode is for pure conversion tasks. If the material will later be searched, navigated, traced to evidence, or built further, create a .wikg archive first.
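
For pure conversion over many files, the file-based transform call above can be wrapped in a loop. This is a minimal sketch under stated assumptions: the function name transform_epubs and the directory layout are hypothetical, and the only wikigraph invocation used is the --input, --output, --output-format markdown form shown above.

```shell
# Hypothetical batch helper: one Markdown digest per EPUB in a directory.
# Arguments: source directory, output directory.
transform_epubs() {
  te_src=$1
  te_out=$2
  mkdir -p "$te_out" || return 1

  for te_book in "$te_src"/*.epub; do
    # When nothing matches, the glob stays literal; skip it.
    [ -e "$te_book" ] || continue
    te_name=$(basename "$te_book" .epub)
    wikigraph transform --input "$te_book" \
      --output "$te_out/$te_name.md" --output-format markdown || return 1
  done
}
```

Each run leaves only the Markdown files behind. If any of those books will be searched or traced to evidence later, creating a .wikg archive for it is the better starting point.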

Library Usage

SpineDigest also exposes a programmatic API for embedding lower-level import, build, and export flows in your own Node or TypeScript code. The CLI is still the most complete knowledge-base interface. See Library Usage for non-CLI integration.

Related Projects

  • PDF Craft: If your source material is a scanned PDF, PDF Craft can convert it into EPUB or Markdown before you import it into a SpineDigest knowledge base.
  • EPUB Translator: If your goal is bilingual reading rather than building a knowledge base, EPUB Translator turns an EPUB into a bilingual edition while preserving the original layout.

For AI Agents

SpineDigest's CLI-first design exposes .wikg as a managed LLM Wiki archive.

  • Treat .wikg as the primary object. Use archive commands before unpacking or inspecting internals.
  • Choose an exploration mode first. For synthesis and structural understanding, start with wkg://.../chapter/tree get --json; use search for candidate discovery and exact wording; use get for continuous prose after selecting the relevant URI.
  • Use help as the discovery surface. Start with wikigraph --help as the root page, then follow wikigraph help overview, wikigraph help ai, topic pages, or command-specific --help before guessing behavior.
  • Prefer --json. Use it when composing with tools.
  • Estimate before queueing jobs. Do not queue broad Reading Graph, Reading Summary, or Knowledge Graph work without wikigraph <archive-uri> estimate.
  • Check exit codes. Success returns 0; failure returns non-zero with a plain-text error on stderr.
  • Do not inspect database.db routinely. Use search, list, get, and graph navigation commands instead.
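
The exit-code rule above can be applied with a small wrapper that passes stdout through untouched and reports stderr only on failure. The helper below is a generic sketch: the name run_checked and its message format are hypothetical, and it relies only on the documented contract that success returns 0 and failure returns non-zero with a plain-text error on stderr.

```shell
# Hypothetical wrapper: run any command, keep stdout clean for piping,
# and surface the captured stderr text only when the exit code is non-zero.
run_checked() {
  rc_err=$(mktemp) || return 1
  "$@" 2>"$rc_err"
  rc_status=$?
  if [ "$rc_status" -ne 0 ]; then
    printf 'command failed (exit %s): %s\n' "$rc_status" "$(cat "$rc_err")" >&2
  fi
  rm -f "$rc_err"
  return "$rc_status"
}
```

For example, run_checked wikigraph wkg://book.wikg/chapter/tree get --json prints the tree on success, while a failure leaves stdout empty and returns the original exit code to the caller.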

Useful help entry points:

wikigraph help overview
wikigraph help ai
wikigraph help task
wikigraph help config
wikigraph help env
wikigraph help config-file
wikigraph help command

Full agent guidance: AI Agent Guide.