Cognee-RS
June 26, 2026 · View on GitHub
Cognee-RS — Rust AI Memory
On-device AI memory in Rust. Turn raw text, files, and URLs into a
persistent, queryable memory. Cognee-RS is built to start quickly (350 ms), search quickly (260 ms), and
serve as a drop-in companion to the Python cognee SDK.
How it works
The four-verb memory API (remember / recall / improve / forget)
remember ingests and builds the graph; recall auto-routes retrieval over it.
Quick Start
The fastest way in is the cognee-cli binary and its memory API: remember
what you know, recall when you need it, improve the graph from
feedback, forget what's stale.
Prerequisites
- Rust toolchain — install rustup; the repo's pinned
toolchain (Rust 1.90, declared in
rust-toolchain.toml) is selected automatically. The workspace is edition 2024 / resolver 3; MSRV is 1.89.
- An LLM API key (OpenAI-compatible).
Build the CLI from source with Cargo.
Build
cargo build --release # -> target/release/cognee-cli
The default feature set wires the fully embedded, no-external-service stack: SQLite (relational), Ladybug (graph), and LanceDB (vector).
# put it on your PATH for the snippets below
export PATH="$PWD/target/release:$PATH"
Configure the LLM
A .env file in the working directory is auto-loaded. The only required
setting is the LLM API key:
export LLM_API_KEY="sk-..." # canonical name (OPENAI_TOKEN is an accepted alias)
# optional overrides:
export LLM_MODEL="openai/gpt-5-mini" # the compiled default is openai/gpt-5-mini
export LLM_ENDPOINT="https://..." # alias: OPENAI_URL; empty -> OpenAI's API
To run embeddings fully locally, set EMBEDDING_PROVIDER=onnx (or ollama).
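The canonical-name-plus-alias lookup described above can be sketched roughly as follows. This is an illustration, not cognee's actual code: the helpers `first_non_empty`, `resolve_api_key`, and `resolve_endpoint` are hypothetical, and the rule that the canonical name is tried before its alias is an assumption. Only the variable names (LLM_API_KEY, OPENAI_TOKEN, LLM_ENDPOINT, OPENAI_URL) come from the text.

```rust
use std::collections::HashMap;

/// Return the first non-empty value among `names`, checked in order.
/// Hypothetical helper: the lookup is passed in so the logic can be
/// exercised without touching the real process environment.
fn first_non_empty<F>(lookup: F, names: &[&str]) -> Option<String>
where
    F: Fn(&str) -> Option<String>,
{
    names
        .iter()
        .filter_map(|name| lookup(*name))
        .map(|value| value.trim().to_string())
        .find(|value| !value.is_empty())
}

/// Resolve the LLM API key. Assumption: the canonical name is tried
/// before its alias, and an empty value counts as "not set".
fn resolve_api_key(env: &HashMap<String, String>) -> Option<String> {
    first_non_empty(|k| env.get(k).cloned(), &["LLM_API_KEY", "OPENAI_TOKEN"])
}

/// Resolve the endpoint; `None` stands for "fall back to OpenAI's API".
fn resolve_endpoint(env: &HashMap<String, String>) -> Option<String> {
    first_non_empty(|k| env.get(k).cloned(), &["LLM_ENDPOINT", "OPENAI_URL"])
}
```

Against the real environment, the same helper could be called with `|k| std::env::var(k).ok()` as the lookup. Passing a map instead keeps the sketch easy to test.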
Fully local with Ollama (LLM via Ollama, embeddings local):
ollama serve &
ollama pull llama3.2:3b
export OPENAI_URL=http://localhost:11434/v1
export OPENAI_TOKEN=not-needed # dummy value still required — the LLM client checks for a non-empty key
export OPENAI_MODEL=llama3.2:3b
export EMBEDDING_PROVIDER=ollama # or onnx — otherwise embeddings still call OpenAI
Your first memory
# store, then ask — this is the whole loop
cognee-cli remember "Cognee turns raw data into a queryable knowledge graph."
cognee-cli recall "what does cognee do?"
remember ingests, builds the knowledge graph, and runs a self-improvement pass
(disable with --no-improve). recall auto-routes the search type for you when
--query-type is omitted.
Session memory — scope facts to a transient conversation cache instead of the permanent graph:
cognee-cli remember "we ship Friday" --session-id chat-42
cognee-cli recall "when do we ship?" --session-id chat-42
Forget — clean up (exactly one target is required):
cognee-cli forget --all # everything you own
cognee-cli forget -d main_dataset # one whole dataset
cognee-cli forget --data-id <uuid> --dataset-name main_dataset # one item
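The "exactly one target" rule can be modelled as an enum plus a small validation step. The sketch below is a hypothetical illustration of that rule, not the CLI's real argument parser: `ForgetTarget`, `TargetError`, and `parse_target` are invented names, and the data ID is kept as a plain string to avoid extra dependencies.

```rust
/// The three mutually exclusive deletion targets listed above.
#[derive(Debug, PartialEq)]
enum ForgetTarget {
    All,
    Dataset(String),
    Item { data_id: String, dataset: String },
}

/// Ways the flag combination can be invalid (hypothetical error type).
#[derive(Debug, PartialEq)]
enum TargetError {
    Missing,
    Conflicting,
    DataIdWithoutDataset,
}

/// Map raw flag values to exactly one target, or explain why that failed.
fn parse_target(
    all: bool,
    dataset: Option<&str>,
    data_id: Option<&str>,
) -> Result<ForgetTarget, TargetError> {
    match (all, dataset, data_id) {
        // --all on its own
        (true, None, None) => Ok(ForgetTarget::All),
        // -d <dataset> on its own
        (false, Some(d), None) => Ok(ForgetTarget::Dataset(d.to_string())),
        // --data-id together with --dataset-name
        (false, Some(d), Some(id)) => Ok(ForgetTarget::Item {
            data_id: id.to_string(),
            dataset: d.to_string(),
        }),
        (false, None, Some(_)) => Err(TargetError::DataIdWithoutDataset),
        (false, None, None) => Err(TargetError::Missing),
        // --all combined with anything else
        (true, _, _) => Err(TargetError::Conflicting),
    }
}
```

Encoding the target as an enum means the code that performs the deletion never has to re-check which flags were combined.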
Improve — nudge the graph from feedback:
cognee-cli improve -d main_dataset --node-name "Cognee" --feedback-alpha 0.1
Memory API at a glance
| Command | Purpose | Shared flags |
|---|---|---|
| remember <data…> | add + cognify (+ improve) | -d/--dataset-name, --session-id, --no-improve, --tenant-id |
| recall <query> | smart search (auto-routes) | -d/--datasets, -t/--query-type, -k/--top-k (10), --session-id, -f/--output-format |
| improve | reinforce graph from feedback | -d/--dataset-name, --session-id, --node-name, --feedback-alpha (0.1), --tenant-id |
| forget | delete memory | one of --all / -d / --data-id+--dataset-name, --tenant-id |
Flags are not uniform across subcommands: only
recall/search accept -t/--query-type, -k/--top-k, and -f/--output-format; forget has no -k or --session-id. Mixing them across subcommands fails clap parsing.
CLI config is also persisted at
~/.config/cognee-rust/config.json (via cognee-cli config set). Precedence is defaults < JSON config < env vars: explicit env vars always win, but a stale config.json can override .env-implied defaults.
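The precedence chain (defaults < JSON config < env vars) can be made concrete with a small sketch. The `Layers` type and `resolve` function are hypothetical and only illustrate the ordering; they are not part of the cognee crates, and the model names in the usage note are placeholders.

```rust
/// One setting as seen by each configuration layer.
/// Hypothetical type for illustration only.
struct Layers<'a> {
    default: &'a str,
    json_config: Option<&'a str>,
    env_var: Option<&'a str>,
}

/// Apply "defaults < JSON config < env vars": the highest layer that
/// is actually set wins, and the compiled default is the last resort.
fn resolve<'a>(layers: &Layers<'a>) -> &'a str {
    layers
        .env_var
        .or(layers.json_config)
        .unwrap_or(layers.default)
}
```

With `json_config: Some("old-model")` and no env var, `resolve` returns `"old-model"` rather than the compiled default. That is the stale-config case the note warns about; setting the env var explicitly restores control.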
Using it from Rust
The library crates are published on
crates.io. Depend on the top-level
cognee-lib crate:
cargo add cognee-lib # or add `cognee-lib = "0.1"` under [dependencies]
For local development against the in-repo sources, point the dependency at a
path instead: cognee-lib = { path = "crates/lib" }.
There is a high-level one-call API — cognee_lib::prelude::remember() /
recall() / forget() / improve() — that mirrors the Python functions.
Be aware: these are not self-contained. Each takes a set of pre-built
components (pipelines, LLM, storage, graph DB, vector DB, embedding engine,
session manager, …) — as Arc<dyn …> handles (remember/improve) or borrowed
references to already-wired orchestrators (recall/forget), so you must wire
the component graph first. They are "one call" only after the wiring.
The lowest-friction wiring root is ComponentManager, which lazily builds the
engines from env/Settings:
use cognee_lib::ComponentManager;
use cognee_lib::config::ConfigManager;
let cm = ComponentManager::new(ConfigManager::from_env());
let storage = cm.storage().await?;
let database = cm.database().await?;
let graph_db = cm.graph_db().await?;
let vector_db = cm.vector_db().await?;
let embedding = cm.embedding_engine().await?;
let llm = cm.llm().await?;
From there you build an AddPipeline and a SearchOrchestrator (via
SearchBuilder) and call add(...) / cognify(...) /
orch.search(&SearchRequest{..}). See examples/add_example.rs,
examples/cognify_example.rs, and
crates/bindings-common/src/services.rs (CogneeServices::build) for the
canonical wiring.
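The build-on-first-use behaviour attributed to ComponentManager above can be illustrated with the standard library alone. This is a simplified, synchronous sketch under stated assumptions: `LazyComponents` and `Engine` are hypothetical stand-ins, and the real ComponentManager is async and has a different API.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, OnceLock};

/// Stand-in for an expensive engine (vector DB, LLM client, ...).
struct Engine {
    name: String,
}

/// Hypothetical manager that builds a component on first use and then
/// hands out shared handles to the same instance.
struct LazyComponents {
    vector_db: OnceLock<Arc<Engine>>,
    builds: AtomicUsize, // counts how often the constructor actually ran
}

impl LazyComponents {
    fn new() -> Self {
        Self {
            vector_db: OnceLock::new(),
            builds: AtomicUsize::new(0),
        }
    }

    /// Build the engine on the first call; later calls reuse it.
    fn vector_db(&self) -> Arc<Engine> {
        let engine = self.vector_db.get_or_init(|| {
            self.builds.fetch_add(1, Ordering::SeqCst);
            Arc::new(Engine {
                name: "vector_db".to_string(),
            })
        });
        Arc::clone(engine)
    }
}
```

Calling `vector_db()` twice returns two handles to the same engine and runs the constructor once, which is why wiring through a single manager is cheaper than building components ad hoc.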
Language Bindings
The convenient Cognee class — new(settings) → warm() → remember() /
recall() (or the lower-level add() / cognify() / search()) — is exposed
by the bindings, not the raw Rust crate. warm() resolves owner_id and builds +
caches the component graph once, giving you the wiring-free experience the
pure-Rust path lacks. All three bindings share the same SDK-tier implementation
via crates/bindings-common/, so their surfaces line up 1:1.
| Binding | Install | README | Primary API |
|---|---|---|---|
| JavaScript/TypeScript (Neon) | npm install @cognee/cognee-ts (npm) | ts/README.md | import { Cognee } from '@cognee/cognee-ts' |
| Python (PyO3) | build from source (maturin develop) — not yet on PyPI | python/README.md | from cognee_py import Cognee |
| C API (FFI) | build from source — see README | capi/README.md | #include "cognee_sdk.h" + cg_sdk_* |
Objectives
- Small-model support: run with on-device models (Phi4 class + embeddings).
- 90%+ correctness: retain cognee's core capabilities at a correctness level similar to the Python Cognee SDK (90%+).
- On-device and cloud modes: the transformation tasks and orchestration design support both on-device and cloud operation.
- Multimodal support: the implementation supports multimodal data ingestion.
- Retrieval: ideally 0.6 s on a reasonably sized knowledge base.
Orchestration requirements
- Memory Control: control over the memory used by the ingestion pipeline.
- CPU control: control over threads and parallelization in the ingestion pipeline.
- Autonomous task scheduling: dynamic scheduling of memory-tasks.
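One common way to approach the first two requirements is a fixed worker count (CPU control) fed by a bounded queue (memory control). The sketch below shows that general pattern with the standard library; it is an assumption about how such limits could be enforced, not a description of the cognee pipeline, and `run_bounded` is a hypothetical name. It does not cover dynamic task scheduling.

```rust
use std::sync::mpsc::sync_channel;
use std::sync::{Arc, Mutex};
use std::thread;

/// Process `chunks` with at most `workers` threads, keeping at most
/// `queue_cap` chunks buffered between the producer and the workers.
/// Returns the total number of bytes processed.
fn run_bounded(chunks: Vec<String>, workers: usize, queue_cap: usize) -> usize {
    let (tx, rx) = sync_channel::<String>(queue_cap);
    let rx = Arc::new(Mutex::new(rx));

    // Always start at least one worker so the producer cannot block forever.
    let handles: Vec<_> = (0..workers.max(1))
        .map(|_| {
            let rx = Arc::clone(&rx);
            thread::spawn(move || {
                let mut processed_bytes = 0usize;
                loop {
                    // The lock guard is dropped at the end of this statement,
                    // so the lock is held while receiving, not while working.
                    let next = rx.lock().unwrap().recv();
                    match next {
                        Ok(chunk) => processed_bytes += chunk.len(),
                        Err(_) => break, // channel closed: no more work
                    }
                }
                processed_bytes
            })
        })
        .collect();

    for chunk in chunks {
        // Blocks while the queue is full, which caps buffered memory.
        tx.send(chunk).unwrap();
    }
    drop(tx); // closing the channel lets the workers exit

    handles.into_iter().map(|h| h.join().unwrap()).sum()
}
```

The queue capacity bounds how much pending input is held at once, while the worker count bounds parallelism; both could be exposed as settings.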
Graph Backend Concurrency
For file-backed graph storage, Python's reference implementation documents a default single-owning-process model for SQLite/Ladybug/LanceDB access, while also supporting an opt-in Redis-backed shared Ladybug lock for multi-process coordination. Rust currently matches that default model: Ladybug writes are idempotent and serialized in-process, but cross-process locking is intentionally out of scope.
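What "idempotent and serialized in-process" means in practice can be shown with a toy store. This is a minimal sketch, assuming a single mutex and keyed upserts; `GraphStore`, `upsert_node`, and `replay_from_threads` are hypothetical names and say nothing about how the Ladybug adapter is actually implemented.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Toy in-process graph store: one mutex serializes all writers, and
/// writes are keyed upserts, so replaying the same write changes nothing.
#[derive(Default)]
struct GraphStore {
    nodes: Mutex<HashMap<String, String>>,
}

impl GraphStore {
    /// Idempotent upsert: returns true only if the stored state changed.
    fn upsert_node(&self, id: &str, label: &str) -> bool {
        let mut nodes = self.nodes.lock().unwrap();
        if nodes.get(id).map(|existing| existing.as_str()) == Some(label) {
            return false; // same write seen before: nothing to do
        }
        nodes.insert(id.to_string(), label.to_string());
        true
    }

    fn len(&self) -> usize {
        self.nodes.lock().unwrap().len()
    }
}

/// Replay the same write from several threads of one process and
/// count how many of them actually changed the store.
fn replay_from_threads(store: Arc<GraphStore>, writers: usize) -> usize {
    let handles: Vec<_> = (0..writers)
        .map(|_| {
            let store = Arc::clone(&store);
            thread::spawn(move || store.upsert_node("cognee", "Project"))
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().unwrap())
        .filter(|changed| *changed)
        .count()
}
```

A mutex like this only coordinates threads inside one process. Two separate processes opening the same files would each hold their own lock, which is the gap the opt-in shared lock in the Python implementation addresses.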
Running Tests
cargo test --workspace
For local full-suite execution (including LLM-dependent and ONNX/tokenizer-dependent tests), use:
# Run the whole workspace (downloads embedding models if missing,
# single-threaded for LLM isolation):
bash scripts/run_tests_with_openai.sh
# Or a single test by name:
bash scripts/run_tests_with_openai.sh test_fact_extraction
Observability
Cognee emits OpenTelemetry traces from every pipeline stage. To export them to an OTLP collector:
cargo build --release --features telemetry
OTEL_EXPORTER_OTLP_ENDPOINT=https://otlp.your-collector:4317 \
cognee-cli search --query "what did we ingest yesterday?"
See docs/observability/opentelemetry.md
for the full guide (env vars, recipes for Grafana Tempo, Honeycomb, Dash0,
and in-cluster Collectors).
Logging
Cognee writes structured logs to stdout and (when a writable directory is
available) to a rotating file, owned by the cognee-logging
crate (cognee_logging::init_logging, called by the CLI and HTTP server). The
full env-var table (COGNEE_LOG_*, RUST_LOG/LOG_LEVEL, LOG_FILE_NAME) is
documented in Configuration → Logging.