README.md
June 26, 2026
Cognee - The Open-Source AI Memory Platform for Agents
Demo · Docs · Learn More · Join Discord · Join r/AIMemory · Community Plugins & Add-ons
Cognee is the open-source AI memory platform that gives AI agents persistent long-term memory across sessions. Ingest data in any format, build a self-hosted knowledge graph, and let every agent recall, connect, and act with full context.
🌐 This README is also available in: Deutsch | Español | Français | 日本語 | 한국어 | Português | Русский | 中文
📄 Read the research paper: Optimizing the Interface Between Knowledge Graphs and LLMs for Complex Reasoning — Markovic et al., 2025
About Cognee
Cognee is an open-source AI memory platform for AI Agents. Ingest data in any format, and Cognee continuously builds a self-hosted knowledge graph that gives your agents persistent long-term memory across sessions. Cognee combines vector embeddings, graph reasoning, and cognitive-science-grounded ontology generation to make documents both searchable by meaning and connected by relationships that evolve as your knowledge does.
:star: Help us reach more developers and grow the cognee community. Star this repo!
:books: Check our detailed documentation for setup and configuration.
:crab: Available as a plugin for OpenClaw: cognee-openclaw
✴️ Available as a plugin for Claude Code: claude-code-plugin
🦀 Available as a Rust client: cognee-rs
🟦 Available as a TypeScript client: @cognee/cognee-ts
Why use Cognee:
- Easily Build a Company Brain - unify data from various sources in one place and equip agents with your domain knowledge
- Knowledge Infrastructure - unified ingestion, graph/vector search, local execution, ontology grounding, multimodal support
- Persistent and Learning Agents - learning from feedback, context management, cross-agent knowledge sharing
- Reliable and Trustworthy Agents - agentic user/tenant isolation, traceability, OTEL collector, audit trails
How it Works
Basic Usage & Feature Guide
To learn more, check out this short, end-to-end Colab walkthrough of Cognee's core features.
Quickstart
Let’s try Cognee in just a few lines of code.
Prerequisites
- Python 3.10 to 3.14
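A quick way to confirm that your interpreter falls inside this range before installing is sketched below. The helper name is_supported_python is hypothetical and not part of Cognee; the bounds are simply the ones listed above.

```python
import sys

# Supported range taken from the prerequisite above: Python 3.10 to 3.14.
MIN_VERSION = (3, 10)
MAX_VERSION = (3, 14)


def is_supported_python(version=None):
    """Return True if (major, minor) falls inside the supported range.

    Hypothetical helper for illustration; not part of Cognee.
    """
    major, minor = (version or sys.version_info)[:2]
    return MIN_VERSION <= (major, minor) <= MAX_VERSION


if __name__ == "__main__":
    current = ".".join(map(str, sys.version_info[:2]))
    status = "supported" if is_supported_python() else "outside the supported range"
    print(f"Python {current} is {status}")
```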
Step 1: Install Cognee
You can install Cognee with pip, poetry, uv, or your preferred Python package manager.
uv pip install cognee
Step 2: Configure the LLM
import os
os.environ["LLM_API_KEY"] = "YOUR_OPENAI_API_KEY"
Alternatively, create a .env file using our template.
To integrate other LLM providers, see our LLM Provider Documentation.
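If you take the .env route and want to see what would end up in the environment, the sketch below parses simple KEY=VALUE lines with the standard library. It is an illustration only: parse_env_file and demo are hypothetical helpers, they handle just plain and quoted values, and Cognee's own .env loading may behave differently.

```python
import os
import tempfile


def parse_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments.

    Hypothetical helper for illustration; not part of Cognee.
    """
    values = {}
    with open(path, encoding="utf-8") as handle:
        for raw_line in handle:
            line = raw_line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            value = value.strip()
            # Drop one pair of matching surrounding quotes, if present.
            if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
                value = value[1:-1]
            values[key.strip()] = value
    return values


def demo():
    """Write a throwaway .env file and parse it back into a dict."""
    with tempfile.TemporaryDirectory() as tmp:
        env_path = os.path.join(tmp, ".env")
        with open(env_path, "w", encoding="utf-8") as handle:
            handle.write('# placeholder key\nLLM_API_KEY="YOUR_OPENAI_API_KEY"\n')
        return parse_env_file(env_path)


if __name__ == "__main__":
    loaded = demo()
    # setdefault keeps any variable that is already exported in the shell.
    for name, value in loaded.items():
        os.environ.setdefault(name, value)
    print(sorted(loaded))
```

Using setdefault means a key exported in your shell takes precedence over the file, which is usually what you want when switching between providers.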
Step 3: Run the Pipeline
Cognee's API gives you four operations — remember, recall, forget, and improve:
import asyncio

import cognee


async def main():
    # Store permanently in the knowledge graph (runs add + cognify + improve)
    await cognee.remember("Cognee turns documents into AI memory.")

    # Store in session memory (fast cache, syncs to graph in background)
    await cognee.remember("User prefers detailed explanations.", session_id="chat_1")

    # Query with auto-routing (picks best search strategy automatically)
    results = await cognee.recall("What does Cognee do?")
    for result in results:
        print(result)

    # Query session memory first, fall through to graph if needed
    results = await cognee.recall("What does the user prefer?", session_id="chat_1")
    for result in results:
        print(result)

    # Delete when done
    await cognee.forget(dataset="main_dataset")


if __name__ == "__main__":
    asyncio.run(main())
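The session-first lookup mentioned in the comments above can be pictured with a toy in-memory model. This is not how Cognee is implemented: ToyMemory, its keyword-overlap matching, and the two Python containers are stand-ins invented for illustration. Only the remember/recall/session_id names mirror the Quickstart.

```python
import re


def _tokens(text):
    """Lowercase word set used for naive keyword matching."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class ToyMemory:
    """Hypothetical stand-in: a session cache in front of a permanent store."""

    def __init__(self):
        self.permanent = []  # plays the role of the knowledge graph
        self.sessions = {}   # session_id -> list of remembered texts

    def remember(self, text, session_id=None):
        if session_id is None:
            self.permanent.append(text)
        else:
            self.sessions.setdefault(session_id, []).append(text)

    @staticmethod
    def _search(query, texts):
        wanted = _tokens(query)
        return [text for text in texts if wanted & _tokens(text)]

    def recall(self, query, session_id=None):
        # Try the session cache first, then fall through to the permanent store.
        if session_id is not None:
            hits = self._search(query, self.sessions.get(session_id, []))
            if hits:
                return hits
        return self._search(query, self.permanent)


memory = ToyMemory()
memory.remember("Cognee turns documents into AI memory.")
memory.remember("User prefers detailed explanations.", session_id="chat_1")

# Answered from the session cache.
session_hits = memory.recall("What does the user prefer?", session_id="chat_1")
# Nothing in the session matches, so the permanent store answers.
fallback_hits = memory.recall("What is Cognee?", session_id="chat_1")
```

The real system replaces the keyword overlap with graph and vector search, but the control flow (session first, permanent store second) is the part this sketch is meant to show.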
Use the Cognee CLI
cognee-cli remember "Cognee turns documents into AI memory."
cognee-cli recall "What does Cognee do?"
cognee-cli forget --all
To open the local UI, run:
cognee-cli -ui
Note: The MCP server launched by cognee-cli -ui runs inside a Docker container. Docker Desktop, Colima, or any OCI-compatible runtime with a working docker CLI is required. See Docker & Colima Setup for details.
Run with Docker
Prefer containers? Cognee publishes prebuilt images to Docker Hub on every push to main: cognee/cognee (the API server) and cognee/cognee-mcp (the MCP server).
Option A — Docker Compose (build from source)
Clone the repo, create a .env with at least LLM_API_KEY, then:
cp .env.template .env # then edit .env and set LLM_API_KEY
# Start the API server (http://localhost:8000)
docker compose up
# Optional profiles (combine as needed):
docker compose --profile ui up # + frontend on http://localhost:3000
docker compose --profile mcp up # + MCP server on http://localhost:8001
docker compose --profile postgres up # + Postgres/PGVector
docker compose --profile neo4j up # + Neo4j
The cognee and cognee-mcp services publish different host ports (8000 vs 8001), so you can run both at once.
Option B — Pull the prebuilt image (no clone required)
# Create a minimal .env in the current directory
echo 'LLM_API_KEY="YOUR_OPENAI_API_KEY"' > .env
# API server
docker run --env-file ./.env -p 8000:8000 --rm -it cognee/cognee:main
# MCP server (HTTP transport)
docker pull cognee/cognee-mcp:main
docker run -e TRANSPORT_MODE=http --env-file ./.env -p 8000:8000 --rm -it cognee/cognee-mcp:main
See the MCP server README for SSE/stdio transports, optional extras, and MCP client configuration.
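Both docker run commands above publish host port 8000, so if you want the two containers up at the same time, one of them needs a different host port. The sketch below builds the same argument lists from Python and moves the MCP server to host port 8001. The helper docker_run_args is hypothetical, and 8001 is an assumption that simply mirrors the Compose setup; the flags themselves are the ones used above.

```python
import shlex


def docker_run_args(image, host_port, container_port=8000, env_file="./.env", extra_env=None):
    """Build a `docker run` argument list (hypothetical helper, not part of Cognee)."""
    args = ["docker", "run"]
    for name, value in (extra_env or {}).items():
        args += ["-e", f"{name}={value}"]
    args += ["--env-file", env_file, "-p", f"{host_port}:{container_port}"]
    args += ["--rm", "-it", image]
    return args


api_cmd = docker_run_args("cognee/cognee:main", host_port=8000)
mcp_cmd = docker_run_args(
    "cognee/cognee-mcp:main",
    host_port=8001,  # assumption: any free host port works; 8001 mirrors Compose
    extra_env={"TRANSPORT_MODE": "http"},
)

if __name__ == "__main__":
    # Print shell-ready commands; pass the lists to subprocess.run to execute them.
    print(shlex.join(api_cmd))
    print(shlex.join(mcp_cmd))
```

Returning a list rather than a string keeps the arguments safe to hand to subprocess.run without shell quoting issues.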
Use with AI Agents
Claude Code
Install the Cognee memory plugin to give Claude Code persistent memory across sessions. The plugin captures prompts, tool traces, and assistant responses into session memory, injects relevant context on every prompt, and syncs session memory into the permanent knowledge graph at session end.
Install from the Claude Code marketplace. The recommended way is from your shell, before launching Claude Code, so the first claude launch is a clean session that bootstraps memory automatically:
# Add the marketplace and install the plugin (one-time, user-scoped)
claude plugin marketplace add topoteretes/cognee-integrations
claude plugin install cognee-memory@cognee
# Set env vars for your mode (see below), then launch
export LLM_API_KEY="sk-..." # local mode; or COGNEE_BASE_URL + COGNEE_API_KEY for cloud
claude
Local mode (default) — the plugin bootstraps a local Cognee API at http://localhost:8011. Only LLM_API_KEY is required; the Cognee API key is auto-minted if absent:
export LLM_API_KEY="sk-..."
Cognee Cloud or a remote server — set both:
export COGNEE_BASE_URL="https://your-instance.cognee.ai"
export COGNEE_API_KEY="ck_..."
On startup you should see a "Cognee Memory Connected" system message.
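The two modes above come down to a small decision on environment variables. The function below is a sketch of that rule as described here, not the plugin's actual code: select_mode is a hypothetical name, and treating the presence of COGNEE_BASE_URL as the switch to cloud mode is an assumption.

```python
import os


def select_mode(env=None):
    """Return "cloud" or "local" based on the variables described above.

    Hypothetical helper; the plugin's real selection logic may differ.
    """
    env = os.environ if env is None else env

    if env.get("COGNEE_BASE_URL"):
        # The text says to set both variables for cloud or remote use.
        if not env.get("COGNEE_API_KEY"):
            raise ValueError("COGNEE_BASE_URL is set but COGNEE_API_KEY is missing")
        return "cloud"

    if env.get("LLM_API_KEY"):
        # Local mode: only the LLM key is required.
        return "local"

    raise ValueError("Set LLM_API_KEY (local) or COGNEE_BASE_URL + COGNEE_API_KEY (cloud)")


if __name__ == "__main__":
    try:
        print(f"Cognee plugin mode: {select_mode()}")
    except ValueError as error:
        print(f"Configuration problem: {error}")
```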
The plugin hooks into Claude Code's lifecycle — SessionStart selects mode and sets up identity, UserPromptSubmit injects dataset-scoped context, PostToolUse captures tool traces, Stop writes the assistant's answer, PreCompact preserves memory across context resets, and SessionEnd triggers the final sync into the permanent graph.
See the plugin README for sessions, datasets, and full configuration.
Connect to Cognee Cloud
Point any Python agent at a managed Cognee instance — all SDK calls route to the cloud:
import asyncio
import cognee

async def main():
    await cognee.serve(url="https://your-instance.cognee.ai", api_key="ck_...")
    await cognee.remember("important context")
    results = await cognee.recall("what happened?")
    await cognee.disconnect()

asyncio.run(main())
Examples
Browse more examples in the examples/ folder — demos, guides, custom pipelines, and database configurations.
Use Case 1 — Customer Support Agent
Goal: Resolve customer issues using their personal data across finance, support, and product history.
User: "My invoice looks wrong and the issue is still not resolved."
Cognee tracks: past interactions, failed actions, resolved cases, product history
# Agent response:
Agent: "I found 2 similar billing cases resolved last month.
The issue was caused by a sync delay between payment
and invoice systems — a fix was applied on your account."
# What happens under the hood:
- Unifies data sources from various company channels
- Reconstructs the interaction timeline and tracks outcomes
- Retrieves similar resolved cases
- Maps to the best resolution strategy
- Updates memory after execution so the agent never repeats the same mistake
Use Case 2 — Expert Knowledge Distillation (SQL Copilot)
Goal: Help junior analysts solve tasks by reusing expert-level queries, patterns, and reasoning.
User: "How do I calculate customer retention for this dataset?"
Cognee tracks: expert SQL queries, workflow patterns, schema structures, successful implementations
# Agent response:
Agent: "Here's how senior analysts solved a similar retention query.
Cognee matched your schema to a known structure and adapted
the expert's logic to fit your dataset."
# What happens under the hood:
- Extracts and stores patterns from expert SQL queries and workflows
- Maps the current schema to previously seen structures
- Retrieves similar tasks and their successful implementations
- Adapts expert reasoning to the current context
- Updates memory with new successful patterns so junior analysts perform at near-expert level
Run the Whole Memory Layer on Postgres
Graph memory traditionally means operating a stack — a graph database for relationships, a vector database for embeddings, Redis for sessions, and a relational database for metadata — all deployed, secured, and paid for before an agent remembers anything. In cognee 1.0 you can run the entire memory layer on a single Postgres instance.
| Memory layer | Traditional stack | cognee on Postgres |
|---|---|---|
| Relationships | Neo4j or another graph database | cognee's Postgres graph backend |
| Embeddings | Dedicated vector database | pgvector |
| Sessions | Redis | SQL session-cache backend |
| Metadata | Relational database | same Postgres |
The graph still exists — it just lives inside the same Postgres-backed memory layer as the text, metadata, and embeddings, so retrieval moves between similarity and structure without crossing service boundaries. In our CI benchmarks, Postgres search ran ~10% faster than the separate graph-plus-vector setup.
Postgres is the default we recommend for most deployments, but you can still swap in dedicated backends when a workload needs them (Neo4j and Neptune for graphs, Redis for sessions, pgvector and LanceDB for vectors, plus Qdrant, ChromaDB, Weaviate, and Milvus via community adapters). Local development stays fully embedded — SQLite, LanceDB, and Kuzudb — with no extra services to stand up.
pip install "cognee[postgres]"
DB_PROVIDER=postgres
VECTOR_DB_PROVIDER=pgvector
GRAPH_DATABASE_PROVIDER=postgres
CACHE_BACKEND=postgres
DB_HOST=localhost
DB_PORT=5432
DB_USERNAME=cognee
DB_PASSWORD=cognee
DB_NAME=cognee_db
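To sanity-check these settings outside Cognee (for example with psql or another client), you can assemble a standard PostgreSQL connection URI from the same variables. How Cognee builds its own connection string is not covered here; postgres_uri is a hypothetical helper and its defaults are just the sample values above.

```python
import os
from urllib.parse import quote


def postgres_uri(env=None):
    """Build a postgresql:// URI from the DB_* variables (hypothetical helper)."""
    env = os.environ if env is None else env
    # Percent-encode credentials so characters like "@" or "/" stay valid in a URI.
    user = quote(env.get("DB_USERNAME", "cognee"), safe="")
    password = quote(env.get("DB_PASSWORD", "cognee"), safe="")
    host = env.get("DB_HOST", "localhost")
    port = env.get("DB_PORT", "5432")
    name = quote(env.get("DB_NAME", "cognee_db"), safe="")
    return f"postgresql://{user}:{password}@{host}:{port}/{name}"


sample = {
    "DB_HOST": "localhost",
    "DB_PORT": "5432",
    "DB_USERNAME": "cognee",
    "DB_PASSWORD": "cognee",
    "DB_NAME": "cognee_db",
}
print(postgres_uri(sample))  # postgresql://cognee:cognee@localhost:5432/cognee_db
```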
Deploy Cognee
Use Cognee Cloud for a fully managed experience, or self-host with one of the 1-click deployment configurations below.
| Platform | Best For | Command |
|---|---|---|
| Cognee Cloud | Managed service, no infrastructure to maintain | Sign up or await cognee.serve() |
| Modal | Serverless, auto-scaling, GPU workloads | bash distributed/deploy/modal-deploy.sh |
| Railway | Simplest PaaS, native Postgres | railway init && railway up |
| Fly.io | Edge deployment, persistent volumes | bash distributed/deploy/fly-deploy.sh |
| Render | Simple PaaS with managed Postgres | Deploy to Render button |
| Daytona | Cloud sandboxes (SDK or CLI) | See distributed/deploy/daytona_sandbox.py |
See the distributed/ folder for deploy scripts, worker configurations, and additional details.
Use Cognee in Other Languages
Prefer something other than Python? Cognee also ships official clients for Rust and TypeScript.
Getting Started with Rust
Use the cognee-rs crate to add, cognify, and search from Rust.
cargo add cognee
See the cognee-rs repository for full setup and examples.
Getting Started with TypeScript
Use the @cognee/cognee-ts package to add, cognify, and search from Node.js or the browser.
npm install @cognee/cognee-ts
See the @cognee/cognee-ts package for full setup and examples.
Benchmarks
We ran cognee against BEAM, a long-context benchmark that tests whether a system can keep track of a long conversation as it changes — a more useful test for agent memory than typical needle-in-a-haystack benchmarks. Using only cognee's default settings and standard open-source features (no custom models, no BEAM-specific pipelines), we beat the previous state of the art at the 100K-token setting and matched it at 10M tokens.
| Benchmark | Setting | cognee | Previous SOTA | Obsidian / RAG baseline |
|---|---|---|---|---|
| BEAM | 100K tokens | 0.79 (>0.8 with per-question routing) | 0.735 | ~0.33 |
| BEAM | 10M tokens | 0.67 | 0.641 | ~0.33 |
These numbers are a directional signal rather than a definitive measure — see the write-up for the full methodology, caveats, and what the results actually mean.
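For a sense of scale, the gaps in the table can be worked out directly. The snippet uses only the headline scores above (0.79 and 0.67 for cognee, 0.735 and 0.641 for the previous best) and reports absolute and relative differences. It says nothing about variance or statistical significance.

```python
# Scores copied from the table above: (cognee, previous SOTA).
scores = {
    "100K tokens": (0.79, 0.735),
    "10M tokens": (0.67, 0.641),
}


def improvement(ours, previous):
    """Return (absolute gap, relative gap in percent)."""
    gap = ours - previous
    return gap, 100.0 * gap / previous


for setting, (ours, previous) in scores.items():
    gap, percent = improvement(ours, previous)
    print(f"{setting}: +{gap:.3f} absolute, +{percent:.1f}% relative")
```

That works out to roughly a 7.5% relative gain at 100K tokens and about 4.5% at 10M tokens.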
Latest News
Community & Support
Contributing
We welcome contributions from the community! Your input helps make Cognee better for everyone. See CONTRIBUTING.md to get started.
Code of Conduct
We're committed to fostering an inclusive and respectful community. Read our Code of Conduct for guidelines.
Research & Citation
We recently published a research paper on optimizing knowledge graphs for LLM reasoning:
@misc{markovic2025optimizinginterfaceknowledgegraphs,
  title={Optimizing the Interface Between Knowledge Graphs and LLMs for Complex Reasoning},
  author={Vasilije Markovic and Lazar Obradovic and Laszlo Hajdu and Jovan Pavlovic},
  year={2025},
  eprint={2505.24478},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2505.24478},
}
