May 10, 2026

GraphRAG-SDK

The simplest, most accurate GraphRAG framework built on FalkorDB

Benchmark-leading accuracy · FalkorDB-fast · Multi-tenant · Graph traversal · 5-minute setup

Most GraphRAG systems work in demos and break under production constraints. GraphRAG SDK was built from real deployments around a simple idea: the retrieval harness matters more than the model. The result is a modular, benchmark-leading framework with predictable cost and sensible defaults that gets you from raw documents to cited answers in under 5 minutes.

Benchmarks

Rank	System	Novel (Multi-Doc)	Medical (Single-Doc)	Overall
1	FalkorDB GraphRAG SDK ◄	63.73	75.73	69.73
2	G-Reasoner	58.94	73.30	66.12
3	AutoPrunedRetriever	63.72	67.00	65.36
4	HippoRAG2	56.48	64.85	60.67
5	Fast-GraphRAG	52.02	64.12	58.07
6	RAG (w rerank) (Vector RAG)	48.35	62.43	55.39
7	LightRAG	45.09	62.59	53.84
8	HippoRAG	44.75	59.08	51.92
9	MS-GraphRAG (local)	50.93	45.16	48.05

Overall ACC on GraphRAG-Bench Novel (20 novels, 2,010 questions) and Medical (1 corpus, 2,062 questions) datasets. FalkorDB scored with gpt-4o-mini (Azure OpenAI); competitor numbers are from the published leaderboard. Overall = mean of Novel and Medical ACC. See docs/benchmark.md for per-category breakdowns, methodology, and reproduction instructions.
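
The Overall column can be re-derived from the two per-dataset scores. The sketch below does this for three rows of the table; `overall` is a helper name introduced here for illustration, and rounding half up to two decimals is an assumption about how the leaderboard rounds.

```python
from decimal import Decimal, ROUND_HALF_UP

def overall(novel: str, medical: str) -> Decimal:
    """Mean of the Novel and Medical ACC scores, rounded to two decimals."""
    mean = (Decimal(novel) + Decimal(medical)) / 2
    # Assumption: half-up rounding to two decimals, as the table appears to use.
    return mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

# (Novel, Medical) scores copied from the benchmark table.
rows = {
    "FalkorDB GraphRAG SDK": ("63.73", "75.73"),
    "G-Reasoner": ("58.94", "73.30"),
    "MS-GraphRAG (local)": ("50.93", "45.16"),
}

for system, (novel, medical) in rows.items():
    print(f"{system}: {overall(novel, medical)}")
```

Decimal arithmetic is used so that values such as 48.045 round the same way on every platform.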

Vectors match similar chunks. The graph traverses relationships. Every answer cites its source.

Quick Start

1. Install and start FalkorDB

pip install "graphrag-sdk[litellm]"  # quotes keep zsh from globbing the brackets
docker run -d -p 6379:6379 -p 3000:3000 --name falkordb falkordb/falkordb:latest
export OPENAI_API_KEY="sk-..."

For PDF ingestion, install the pdf extra instead: pip install "graphrag-sdk[litellm,pdf]". Ingestion sanitizes unsupported control characters in IDs and string properties before graph upserts, which helps avoid FalkorDB Cypher parse errors on noisy PDFs.
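
To make the sanitization step concrete, here is a rough sketch of stripping control characters from a string before it reaches a query. This is not the SDK's code: the function name `strip_control_chars` and the exact set of characters removed are assumptions made for illustration.

```python
import re

# Assumption: remove C0 control characters (except tab, newline, and
# carriage return) plus DEL. The SDK's actual character set may differ.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")

def strip_control_chars(value: str) -> str:
    """Return `value` with unsupported control characters removed."""
    return _CONTROL_CHARS.sub("", value)

# Text extracted from PDFs often carries stray NUL or form-feed bytes.
noisy = "Acme\x00 Corp\x0c annual\x1f report"
print(strip_control_chars(noisy))
```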

2. Ingest a document

import asyncio
from graphrag_sdk import GraphRAG, ConnectionConfig, LiteLLM, LiteLLMEmbedder

async def main():
    async with GraphRAG(
        connection=ConnectionConfig(host="localhost", graph_name="my_graph"),  # graph_name = per-tenant isolation
        llm=LiteLLM(model="openai/gpt-5.5"),
        embedder=LiteLLMEmbedder(model="openai/text-embedding-3-large", dimensions=256),
    ) as rag:
        # Ingest raw text (to ingest a PDF, install the `pdf` extra and pass a file path)
        result = await rag.ingest(
            text="Alice Johnson is a software engineer at Acme Corp in London.",
            document_id="my_doc",
        )
        print(f"Nodes: {result.nodes_created}, Edges: {result.relationships_created}")

        # Finalize: deduplicate entities, backfill embeddings, create indexes
        await rag.finalize()

        # Full RAG: retrieve + generate
        answer = await rag.completion("Where does Alice work?")
        print(answer.answer)

asyncio.run(main())

3. Define a schema (optional)

import asyncio

from graphrag_sdk import (
    ConnectionConfig,
    EntityType,
    GraphRAG,
    GraphSchema,
    LiteLLM,
    LiteLLMEmbedder,
    RelationType,
)

# Entity and relation types the extractor is allowed to produce
schema = GraphSchema(
    entities=[
        EntityType(label="Person", description="A human being"),
        EntityType(label="Organization", description="A company or institution"),
        EntityType(label="Location", description="A geographic location"),
    ],
    relations=[
        RelationType(label="WORKS_AT", description="Is employed by", patterns=[("Person", "Organization")]),
        RelationType(label="LOCATED_IN", description="Is situated in", patterns=[("Organization", "Location")]),
    ],
)

async def main():
    async with GraphRAG(
        connection=ConnectionConfig(host="localhost", graph_name="my_graph"),
        llm=LiteLLM(model="openai/gpt-5.5"),
        embedder=LiteLLMEmbedder(model="openai/text-embedding-3-large", dimensions=256),
        schema=schema,
    ) as rag:
        ...  # ingest / completion as above

asyncio.run(main())

→ Full walkthrough: Getting Started
→ Benchmark-winning recipe: Custom Strategies

Incremental Updates (v1.1.0)

Re-sync individual documents without rebuilding the graph. The canonical CI use case is updating the graph on PR merge — added, modified, and deleted files in one batch:

async with GraphRAG(connection=ConnectionConfig(...), llm=..., embedder=...) as graph:
    result = await graph.apply_changes(
        added=["docs/new_feature.md"],
        modified=["docs/api.md"],
        deleted=["docs/removed_page.md"],
    )
    await graph.finalize()  # once per batch — finalize is O(graph size)

    # Per-file outcomes are wrapped in BatchEntry — the batch never raises.
    for entry in result.added + result.modified + result.deleted:
        if not entry.is_success:
            print(f"failed: {entry.error_type}: {entry.error}")

The three primitives behind the wrapper:

Method	When to use
`update(source, document_id=...)`	Document content changed. SHA-256 hash short-circuits no-op updates (touch-only PRs cost ~1 Cypher query). Pass `if_missing="ingest"` for upsert semantics.
`delete_document(document_id)`	Document removed. Cleans up entities orphaned by the deletion; preserves entities still referenced by other documents.
`apply_changes(added=..., modified=..., deleted=...)`	Heterogeneous batch. Per-file errors are collected, not raised. Does not call `finalize()` — caller drives that cadence.
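
The hash short-circuit described for `update` can be pictured as a comparison between a stored digest and the digest of the incoming content. The sketch below shows only that idea. The helper names `content_hash` and `needs_update` are hypothetical, and how the SDK stores and compares the hash is not shown in this README.

```python
import hashlib

def content_hash(text: str) -> str:
    """SHA-256 hex digest of the document text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def needs_update(stored_hash: str, new_text: str) -> bool:
    """True only when the new content differs from what was ingested."""
    return stored_hash != content_hash(new_text)

# Hash recorded at ingest time (hypothetical document content).
stored = content_hash("# API\nGET /items")

print(needs_update(stored, "# API\nGET /items"))       # unchanged content
print(needs_update(stored, "# API\nGET /items/{id}"))  # edited content
```

When the digests match, the expensive extraction and embedding work can be skipped, which is why a touch-only change stays cheap.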

In file mode, document_id defaults to os.path.normpath(source) so update("docs/x.md") matches the original ingest("docs/x.md") with no extra plumbing. See examples/07_incremental_updates.py.
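
Because the default ID is the normalized path, different spellings of the same relative path collapse to one `document_id`. The snippet below uses only the standard library to show the normalization; the example paths are hypothetical.

```python
import os

# Four spellings of the same relative path.
spellings = ["docs/x.md", "./docs/x.md", "docs//x.md", "docs/sub/../x.md"]

# os.path.normpath collapses redundant separators and up-level references.
ids = {os.path.normpath(p) for p in spellings}
print(ids)

# Normalization does not make a path absolute, so these two differ.
relative = os.path.normpath("docs/x.md")
absolute = os.path.normpath(os.path.join("/repo", "docs/x.md"))
print(relative == absolute)
```

All four spellings map to a single ID, while the absolute and relative forms do not match. If ingest and update run from different working directories or mix absolute and relative paths, passing `document_id` explicitly is the safer choice.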

Cost model. finalize() runs cross-document deduplication, which scans the full entity table, so its cost is O(graph size), not O(change size). Embedding backfill within finalize() is O(change size): only nodes and edges missing embeddings are touched. For CI use cases, batch all PR changes through apply_changes and call finalize() once at the end of the run, not per file. Calling finalize() per file repeats the full dedup scan once for every file touched.
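
A toy calculation makes the difference visible. The numbers and the cost function below are hypothetical and use abstract units: one unit per entity scanned by dedup and one per item that needs an embedding. They are not measurements of the SDK.

```python
def finalize_cost(entity_count: int, changed_items: int) -> int:
    """Toy cost: a full dedup scan plus backfill for the changed items."""
    return entity_count + changed_items

entity_count = 100_000   # hypothetical graph size
files_changed = 20       # hypothetical PR size
items_per_file = 50      # hypothetical new nodes/edges per file

# finalize() after every file: the full scan is paid once per file.
per_file = sum(
    finalize_cost(entity_count, items_per_file) for _ in range(files_changed)
)

# finalize() once per batch: the full scan is paid once.
batched = finalize_cost(entity_count, files_changed * items_per_file)

print(per_file, batched)
```

In this model the per-file cadence costs roughly `files_changed` times as much as the batched one, because the scan term dominates the backfill term.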

Crash safety. update() uses an idempotent roll-forward cutover: the new content is written to a __pending__ Document, a single atomic Cypher statement then marks it ready_to_commit=true, and finally the live document is replaced. A crash before the marker discards the pending document on retry; a crash after the marker rolls forward to completion. Either way, retrying the same update() call is safe and converges on the correct final state.
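
The roll-forward idea can be modeled with a plain dictionary standing in for the graph. This is a toy model of the protocol as described above, not the SDK's implementation: `Crash`, `update`, the `store` layout, and the `crash_at` switch are all hypothetical names used to simulate failures.

```python
from typing import Optional

class Crash(Exception):
    """Simulated process crash."""

def update(store: dict, doc_id: str, content: str,
           crash_at: Optional[str] = None) -> None:
    pending_key = f"__pending__:{doc_id}"

    # Recovery: inspect whatever a previous attempt left behind.
    pending = store.get(pending_key)
    if pending is not None:
        if pending["ready_to_commit"]:
            # Marker was set: roll forward to completion.
            store[doc_id] = pending["content"]
            del store[pending_key]
            return
        # Marker was never set: discard the stale pending document.
        del store[pending_key]

    # Step 1: write the new content next to the live document.
    store[pending_key] = {"content": content, "ready_to_commit": False}
    if crash_at == "before_marker":
        raise Crash()

    # Step 2: set the commit marker (atomic in the real protocol).
    store[pending_key]["ready_to_commit"] = True
    if crash_at == "after_marker":
        raise Crash()

    # Step 3: replace the live document and clean up.
    store[doc_id] = content
    del store[pending_key]

for crash_point in ("before_marker", "after_marker"):
    store = {"docs/api.md": "old"}
    try:
        update(store, "docs/api.md", "new", crash_at=crash_point)
    except Crash:
        pass
    update(store, "docs/api.md", "new")  # retry the same call
    print(crash_point, store)
```

In both crash scenarios the retry leaves only the live document with the new content and no pending entry, which is the convergence property the paragraph above describes.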

Concurrency. apply_changes exposes two knobs: max_concurrency (adds, default 3) and update_concurrency (modifies, default 1). Updates default to 1 because orphan-cleanup correctness under concurrent updates depends on a pipeline-ordering invariant; raising update_concurrency is safe only if you have verified that your concurrent updates can never share an entity. The integration test test_concurrent_updates_preserve_shared_entity is the tripwire that guards the default.
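
For intuition about what a concurrency limit of 3 versus 1 means in practice, the sketch below bounds a set of coroutines with `asyncio.Semaphore` and records the peak number running at once. It illustrates the general technique only; `run_bounded` and `demo` are hypothetical helpers, and the README does not say how apply_changes enforces its limits internally.

```python
import asyncio

async def run_bounded(items, worker, limit):
    """Run worker(item) for every item, at most `limit` at a time."""
    sem = asyncio.Semaphore(limit)

    async def guarded(item):
        async with sem:
            return await worker(item)

    return await asyncio.gather(*(guarded(item) for item in items))

async def demo(limit):
    active = 0
    peak = 0

    async def worker(path):
        nonlocal active, peak
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.01)  # stand-in for an ingest or update call
        active -= 1
        return path

    paths = [f"docs/{i}.md" for i in range(6)]
    results = await run_bounded(paths, worker, limit)
    return peak, results

print(asyncio.run(demo(limit=3))[0])  # peak concurrency with the adds default
print(asyncio.run(demo(limit=1))[0])  # peak concurrency with the updates default
```

With a limit of 1 the workers run strictly one after another, which is the property that keeps an ordering-dependent cleanup step safe.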

Ingestion & Retrieval Pipeline

Area	Step	Cost
Ingestion	Extract entities & relations	LLM
Ingestion	Resolve & deduplicate entities	LLM
Ingestion	Embed & index	LLM
Retrieval	Vector search	DB
Retrieval	Full-text search	DB
Retrieval	Text-to-Cypher (experimental)	LLM
Retrieval	Cypher queries	DB
Retrieval	Relationship expansion	DB
Retrieval	Cosine reranking	Local
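
The last retrieval step, cosine reranking, runs locally and needs no LLM or database call. A minimal pure-Python sketch of the technique follows. The `cosine` and `rerank` helpers, the two-dimensional vectors, and the chunk texts are made up for illustration and are not the SDK's API.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rerank(query_vec, candidates, top_k=2):
    """Order (text, vector) candidates by similarity to the query, keep top_k."""
    scored = sorted(candidates, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in scored[:top_k]]

# Toy embeddings; real ones would come from the configured embedder.
query = [1.0, 0.0]
candidates = [
    ("chunk about London", [0.0, 1.0]),
    ("chunk about Alice at Acme", [0.9, 0.1]),
    ("chunk about Acme offices", [0.6, 0.8]),
]

print(rerank(query, candidates))
```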

💡 Every answer is traceable to its source chunks via MENTIONS edges. Pass return_context=True to completion() to get the retrieval trail alongside the answer.

Examples

Working starters — clone, plug in your source, ship.

#	Example	What you'll build
1	Quick Start	Your first ingest-and-query loop in under 30 lines
2	PDF with Schema	A PDF Q&A bot with your own entity and relation types
3	Custom Strategies	The benchmark-winning pipeline, ready to drop in
4	Custom Provider	Plug in any LLM or embedder behind a clean interface
5	Notebook Demo	An interactive walkthrough that shows the provenance trail
7	Incremental Updates	`update`, `delete_document`, and `apply_changes` for CI-driven graph syncs

Documentation

Guide	Description
Getting Started	Step-by-step tutorial from install to first query
Architecture	Pipeline design, graph schema, retrieval strategy
Configuration	Connection, providers, and tuning reference
Strategies	All ABCs and built-in implementations
Providers	LLM and embedder configuration guide
Benchmark	Methodology, results, and reproduction instructions
API Reference	Full API documentation

Development Milestones

2024-06: First public release
2024-Q4: PDF ingestion and multi-provider LLMs
2025-Q1–Q2: Pluggable providers and pipeline tuning
2025-Q3: Sharper retrieval, deeper test coverage
🎉 2026-04: Version 1.0 is released with a new set of benchmarks based on a year's worth of research and customer PoCs
- 📦 Still on the v0.x API? Pin the legacy release: pip install graphrag-sdk==0.8.2
2026-Q2: Production observability; expand ingestion support — tables, structured data
2026-Q3: Introduce Agentic GraphRAG; complete PDF ingestion
2026-Q4: Smarter retrieval — dynamic traversal, temporal graph

Contributing

We welcome contributions! See CONTRIBUTING.md for development setup, testing, and code style guidelines.

Please read our Code of Conduct before participating.

Community

Discord -- Ask questions, share what you build
GitHub Discussions -- Feature ideas, Q&A
Issues -- Bug reports and feature requests

Citation

If you use GraphRAG SDK in your research, please cite:

@software{graphrag_sdk,
  title  = {GraphRAG SDK: A Modular Graph RAG Framework},
  author = {FalkorDB},
  year   = {2026},
  url    = {https://github.com/FalkorDB/GraphRAG-SDK},
}

License

Apache License 2.0