GraphRAG Local Ollama - Knowledge Graph
May 8, 2026
Contributing
We welcome contributions from the community to help enhance GraphRAG Local Ollama! Please see our Contributing Guidelines for details on how to get involved.
Welcome to GraphRAG Local Ollama! This repository is an exciting adaptation of Microsoft's GraphRAG, tailored to support local models downloaded using Ollama. Say goodbye to costly OpenAI API models and hello to efficient, cost-effective local inference using Ollama!
Research Paper
For more details on the GraphRAG implementation, please refer to the GraphRAG paper.
Paper Abstract
The use of retrieval-augmented generation (RAG) to retrieve relevant information from an external knowledge source enables large language models (LLMs) to answer questions over private and/or previously unseen document collections. However, RAG fails on global questions directed at an entire text corpus, such as "What are the main themes in the dataset?", since this is inherently a query-focused summarization (QFS) task, rather than an explicit retrieval task. Prior QFS methods, meanwhile, fail to scale to the quantities of text indexed by typical RAG systems. To combine the strengths of these contrasting methods, we propose a Graph RAG approach to question answering over private text corpora that scales with both the generality of user questions and the quantity of source text to be indexed. Our approach uses an LLM to build a graph-based text index in two stages: first to derive an entity knowledge graph from the source documents, then to pregenerate community summaries for all groups of closely-related entities. Given a question, each community summary is used to generate a partial response, before all partial responses are again summarized in a final response to the user. For a class of global sensemaking questions over datasets in the 1 million token range, we show that Graph RAG leads to substantial improvements over a naïve RAG baseline for both the comprehensiveness and diversity of generated answers.
Features
- Local Model Support: Leverage local models with Ollama for LLM and embeddings.
- Cost-Effective: Eliminate dependency on costly OpenAI API models.
- Easy Setup: Simple and straightforward setup process.
- Web UI: Browser-based interface for indexing, querying, and graph visualization.
- 5 Query Modes: Global, Local, DRIFT (iterative reasoning), Basic (fast vector search), Lazy.
- LazyGraphRAG Mode: Skip community summarization at index time for ~99% faster indexing; summaries are generated on demand at query time.
- DRIFT Search: Dynamic Reasoning with Iterative Feedback and Tracking, an iterative knowledge-graph exploration that generates and follows up on sub-questions for deeper answers.
- Basic Search: Lightweight vector-similarity RAG; no community reports needed, fast answers from entity embeddings.
- JSON/JSONL Input: Index JSON and JSONL document collections in addition to `.txt` and `.csv`.
- Multilingual: Full UTF-8 support with CJK-aware text chunking (Chinese, Japanese, Korean, Arabic, Cyrillic).
- Docker: One-command setup with `docker-compose up`.
Quick Start (Docker)
The fastest way to get started: no Python environment needed.
```bash
# 1. Copy the example environment file
cp .env.example .env

# 2. Start Ollama + GraphRAG
docker-compose up --build

# 3. Drop your .txt documents into ./input/
#    Indexing runs automatically on container start.

# 4. Run a query
docker-compose run graphrag python -m graphrag.query \
    --root /app --method global "What are the main themes?"
```
GPU support: Add `deploy.resources.reservations.devices` to the `ollama` service in `docker-compose.yml` for GPU acceleration.
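For reference, a minimal sketch of an `ollama` service with an NVIDIA GPU reservation, following the standard Compose `deploy` specification (adapt `driver` and `count` to your hardware; other service settings omitted):

```yaml
services:
  ollama:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```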
Web UI
A browser-based interface for the full workflow (indexing → querying → visualization).
```bash
pip install -r requirements-ui.txt
python app.py
# Open http://localhost:7860
```
Tabs:
| Tab | What it does |
|---|---|
| Index | Upload .txt files, run indexing, see live log output |
| Query | Ask questions using Global / Local / DRIFT / Basic / Lazy search |
| Graph | Interactive knowledge-graph visualizer (requires pyvis) |
| Settings | Edit model names, chunk size, LazyGraphRAG toggle |
LazyGraphRAG Mode
Inspired by Microsoft's LazyGraphRAG, this mode skips community summarization at index time and generates summaries on-the-fly during queries. Result: indexing is ~99% faster.
Enable in `settings.yaml`:

```yaml
lazy_graph_rag: true
```
Query with the lazy method:

```bash
python -m graphrag.query --root ./ragtest --method lazy "What is machine learning?"
```
Trade-off: First-query responses are slightly slower than standard global search because summaries are computed at query time. For large datasets this is still dramatically cheaper overall.
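The mechanism can be sketched as a cache that defers summarization until a community is first needed at query time (a simplified illustration, not the repo's actual implementation; `summarize_fn` is a hypothetical stand-in for the LLM call):

```python
class LazySummaries:
    """Compute community summaries on demand and cache them for reuse."""

    def __init__(self, summarize_fn):
        self._summarize = summarize_fn  # hypothetical LLM-backed summarizer
        self._cache = {}

    def get(self, community_id, entities):
        # First request pays the summarization cost; repeats hit the cache,
        # which is why only the first query on a community is slower.
        if community_id not in self._cache:
            self._cache[community_id] = self._summarize(entities)
        return self._cache[community_id]
```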
DRIFT Search
Inspired by Microsoft GraphRAG v0.4+'s DRIFT (Dynamic Reasoning with Iterative Feedback and Tracking), this mode performs iterative knowledge-graph exploration:
- Primer phase: decomposes your question into scored sub-questions using community reports as context.
- Search loop: answers each sub-question using the entity graph, then generates follow-up questions from the answers (priority-queue, depth-limited).
- Reduce phase: synthesises all intermediate answers into a final comprehensive response.
DRIFT produces deeper, more exploratory answers than Global or Local search, and is especially useful for open-ended research questions.
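The three phases above can be sketched as follows (a simplified illustration, not the repo's actual implementation; `primer`, `answer_fn`, `followups_fn`, and `reduce_fn` are hypothetical stand-ins for the LLM-backed steps):

```python
import heapq

def drift_search(question, primer, answer_fn, followups_fn, reduce_fn,
                 max_depth=2, max_questions=10):
    """Iterative DRIFT-style search: primer -> priority-queue loop -> reduce."""
    # Primer phase: decompose the question into scored sub-questions.
    # Negate scores because heapq is a min-heap and we want highest first.
    queue = [(-score, sub_q, 0) for score, sub_q in primer(question)]
    heapq.heapify(queue)

    answers = []
    # Search loop: answer the highest-priority sub-question, then push
    # its follow-ups back onto the queue, limited by depth and budget.
    while queue and len(answers) < max_questions:
        _, sub_q, depth = heapq.heappop(queue)
        answer = answer_fn(sub_q)  # answered from the entity graph
        answers.append(answer)
        if depth < max_depth:
            for score, follow_up in followups_fn(sub_q, answer):
                heapq.heappush(queue, (-score, follow_up, depth + 1))

    # Reduce phase: synthesise intermediate answers into one response.
    return reduce_fn(question, answers)
```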
Query with DRIFT:

```bash
python -m graphrag.query --root ./ragtest --method drift "What are the main causes of the conflict?"
```
Note: DRIFT works best when community reports are available. If `lazy_graph_rag: true` was used during indexing, the primer step will have limited context, but the search loop still operates normally.
Basic Search
A lightweight vector-similarity RAG mode that requires no community reports. Works immediately after a minimal index run.
- Embeds the query using your Ollama embedding model.
- Retrieves the top-10 most similar entities from the vector store.
- Builds context from those entities, their relationships, and source text units.
- Answers with the LLM.
Use Basic search for fast, factual lookups when you don't need the full graph-traversal reasoning.
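The retrieval step can be sketched with plain cosine similarity over entity embeddings (a simplified illustration; the real pipeline queries the vector store rather than in-memory lists):

```python
import math

def top_k_entities(query_vec, entity_vecs, k=10):
    """Return indices of the k entities most cosine-similar to the query."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(x * x for x in b))
        return dot / (norm_a * norm_b)

    # Score every entity embedding, then keep the k best indices.
    sims = [(cosine(query_vec, e), i) for i, e in enumerate(entity_vecs)]
    sims.sort(key=lambda t: -t[0])
    return [i for _, i in sims[:k]]
```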
Query with Basic search:

```bash
python -m graphrag.query --root ./ragtest --method basic "Who is Alan Turing?"
```
JSON / JSONL Input
In addition to .txt and .csv, you can now index JSON and JSONL document collections.
JSON array format (`documents.json`):

```json
[
  {"id": "doc1", "title": "Introduction", "text": "This is the document text..."},
  {"id": "doc2", "text": "Another document..."}
]
```
JSONL format (`documents.jsonl`, one JSON object per line):

```jsonl
{"id": "doc1", "title": "Chapter 1", "text": "Text of chapter 1..."}
{"id": "doc2", "title": "Chapter 2", "text": "Text of chapter 2..."}
```
Enable in `settings.yaml`:

```yaml
input:
  type: file
  file_type: json  # was: text or csv
  base_dir: "input"
  file_pattern: ".*\\.(json|jsonl)$"
```
Required field: `text`. Optional fields: `id`, `title`, `source`.
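For reference, a minimal sketch of how such a JSONL collection can be loaded and validated against the field rules above (illustrative only; the pipeline handles this internally):

```python
import json

REQUIRED = {"text"}
OPTIONAL = {"id", "title", "source"}

def load_jsonl(path):
    """Load a JSONL document collection, keeping only recognized fields."""
    docs = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            rec = json.loads(line)
            if not REQUIRED <= rec.keys():
                raise ValueError(f"missing required field(s): {REQUIRED - rec.keys()}")
            # Drop any unrecognized fields.
            docs.append({k: rec[k] for k in REQUIRED | OPTIONAL if k in rec})
    return docs
```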
Non-English Text
UTF-8 is enforced throughout the pipeline. CJK (Chinese, Japanese, Korean), Arabic, and Cyrillic documents are supported:
- Text files are read with `encoding="utf-8"` (with graceful replacement for undecodable bytes).
- CSVs default to `utf-8` (was `latin-1`; fixed in this release).
- The text splitter automatically switches to character-count chunking for CJK-dominant text, avoiding BPE tokenization artifacts.
No configuration change needed β just drop your non-English .txt files in input/ and run normally.
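The CJK switch can be illustrated with a simple dominance heuristic of the kind described above (a sketch; the threshold and character ranges are illustrative assumptions, not the repo's exact values):

```python
def is_cjk_dominant(text, threshold=0.3):
    """Heuristic: treat text as CJK-dominant if enough chars fall in CJK ranges."""
    cjk = sum(
        1
        for ch in text
        if "\u4e00" <= ch <= "\u9fff"    # CJK Unified Ideographs
        or "\u3040" <= ch <= "\u30ff"    # Japanese hiragana/katakana
        or "\uac00" <= ch <= "\ud7af"    # Korean Hangul syllables
    )
    # Short-circuit on empty text to avoid division by zero.
    return bool(text) and cjk / len(text) >= threshold
```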
Installation and Setup
Follow these steps to set up this repository and use GraphRAG with local models provided by Ollama:

1. Create and activate a new conda environment (please stick to Python 3.10 to avoid errors):

   ```bash
   conda create -n graphrag-ollama-local python=3.10
   conda activate graphrag-ollama-local
   ```

2. Install Ollama. Visit Ollama's website for installation instructions, or run:

   ```bash
   curl -fsSL https://ollama.com/install.sh | sh  # Ollama for Linux
   pip install ollama
   ```

3. Download the required models using Ollama. You can choose from mistral, gemma2, or qwen2 for the LLM, and any embedding model provided under Ollama:

   ```bash
   ollama pull mistral            # LLM
   ollama pull nomic-embed-text   # embedding
   ```

4. Clone the repository:

   ```bash
   git clone https://github.com/TheAiSingularity/graphrag-local-ollama.git
   ```

5. Navigate to the repository directory:

   ```bash
   cd graphrag-local-ollama/
   ```

6. Install the graphrag package (this is the most important step):

   ```bash
   pip install -e .
   ```

7. Create the required input directory. This is where the experiment data and results will be stored (./ragtest):

   ```bash
   mkdir -p ./ragtest/input
   ```

8. Copy the sample data folder input/ to ./ragtest. input/ has the sample data to run the setup; you can add your own data here in .txt format:

   ```bash
   cp input/* ./ragtest/input
   ```

9. Initialize the ./ragtest folder to create the required files:

   ```bash
   python -m graphrag.index --init --root ./ragtest
   ```

10. Move the settings.yaml file; this is the main predefined config file, configured with Ollama local models:

    ```bash
    cp settings.yaml ./ragtest
    ```

    Users can experiment by changing the models. The llm section expects language models like llama3, mistral, phi3, etc., and the embedding section expects embedding models like mxbai-embed-large, nomic-embed-text, etc., which are provided by Ollama. You can find the complete list of models provided by Ollama at https://ollama.com/library, all of which can be deployed locally. Both LLM and embeddings use the OpenAI-compatible endpoint http://localhost:11434/v1. No API key is required; the config defaults to `ollama` as a placeholder.
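As a rough sketch of what the relevant sections of settings.yaml look like (field names follow GraphRAG's standard config layout; the exact file in this repo may differ slightly):

```yaml
llm:
  api_key: ollama                      # placeholder; no real key needed
  type: openai_chat
  model: mistral
  api_base: http://localhost:11434/v1

embeddings:
  llm:
    api_key: ollama
    type: openai_embedding
    model: nomic-embed-text
    api_base: http://localhost:11434/v1
```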
11. Run the indexing, which creates a graph:

    ```bash
    python -m graphrag.index --root ./ragtest
    ```

12. Run a query (five methods available):

    ```bash
    # Global: broad synthesis across the whole corpus (uses community reports)
    python -m graphrag.query --root ./ragtest --method global "What is machine learning?"

    # Local: entity-focused, uses knowledge graph + text chunks
    python -m graphrag.query --root ./ragtest --method local "What is machine learning?"

    # DRIFT: iterative graph reasoning (decomposes, explores, synthesises)
    python -m graphrag.query --root ./ragtest --method drift "What are the main themes?"

    # Basic: fast vector-similarity search (no community reports needed)
    python -m graphrag.query --root ./ragtest --method basic "Who invented the transformer?"

    # Lazy: on-demand community summarisation (use with lazy_graph_rag: true)
    python -m graphrag.query --root ./ragtest --method lazy "What is machine learning?"
    ```
Graphs can be saved for later visualization by setting graphml to true in settings.yaml:

```yaml
snapshots:
  graphml: true
```
To visualize the generated .graphml files, you can use Gephi (https://gephi.org/users/download/) or the visualize-graphml.py script provided in the repo. Pass the path to the .graphml file to the line below in visualize-graphml.py:

```python
graph = nx.read_graphml('output/20240708-161630/artifacts/summarized_graph.graphml')
```
13. Visualize the .graphml file:
```bash
python visualize-graphml.py
```
By following the above steps, you can set up and use local models with GraphRAG, making the process more cost-effective and efficient.