KNighter: Transforming Static Analysis with LLM-Synthesized Checkers
August 29, 2025
About
KNighter is an innovative checker synthesis tool that leverages Large Language Models (LLMs) to automatically generate static analysis checkers from historical patch commits.
Key Features
- LLM-Powered Generation: Automatically synthesizes static analysis checkers using state-of-the-art language models
- Multi-step Pipeline: Employs a sophisticated generation → refinement → triage workflow for high-quality results
- Historical Learning: Learns from real-world patch commits to understand common bug patterns
- LLVM Integration: Built on top of LLVM for robust static analysis capabilities
- Linux Kernel Focus: Specialized for finding bugs in large-scale C/C++ codebases like the Linux kernel
The detected bugs can be found here.
Important
We are continuously improving the documentation and adding new features. Please stay tuned for updates.
Getting Started
Docker Setup (Recommended)
Docker Installation Options
Option 1: Docker Hub (Recommended)
docker pull knighterhub/knighter
Option 2: Build from Source
git clone https://github.com/ise-uiuc/KNighter.git KNighter
cd KNighter
docker build -t knighter .
Running the Container
# Pull from Docker Hub
docker run -it knighterhub/knighter
# Build from source
docker run -it knighter
Environment Initialization
When running the container for the first time, initialize the environment:
cd /app
# This takes a while: it downloads the dependencies and compiles LLVM
python3 scripts/init_docker.py
This downloads LLVM and Linux kernel source code into /data/llvm and /data/linux.
API Key Configuration:
echo 'openai_key: "YOUR_OPENAI_API_KEY"' > /app/llm_keys.yaml
Manual Environment Setup (Alternative)
Note: For detailed setup steps, refer to scripts/init_docker.py, which contains the complete initialization process.
Manual Installation Steps
Step 1: Install Dependencies
Download and build LLVM-18.1.8:
wget https://github.com/llvm/llvm-project/archive/refs/tags/llvmorg-18.1.8.zip
unzip llvmorg-18.1.8.zip
Clone the Linux kernel source code:
git clone https://github.com/torvalds/linux.git
Install Python dependencies:
# Option 1: Using uv (recommended for faster installs)
curl -LsSf https://astral.sh/uv/install.sh | sh
source $HOME/.cargo/env
uv pip install -r requirements.txt
# Option 2: Using regular pip
pip3 install -r requirements.txt
git submodule update --init --recursive
Step 2: Configuration Files
Set up your config.yaml (see scripts/init_docker.py for reference):
result_dir: "result-checkers"
LLVM_dir: "/PATH/TO/LLVM_DIR"
checker_nums: 10
linux_dir: "/PATH/TO/LINUX_DIR"
key_file: "llm_keys.yaml"
model: "o3-mini"
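Before launching a long run, it can help to confirm that the placeholder paths in config.yaml have been replaced. The following is a minimal sketch, not part of KNighter: the helper names are hypothetical, the parser handles only flat `key: value` lines (no nesting, no inline comments), and treating all six keys shown above as required is an assumption.

```python
# Hypothetical pre-flight check for config.yaml; not part of KNighter.
# Assumption: every key shown in the example config is required.
REQUIRED_KEYS = ("result_dir", "LLVM_dir", "checker_nums",
                 "linux_dir", "key_file", "model")


def parse_flat_yaml(text: str) -> dict:
    """Parse flat 'key: value' lines only; this is not a general YAML parser."""
    config = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and full-line comments
        key, sep, value = line.partition(":")
        if not sep:
            continue  # ignore lines without a key/value separator
        # Drop surrounding whitespace and single or double quotes.
        config[key.strip()] = value.strip().strip("\"'")
    return config


def missing_keys(config: dict, required=REQUIRED_KEYS) -> list:
    """Return required keys that are absent or empty."""
    return [key for key in required if not config.get(key)]


def unresolved_placeholders(config: dict) -> list:
    """Return keys whose value still looks like the '/PATH/TO/...' template."""
    return [key for key, value in config.items()
            if value.startswith("/PATH/TO/")]
```

Calling `unresolved_placeholders(parse_flat_yaml(open("config.yaml").read()))` on an unedited template would report `LLVM_dir` and `linux_dir`.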
Set up the llm_keys.yaml file (see llm_keys_example.yaml for reference):
openai_key: "sk-..."
claude_key: "sk-ant-..."
google_key: "AIza..."
deepseek_key: "sk-..."
# For local models (optional)
# In config, use "local:model_name" format to use local models
# Like "local:openai/gpt-oss-120b"
base_url: "http://localhost:8000/v1"
api_key: "dummy"
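The `local:model_name` convention mentioned in the comments above can be illustrated with a small sketch. This is only one plausible way to interpret such a prefix, written for illustration: `split_model_spec` is a hypothetical helper, and KNighter's actual handling may differ.

```python
# Hypothetical illustration of the "local:model_name" convention.
# This is not KNighter's code; it only shows how such a prefix could be read.
LOCAL_PREFIX = "local:"


def split_model_spec(spec: str) -> tuple:
    """Return (is_local, model_name) for a model string from the config."""
    if spec.startswith(LOCAL_PREFIX):
        name = spec[len(LOCAL_PREFIX):]
        if not name:
            raise ValueError("expected a model name after 'local:'")
        return True, name
    return False, spec
```

Under this reading, `"local:openai/gpt-oss-120b"` would be sent to the server at `base_url`, while a plain name such as `"o3-mini"` would go to a hosted provider.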
Step 3: LLVM Setup
python3 scripts/setup_llvm.py LLVM_PATH
Running KNighter
Quick Start (Docker)
For rapid evaluation, use the debug dataset:
cd /app/src
# Step 1: Generate checkers for debug commits
python3 main.py gen --config_file /app/config-generate.yaml --commit_file=/app/commits/commits-debug.txt
# Step 2: Refine generated checkers
python3 main.py refine --config_file /app/config-refine-debug.yaml /app/result-generate
# Step 3: Triage and analyze results
python3 main.py triage --config_file /app/config-triage-debug.yaml /app/result-refine-debug
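The three steps above could also be driven from a small script that stops at the first failing step. This is a sketch under assumptions: the argument lists are copied from the commands above, while `run_steps` and its `runner` parameter are hypothetical names introduced here, not part of KNighter.

```python
import subprocess
from typing import Callable, List, Sequence

# Argument lists copied from the Quick Start commands above.
QUICK_START_STEPS: List[List[str]] = [
    ["python3", "main.py", "gen",
     "--config_file", "/app/config-generate.yaml",
     "--commit_file=/app/commits/commits-debug.txt"],
    ["python3", "main.py", "refine",
     "--config_file", "/app/config-refine-debug.yaml",
     "/app/result-generate"],
    ["python3", "main.py", "triage",
     "--config_file", "/app/config-triage-debug.yaml",
     "/app/result-refine-debug"],
]


def run_steps(steps: Sequence[Sequence[str]],
              runner: Callable = subprocess.run,
              cwd: str = "/app/src") -> int:
    """Run each step in order and return how many completed successfully.

    Stops at the first step whose exit code is non-zero, so a failed
    generation run is never followed by refinement on stale results.
    """
    completed = 0
    for argv in steps:
        result = runner(list(argv), cwd=cwd)
        if result.returncode != 0:
            break
        completed += 1
    return completed

# Usage inside the container (not executed here):
#   run_steps(QUICK_START_STEPS)
```

The `runner` parameter exists only so the sequencing logic can be exercised without launching the real pipeline.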
Pipeline Modes & Usage
Available Operation Modes:
| Mode | Purpose | Description |
|---|---|---|
| gen | Generation | Generate new checkers from commit patches |
| refine | Refinement | Improve and validate generated checkers |
| scan | Scanning | Scan the kernel with validated checkers |
| triage | Analysis | Analyze and categorize scan results |
Basic Usage (Manual Setup):
cd src
python3 main.py <mode> --commit_file=<commits.txt> --config_file=<config.yaml>
Example:
python3 main.py gen --commit_file=../commits/commits-selected.txt --config_file=config.yaml
Configuration Files
| File | Purpose | Key Parameters |
|---|---|---|
| config-generate.yaml | Checker generation | model, checker_nums, result_dir |
| config-refine.yaml | Refinement process | jobs, scan_timeout, scan_commit |
| config-triage.yaml | Result analysis | Analysis parameters |
Modify these files to experiment with different parameters from the paper evaluation.
Architecture Documentation
System Architecture Overview
KNighter implements a multi-stage pipeline for automated checker synthesis:
- Commit Analysis: Extract bug patterns from historical patches
- Checker Generation: Use LLMs to synthesize static analysis checkers
- Refinement: Validate and improve generated checkers through compilation and testing
- Deployment: Apply refined checkers to target codebases
- Triage: Analyze and categorize detected issues
For comprehensive architecture documentation, see ARCHITECTURE.md.
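The refinement stage, described above as validating and improving a checker, can be pictured as a bounded retry loop. The sketch below is a generic illustration and not KNighter's implementation: `refine`, `validate`, `improve`, and `max_rounds` are hypothetical names, and the checker is represented as a plain string.

```python
from typing import Callable, Optional


def refine(candidate: str,
           validate: Callable[[str], bool],
           improve: Callable[[str], str],
           max_rounds: int = 3) -> Optional[str]:
    """Return the first candidate that passes validation, or None.

    validate: stands in for compiling and testing a checker (assumption).
    improve:  stands in for asking the LLM for a revised checker (assumption).
    """
    for _ in range(max_rounds):
        if validate(candidate):
            return candidate
        candidate = improve(candidate)
    # The last revision has not been checked inside the loop yet.
    return candidate if validate(candidate) else None
```

Bounding the number of rounds keeps the cost of a checker that never validates predictable; such a checker is dropped instead of being passed on to scanning.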
Citation
If you use KNighter in your research, please cite our paper:
@inproceedings{knighter,
title = {KNighter: Transforming Static Analysis with LLM-Synthesized Checkers},
author = {Yang, Chenyuan and Zhao, Zijie and Xie, Zichen and Li, Haoyu and Zhang, Lingming},
year = {2025},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3731569.3764827},
doi = {10.1145/3731569.3764827},
booktitle = {Proceedings of the ACM SIGOPS 31st Symposium on Operating Systems Principles},
location = {Seoul, Republic of Korea},
series = {SOSP '25}
}