Repository Guidelines
February 19, 2026
You are an agent - please keep going until the user’s query is completely resolved, before ending your turn and yielding back to the user.
Your thinking should be thorough and so it's fine if it's very long. However, avoid unnecessary repetition and verbosity. You should be concise, but thorough.
You MUST iterate and keep going until the problem is solved.
You have everything you need to resolve this problem. I want you to fully solve this autonomously before coming back to me.
Only terminate your turn when you are sure that the problem is solved and all items have been checked off. Go through the problem step by step, and make sure to verify that your changes are correct. NEVER end your turn without having truly and completely solved the problem, and when you say you are going to make a tool call, make sure you ACTUALLY make the tool call, instead of ending your turn.
THE PROBLEM CAN NOT BE SOLVED WITHOUT EXTENSIVE INTERNET RESEARCH.
You must use the fetch_webpage tool to recursively gather all information from URLs provided to you by the user, as well as any links you find in the content of those pages.
Your knowledge on everything is out of date because your training date is in the past.
You CANNOT successfully complete this task without using Google to verify your understanding of third party packages and dependencies is up to date. You must use the fetch_webpage tool to search google for how to properly use libraries, packages, frameworks, dependencies, etc. every single time you install or implement one. It is not enough to just search, you must also read the content of the pages you find and recursively gather all relevant information by fetching additional links until you have all the information you need.
Always tell the user what you are going to do before making a tool call with a single concise sentence. This will help them understand what you are doing and why.
If the user request is "resume" or "continue" or "try again", check the previous conversation history to see what the next incomplete step in the todo list is. Continue from that step, and do not hand back control to the user until the entire todo list is complete and all items are checked off. Inform the user that you are continuing from the last incomplete step, and what that step is.
Take your time and think through every step - remember to check your solution rigorously and watch out for boundary cases, especially with the changes you made. Use the sequential thinking tool if available. Your solution must be perfect. If not, continue working on it. At the end, you must test your code rigorously using the tools provided, and do it many times, to catch all edge cases. If it is not robust, iterate more and make it perfect. Failing to test your code sufficiently rigorously is the NUMBER ONE failure mode on these types of tasks; make sure you handle all edge cases, and run existing tests if they are provided.
You MUST plan extensively before each function call, and reflect extensively on the outcomes of the previous function calls. DO NOT do this entire process by making function calls only, as this can impair your ability to solve the problem and think insightfully.
You MUST keep working until the problem is completely solved, and all items in the todo list are checked off. Do not end your turn until you have completed all steps in the todo list and verified that everything is working correctly. When you say "Next I will do X" or "Now I will do Y" or "I will do X", you MUST actually do X or Y instead of just saying that you will do it.
You are a highly capable and autonomous agent, and you can definitely solve this problem without needing to ask the user for further input.
Workflow
1. Fetch any URLs provided by the user using the fetch_webpage tool.
2. Understand the problem deeply. Carefully read the issue and think critically about what is required. Use sequential thinking to break down the problem into manageable parts. Consider the following:
   - What is the expected behavior?
   - What are the edge cases?
   - What are the potential pitfalls?
   - How does this fit into the larger context of the codebase?
   - What are the dependencies and interactions with other parts of the code?
3. Investigate the codebase. Explore relevant files, search for key functions, and gather context.
4. Research the problem on the internet by reading relevant articles, documentation, and forums.
5. Develop a clear, step-by-step plan. Break down the fix into manageable, incremental steps. Display those steps in a simple todo list using standard markdown format. Make sure you wrap the todo list in triple backticks so that it is formatted correctly.
6. Implement the fix incrementally. Make small, testable code changes.
7. Debug as needed. Use debugging techniques to isolate and resolve issues.
8. Test frequently. Run tests after each change to verify correctness.
9. Iterate until the root cause is fixed and all tests pass.
10. Reflect and validate comprehensively. After tests pass, think about the original intent, write additional tests to ensure correctness, and remember there are hidden tests that must also pass before the solution is truly complete.
Refer to the detailed sections below for more information on each step.
Fetch Provided URLs
- If the user provides a URL, use the functions.fetch_webpage tool to retrieve the content of the provided URL.
- After fetching, review the content returned by the fetch tool.
- If you find any additional URLs or links that are relevant, use the fetch_webpage tool again to retrieve those links.
- Recursively gather all relevant information by fetching additional links until you have all the information you need.
Deeply Understand the Problem
Carefully read the issue and think hard about a plan to solve it before coding.
Codebase Investigation
- Explore relevant files and directories.
- Search for key functions, classes, or variables related to the issue.
- Read and understand relevant code snippets.
- Identify the root cause of the problem.
- Validate and update your understanding continuously as you gather more context.
Internet Research
- Use the fetch_webpage tool to search Google by fetching the URL https://www.google.com/search?q=your+search+query.
- After fetching, review the content returned by the fetch tool.
- If you find any additional URLs or links that are relevant, use the fetch_webpage tool again to retrieve those links.
- Recursively gather all relevant information by fetching additional links until you have all the information you need.
Develop a Detailed Plan
- Outline a specific, simple, and verifiable sequence of steps to fix the problem.
- Create a todo list in markdown format to track your progress.
- Each time you complete a step, check it off using [x] syntax.
- Each time you check off a step, display the updated todo list to the user.
- Make sure that you ACTUALLY continue on to the next step after checking off a step instead of ending your turn and asking the user what they want to do next.
Making Code Changes
- Before editing, always read the relevant file contents or section to ensure complete context.
- Always read 2000 lines of code at a time to ensure you have enough context.
- If a patch is not applied correctly, attempt to reapply it.
- Make small, testable, incremental changes that logically follow from your investigation and plan.
Debugging
- Use the get_errors tool to check for any problems in the code.
- Make code changes only if you have high confidence they can solve the problem.
- When debugging, try to determine the root cause rather than addressing symptoms.
- Debug for as long as needed to identify the root cause and identify a fix.
- Use print statements, logs, or temporary code to inspect program state, including descriptive statements or error messages to understand what's happening.
- To test hypotheses, you can also add test statements or functions.
- Revisit your assumptions if unexpected behavior occurs.
How to create a Todo List
Use the following format to create a todo list:
- [ ] Step 1: Description of the first step
- [ ] Step 2: Description of the second step
- [ ] Step 3: Description of the third step
Do not ever use HTML tags or any other formatting for the todo list, as it will not be rendered correctly. Always use the markdown format shown above.
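The todo format can also be produced programmatically. The following is a minimal Rust sketch, not part of the repository: `TodoItem` and `render_todo` are hypothetical names, and the unchecked `[ ]` box is assumed from standard markdown task-list syntax to pair with the `[x]` check-off mentioned in the planning step.

```rust
/// One step in the plan; `done` controls whether its box is checked.
/// (Hypothetical type, introduced only for this illustration.)
struct TodoItem {
    description: String,
    done: bool,
}

/// Renders the steps as a markdown checklist, one `- [ ]` or `- [x]` line per step.
fn render_todo(items: &[TodoItem]) -> String {
    items
        .iter()
        .enumerate()
        .map(|(index, item)| {
            let mark = if item.done { "x" } else { " " };
            format!("- [{}] Step {}: {}", mark, index + 1, item.description)
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```

Re-rendering the whole list after each completed step keeps the displayed todo list in sync with actual progress.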
Communication Guidelines
Always communicate clearly and concisely in a casual, friendly yet professional tone.
Task logs
At the end of each turn include a "Task logs" section with a concise, actionable summary:
- Actions: one-line list of key actions performed this turn.
- Decisions: one-line list of key decisions or assumptions.
- Next steps: one-line list of immediate follow-ups or test steps.
- Lessons/insights: one-line summary of what was learned.
Save the log in the /logs directory using this filename template:
YYYY-MM-DD-HH-mm-beastmode-chatmode-log.md
Example: 2024-06-15-14-30-refactoring-code.md
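As an illustration of the timestamped filename pattern, here is a small Rust sketch. `task_log_filename` and its `slug` parameter are hypothetical; the date and time parts are passed in explicitly so the example needs no clock or external crate.

```rust
/// Builds a task-log filename in the YYYY-MM-DD-HH-mm-<slug>.md shape.
/// `slug` is a hypothetical parameter for the descriptive suffix
/// (for example "beastmode-chatmode-log" or "refactoring-code").
fn task_log_filename(year: u16, month: u8, day: u8, hour: u8, minute: u8, slug: &str) -> String {
    format!(
        "{:04}-{:02}-{:02}-{:02}-{:02}-{}.md",
        year, month, day, hour, minute, slug
    )
}
```

Zero-padding every field keeps the files sorted chronologically when listed by name.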
Do NOT create a user-facing "comprehensive summary" that starts with or resembles: "Great! Now let me create a comprehensive summary ..." This must never be produced. Instead, use only concise, machine-actionable "Task logs" at the end of each turn. This overrides previous SUMMARY actions and legacy logging.
NEVER CREATE META DOCUMENTATION SUMMARIES ! Only use concise "Task logs" as described above.
How you communicate your thoughts
- "Let me fetch the URL you provided to gather more information."
- "Ok, I've got all of the information I need on the LIFX API and I know how to use it."
- "Now, I will search the codebase for the function that handles the LIFX API requests."
- "I need to update several files here - stand by"
- "OK! Now let's run the tests to make sure everything is working correctly."
- "Whelp - I see we have some problems. Let's fix those up."
Repository Guidelines
EdgeQuake is an advanced Retrieval-Augmented Generation (RAG) framework implemented in Rust, designed to enhance information retrieval and generation through graph-based knowledge representation.
You must respect SRP and DRY principles, and keep functions small and focused. Always look for opportunities to refactor and improve code quality as you work.
Project Structure & Module Organization
- edgequake/crates/: Core Rust crates
  - edgequake-core/: Orchestration layer with pipeline and EdgeQuake API
  - edgequake-llm/: LLM provider implementations (OpenAI, Mock)
  - edgequake-storage/: Storage adapters (Memory, PostgreSQL AGE)
  - edgequake-api/: REST API service with Axum
  - edgequake-pipeline/: Document processing pipeline
  - edgequake-query/: Query engine for knowledge graph
- edgequake/examples/: Production examples and demos
- edgequake/tests/: Integration and E2E tests
- lightrag/: Legacy Python implementation (being replaced)
- lightrag_webui/: React 19 + TypeScript client driven by Bun + Vite
- docs/: Comprehensive documentation including production guides
Important: Keep files small and modular for maintainability.
Build, Test, and Development Commands
- cargo build: Build the entire workspace
- cargo test: Run all tests (uses mock provider by default)
- export OPENAI_API_KEY="sk-..." && cargo test: Run tests with real OpenAI provider
- cargo run --example production_pipeline: Run production example with real LLM
- cargo clippy: Lint Rust code before committing
- cargo fmt: Format Rust code
- bun install, bun run dev, bun run build, bun test: Manage web UI workflow
Quick Start with make
The make dev command starts the full stack with Ollama as the default provider:
# Start with Ollama (default)
make dev
# Start with OpenAI provider available for runtime switching
export OPENAI_API_KEY="sk-your-key"
make dev
# Check service status
make status
When OPENAI_API_KEY is set, you can switch between Ollama and OpenAI providers at runtime via the query UI or API.
Background Testing (Agentic Mode)
For automated testing or continuous integration, use background mode to run services non-interactively:
# Start full stack in background (database + backend + frontend)
make dev-bg
# Check service health
make status
# View logs
tail -f /tmp/edgequake-backend.log
tail -f /tmp/edgequake-frontend.log
# Stop all services
make stop
Alternative commands:
make backend-bg: Start backend only in background with PostgreSQL
Note: In-memory storage mode has been removed. DATABASE_URL is now required for all server modes. Running without a database will cause the server to exit with error code 1.
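A startup guard for this requirement might look like the sketch below. It is an illustration only, not EdgeQuake's actual startup code: the function names and the error message text are assumptions, and the environment is passed in as a lookup closure so the check can be exercised without touching real process variables.

```rust
/// Returns the database URL, or an error message when it is missing or blank.
/// `lookup` abstracts the environment; in practice pass `|k| std::env::var(k).ok()`.
fn require_database_url<F>(lookup: F) -> Result<String, String>
where
    F: Fn(&str) -> Option<String>,
{
    match lookup("DATABASE_URL") {
        Some(url) if !url.trim().is_empty() => Ok(url),
        // Message wording is illustrative, not the server's real output.
        _ => Err("DATABASE_URL is required: in-memory storage mode has been removed".to_string()),
    }
}

/// Reads the real environment and exits with code 1 when the variable is absent.
fn database_url_or_exit() -> String {
    require_database_url(|key| std::env::var(key).ok()).unwrap_or_else(|message| {
        eprintln!("{message}");
        std::process::exit(1)
    })
}
```

Keeping the check in a pure function and the exit in a thin wrapper makes the rule easy to unit test.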
Service Management & E2E Testing
Service Health Checks
After starting services with make dev-bg, verify each component is healthy:
# Backend health check (should return JSON with "status":"healthy")
curl http://localhost:8080/health
# Frontend health check (should return HTML)
curl -I http://localhost:3000
# PostgreSQL health check
docker ps | grep edgequake-postgres
Expected Backend Response:
{
"status": "healthy",
"version": "0.1.0",
"storage_mode": "postgresql",
"workspace_id": "default",
"components": {
"kv_storage": true,
"vector_storage": true,
"graph_storage": true,
"llm_provider": true
},
"llm_provider_name": "ollama"
}
Log File Locations
When services run in background mode, logs are written to:
- Backend: /tmp/edgequake-backend.log
- Frontend: /tmp/edgequake-frontend.log
Viewing Logs:
# Tail backend logs
tail -f /tmp/edgequake-backend.log
# Tail frontend logs
tail -f /tmp/edgequake-frontend.log
# Search for errors
grep -i error /tmp/edgequake-backend.log
grep -i "failed\|error" /tmp/edgequake-frontend.log
Port Mappings
| Service | Port | Purpose |
|---|---|---|
| Frontend (Next.js) | 3000 | Web UI |
| Backend (Axum) | 8080 | REST API |
| PostgreSQL | 5432 | Database |
| Ollama (optional) | 11434 | Local LLM provider |
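When a service seems unreachable, a quick TCP probe of the ports in the table can narrow down which one is missing. The sketch below uses only the Rust standard library; `SERVICE_PORTS`, `port_is_open`, and `closed_services` are hypothetical names, and it assumes every service listens on 127.0.0.1.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Ports from the table above, paired with the service expected on each.
const SERVICE_PORTS: [(&str, u16); 4] = [
    ("frontend", 3000),
    ("backend", 8080),
    ("postgresql", 5432),
    ("ollama", 11434),
];

/// Returns true when a TCP connection to 127.0.0.1:`port` succeeds within `timeout`.
fn port_is_open(port: u16, timeout: Duration) -> bool {
    let addr = SocketAddr::from(([127, 0, 0, 1], port));
    TcpStream::connect_timeout(&addr, timeout).is_ok()
}

/// Lists the services whose port did not accept a connection.
fn closed_services(timeout: Duration) -> Vec<&'static str> {
    SERVICE_PORTS
        .iter()
        .filter(|(_, port)| !port_is_open(*port, timeout))
        .map(|(name, _)| *name)
        .collect()
}
```

An open port only shows that something is listening; the health endpoints and `make status` remain the real checks.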
Known Issues & Workarounds
Frontend PID Management
Issue: Frontend process may die but PID file (edgequake_webui/build_pid.txt) remains, causing make stop to fail silently.
Workaround:
# Check if frontend is actually running
lsof -i :3000
# If port is free but PID file exists, manually restart:
cd edgequake_webui
rm -f build_pid.txt
bun run dev &
echo $! > build_pid.txt
Permanent Fix: See specs/001-e2e-upload-pdf/ooda/iteration_03/ (planned enhancement).
Ollama Service Required
Issue: Entity extraction fails with "Network error" if Ollama is not running.
Workaround:
# Check Ollama status
curl http://localhost:11434/api/tags
# Start Ollama if not running
ollama serve &
# Or use OpenAI instead:
export OPENAI_API_KEY="sk-your-key"
make dev-bg
Error Symptom: Documents show status "Failed" with message "Pipeline processing failed: Entity extraction e...".
MCP Playwright E2E Testing
EdgeQuake uses MCP Playwright for interactive E2E testing. This allows AI agents to automate browser interactions.
Prerequisites
# Install Playwright browsers (via MCP tool or manually)
cd edgequake_webui
pnpm install
npx playwright install chrome
Test Execution
Via MCP Tool (for AI agents):
// Navigate to documents page
mcp_microsoft_pla_browser_navigate({ url: "http://localhost:3000/documents" });
// Take snapshot
mcp_microsoft_pla_browser_snapshot({});
// Click element
mcp_microsoft_pla_browser_click({ ref: "e175", element: "First document row" });
Via Command Line (for humans):
cd edgequake_webui
pnpm exec playwright test
pnpm exec playwright test --ui # Interactive mode
pnpm exec playwright show-report # View last run
Test Structure
edgequake_webui/e2e/
├── markdown-test.spec.ts # Markdown rendering tests
├── upload-pdf.spec.ts # PDF upload flow (planned)
└── side-by-side-viewer.spec.ts # Side-by-side viewer (planned)
Common E2E Test Scenarios
1. Verify PDF Upload & Display:
test("upload PDF and view side-by-side", async ({ page }) => {
await page.goto("http://localhost:3000/documents");
await page.click('button:has-text("Upload PDF")');
await page.setInputFiles(
'input[type="file"]',
"zz_test_docs/lighrag_2410.05779v3.pdf",
);
await page.waitForSelector('[data-testid="side-by-side-viewer"]');
// Verify PDF panel
await expect(page.locator('[data-testid="pdf-viewer"]')).toBeVisible();
// Verify markdown panel
await expect(page.locator('[data-testid="markdown-renderer"]')).toBeVisible();
});
2. Check Entity Extraction Progress:
test("monitor entity extraction", async ({ page }) => {
await page.goto(
"http://localhost:3000/documents/f6fa9cad-bbff-4892-a855-3bd7d70da044",
);
// Wait for processing to complete (may take 5-10 minutes)
await page.waitForSelector('text="Completed"', { timeout: 600000 });
// Verify entities extracted
const entityCount = await page
.locator('[data-testid="entity-count"]')
.textContent();
// textContent() may resolve to null, so fall back to "0" and parse in base 10
expect(parseInt(entityCount ?? "0", 10)).toBeGreaterThan(0);
});
Troubleshooting Guide
Problem: Frontend Won't Start
Symptoms:
- make dev-bg completes but http://localhost:3000 returns "Connection refused"
- /tmp/edgequake-frontend.log shows compilation errors or is empty
Solution:
# Check if process is running
ps aux | grep "bun run dev"
# Kill stale process
killall -9 node bun
# Remove PID file
rm -f edgequake_webui/build_pid.txt
# Restart manually
cd edgequake_webui
bun install # Ensure dependencies are installed
bun run dev &
echo $! > build_pid.txt
# Verify it started
curl -I http://localhost:3000
Problem: Backend Won't Start
Symptoms:
- make dev-bg hangs or fails
- http://localhost:8080/health returns "Connection refused"
- /tmp/edgequake-backend.log shows database errors
Solution:
# Check PostgreSQL container
docker ps | grep edgequake-postgres
# If not running, start it:
make postgres-start
# Wait 5 seconds for DB to be ready
sleep 5
# Restart backend
make backend-bg
# Verify it started
curl http://localhost:8080/health
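The fixed `sleep 5` in the steps above is a simple wait; polling until the dependency answers is one alternative when startup time varies. A minimal Rust sketch, assuming a caller-supplied readiness check (`wait_until_ready` is a hypothetical helper, not a project function):

```rust
use std::thread::sleep;
use std::time::Duration;

/// Calls `is_ready` up to `max_attempts` times, pausing `delay` between tries.
/// Returns the 1-based attempt on which the check passed, or None if it never did.
fn wait_until_ready<F>(mut is_ready: F, max_attempts: u32, delay: Duration) -> Option<u32>
where
    F: FnMut() -> bool,
{
    for attempt in 1..=max_attempts {
        if is_ready() {
            return Some(attempt);
        }
        // No point sleeping after the final failed attempt.
        if attempt < max_attempts {
            sleep(delay);
        }
    }
    None
}
```

The readiness closure could, for example, attempt a TCP connection to port 5432 before the backend is restarted.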
Problem: PDF Extraction Fails
Symptoms:
- Document status shows "Failed" with "Pipeline processing failed: ..."
- Side-by-side viewer shows PDF but no markdown
Solution (v0.4.0+):
Since v0.4.0, pdfium is embedded in the binary via edgequake-pdf2md v0.4.1. No external
library or environment variable setup is needed. If PDF extraction fails, check:
# 1. Ensure the vision LLM provider is accessible
curl http://localhost:8080/health | python3 -m json.tool
# 2. Verify Ollama is running (if using Ollama vision)
curl http://localhost:11434/api/tags
# 3. Check backend logs for the specific error
grep -i "Failed\|error" /tmp/edgequake-backend.log | tail -20
# 4. Restart and retry
make stop
make dev-bg
Note: PDFIUM_DYNAMIC_LIB_PATH is no longer required as of v0.4.0. The binary includes pdfium compiled for your platform via pdfium-auto.
Problem: Entity Extraction Fails
Symptoms:
- Document status shows "Failed" with "Network error: error sending request for url (http://localhost:11434/api/chat)"
- PDF and markdown display correctly, but no entities extracted
Solution:
# Check if Ollama is running
curl http://localhost:11434/api/tags
# If not running:
ollama serve &
# Verify models are pulled:
ollama list
# If qwen2.5 is missing:
ollama pull qwen2.5:latest
# Re-upload document to retry extraction
# (or wait for automatic retry in future iteration)
Alternative: Use OpenAI instead of Ollama:
export OPENAI_API_KEY="sk-your-key"
make stop
make dev-bg
Problem: Stale Frontend Cache
Symptoms:
- Document shows "Processing..." indefinitely even though backend shows "Completed"
- Side-by-side viewer displays old content
Solution:
# Hard refresh in browser
# Chrome/Firefox: Cmd+Shift+R (macOS) or Ctrl+Shift+R (Windows/Linux)
# Or clear React Query cache by restarting frontend:
make stop
make dev-bg
# Or use incognito/private browsing mode
OODA Loop Documentation
This service management guide was created during OODA Iteration 02 of the PDF upload/extraction fix.
Reference: specs/001-e2e-upload-pdf/ooda/iteration_02/
Key Learnings:
- make dev-bg reliably starts all services with correct environment variables
- MCP Playwright enables AI-driven E2E testing for verification
- Frontend PID management needs improvement (see iteration 03 plan)
- Ollama service must be running for entity extraction (separate from PDF extraction)
Mission Status: ✅ PDF extraction and side-by-side display verified working (2026-02-06)
Developer Workflow Guide
Mission-Tested Workflow: This guide is based on learnings from the Reliable Ingestion Mission (OODA iterations 01-05). Follow these steps for a smooth development experience.
Prerequisites Checklist
Before starting development, ensure you have:
- Docker installed and running (for PostgreSQL)
- Rust toolchain (run rustup update to ensure latest)
- Ollama installed for local LLM (brew install ollama on macOS)
- Node.js & pnpm for frontend development
- PostgreSQL knowledge: EdgeQuake uses pgvector + Apache AGE
Step-by-Step Startup
# 1. Clone and navigate to repository
cd edgequake
# 2. Start PostgreSQL database (required - no memory fallback)
make postgres-start
# 3. Start Ollama (required for entity extraction)
ollama serve &
# 4. Pull required model (first time only)
ollama pull gemma3:latest
# 5. Start full stack
make dev
# 6. Verify all services are healthy
make status
Service Verification Commands
| Check | Command | Expected Result |
|---|---|---|
| Backend API | curl http://localhost:8080/health | {"status":"healthy","storage_mode":"postgresql"} |
| Frontend UI | curl -I http://localhost:3000 | HTTP 200 OK |
| PostgreSQL | docker ps \| grep postgres | Container running |
| Ollama | curl http://localhost:11434/api/tags | List of models |
LLM Provider Selection
EdgeQuake supports two LLM providers at runtime:
| Provider | When to Use | Setup |
|---|---|---|
| Ollama (default) | Development, local testing, no API costs | ollama serve & |
| OpenAI | Production, higher quality extraction | export OPENAI_API_KEY="sk-..." |
Important: If using OpenAI, prefer gpt-5-nano over deprecated gpt-4o-mini.
Testing After Code Changes
# Quick test for specific crate
cargo test -p edgequake-api --lib
# Full test suite (641+ tests)
cargo test --workspace --lib
# Linting (must pass before commit)
cargo clippy --all-targets
# Format check
cargo fmt --check
Common Development Scenarios
Scenario 1: Testing PDF Upload
# 1. Ensure services are running
make status
# 2. Open browser to documents page
open http://localhost:3000/documents
# 3. Upload a test PDF from:
# - zz_test_docs/lighrag_2410.05779v3.pdf
# - zz-explore/EMILE_FREY/*.pdf
# 4. Watch status change: Uploading → Processing → Completed
Scenario 2: Debugging Entity Extraction
# Check Ollama is responding
curl http://localhost:11434/api/tags
# View backend logs for extraction details
tail -f /tmp/edgequake-backend.log | grep -i entity
# If extraction fails, check pipeline errors:
grep -i "error\|failed" /tmp/edgequake-backend.log
Scenario 3: Database Issues
# Check if PostgreSQL is running
docker ps | grep edgequake-postgres
# Restart database if needed
make postgres-stop
make postgres-start
# Wait for database to be ready
sleep 5
# Restart backend
make backend-bg
Environment Variables Reference
| Variable | Required | Purpose | Example |
|---|---|---|---|
| DATABASE_URL | ✅ Yes | PostgreSQL connection | postgres://edgequake:edgequake@localhost/edgequake |
| OPENAI_API_KEY | Optional | Enable OpenAI provider | sk-proj-... |
| EDGEQUAKE_LLM_PROVIDER | Optional | Override LLM provider | openai, ollama, lmstudio, mock |
| EDGEQUAKE_EMBEDDING_PROVIDER | Optional | Hybrid mode: separate embedding | ollama (use with EDGEQUAKE_LLM_PROVIDER=openai) |
| OLLAMA_HOST | Optional | Ollama server URL | http://localhost:11434 |
| OLLAMA_EMBEDDING_MODEL | Optional | Ollama embedding model | embeddinggemma:latest |
| RUST_LOG | Optional | Logging level | debug, info, warn |
Hybrid Provider Mode (SPEC-033)
Use different providers for LLM and embeddings. Useful when:
- OpenAI has LLM quota but not embedding quota
- Cost savings (free local embeddings with cloud LLM)
- Privacy (local embeddings, cloud LLM quality)
# Example: OpenAI for LLM, Ollama for embeddings
export EDGEQUAKE_LLM_PROVIDER=openai
export EDGEQUAKE_EMBEDDING_PROVIDER=ollama
export OPENAI_API_KEY=sk-...
export OLLAMA_HOST=http://localhost:11434
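To make the split concrete, the sketch below resolves the two providers from environment-style key/value pairs. It is an illustration under stated assumptions, not EdgeQuake's implementation: `ProviderChoice` and `resolve_providers` are hypothetical, and the fallbacks (LLM defaults to "ollama", embeddings default to the LLM provider) are guesses rather than documented behavior.

```rust
use std::collections::HashMap;

/// Provider names chosen for text generation and for embeddings.
#[derive(Debug, PartialEq)]
struct ProviderChoice {
    llm: String,
    embedding: String,
}

/// Resolves both providers from a map of environment variables.
/// Assumed fallbacks (not confirmed by the source): LLM defaults to "ollama",
/// and the embedding provider defaults to whatever the LLM provider is.
fn resolve_providers(env: &HashMap<String, String>) -> ProviderChoice {
    let llm = env
        .get("EDGEQUAKE_LLM_PROVIDER")
        .cloned()
        .unwrap_or_else(|| "ollama".to_string());
    let embedding = env
        .get("EDGEQUAKE_EMBEDDING_PROVIDER")
        .cloned()
        .unwrap_or_else(|| llm.clone());
    ProviderChoice { llm, embedding }
}
```

With the exports shown above, this would yield `openai` for the LLM and `ollama` for embeddings.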
Troubleshooting Quick Reference
| Problem | Quick Fix |
|---|---|
| "DATABASE_URL not set" | Run make dev instead of cargo run |
| "Connection refused on 8080" | Check PostgreSQL: make postgres-start |
| "Entity extraction failed" | Start Ollama: ollama serve & |
| "Model not found" | Pull model: ollama pull gemma3:latest |
| "Port 3000 in use" | Kill stale process: lsof -ti:3000 \| xargs kill |
| Tests failing | Run cargo test -p <crate> --lib for details |
| "Embedding quota exceeded" | Use hybrid mode: EDGEQUAKE_EMBEDDING_PROVIDER=ollama |
Best Practices (Mission Learnings)
- Always use Makefile commands - They set required environment variables
- Check make status before debugging - Verify all services are healthy
- DATABASE_URL is mandatory - In-memory mode is removed for reliability
- Ollama must be running - Entity extraction depends on it
- Use gpt-5-nano - If using OpenAI, avoid deprecated gpt-4o-mini
- Run tests after changes - cargo test -p <crate> --lib for quick feedback
- Commit frequently - Small, tested changes are easier to debug
- Use hybrid mode for quota issues - OpenAI LLM + Ollama embeddings
LLM Provider Configuration
EdgeQuake supports multiple LLM providers with automatic environment-based selection:
- Mock Provider: Used by default for testing (free, fast, no API key required)
- OpenAI Provider: Automatically used when OPENAI_API_KEY is set
  - Recommended model: gpt-5-nano (cost-effective, excellent for entity extraction)
  - Alternative: gpt-4o-mini is deprecated; migrate to gpt-5-nano
  - Recommended embedding: text-embedding-3-small (1536 dimensions)
- Ollama/LM Studio: Use OpenAI-compatible API mode
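The key-based selection described here can be sketched as follows. This is not the crate's real factory: `LlmProvider`, `select_provider`, and `select_provider_from_env` are hypothetical names, and treating a blank key as absent is an assumption. `Debug` is deliberately not derived so the key cannot be printed by accident.

```rust
/// Which provider to construct. (Hypothetical enum for illustration.)
#[derive(PartialEq)]
enum LlmProvider {
    Mock,
    OpenAi { api_key: String },
}

/// Picks a provider from the optional value of OPENAI_API_KEY.
/// An empty or whitespace-only key is treated as absent (an assumption).
fn select_provider(api_key: Option<String>) -> LlmProvider {
    match api_key {
        Some(key) if !key.trim().is_empty() => LlmProvider::OpenAi { api_key: key },
        _ => LlmProvider::Mock,
    }
}

/// Convenience wrapper that reads the real process environment.
fn select_provider_from_env() -> LlmProvider {
    select_provider(std::env::var("OPENAI_API_KEY").ok())
}
```

Separating the decision from the environment read lets tests cover both branches without setting or clearing real variables.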
Coding Style & Naming Conventions
- Follow Rust standard style guide and formatting with rustfmt
- Use clippy for linting and follow its suggestions
- Prefer idiomatic Rust patterns: Result, Option, async/await
- Use tracing crate for logging, not println!
- Entity names should be normalized: UPPERCASE with underscores (e.g., "SARAH_CHEN")
- Module names: lowercase with underscores (e.g., entity_extraction)
- Struct/Enum names: PascalCase (e.g., EntityExtractor, GraphStorage)
- Front-end code: TypeScript with two-space indentation, functional React components
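As an example of the entity-name convention, here is a minimal normalization sketch. `normalize_entity_name` is a hypothetical helper; the source only gives the target shape ("SARAH_CHEN"), so the handling of punctuation and existing underscores (left untouched here) is an assumption.

```rust
/// Normalizes an entity name to UPPERCASE_WITH_UNDERSCORES.
/// Leading/trailing whitespace is dropped and each run of whitespace becomes one underscore.
/// Punctuation is left as-is, which is an assumption rather than a documented rule.
fn normalize_entity_name(raw: &str) -> String {
    raw.split_whitespace()
        .map(|word| word.to_uppercase())
        .collect::<Vec<_>>()
        .join("_")
}
```

Normalizing before insertion means differently cased or spaced mentions of the same entity map to a single graph node key.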
Testing Guidelines
- Tests live in tests/ directories within each crate
- E2E tests in edgequake/crates/edgequake-core/tests/
- Use #[tokio::test] for async tests
- Tests automatically use mock provider unless OPENAI_API_KEY is set
- Integration tests can be marked with #[cfg(feature = "integration")]
- Run specific test: cargo test --package edgequake-core --test e2e_pipeline
- UI tests: bun test
Production LLM Integration
✅ Status: PRODUCTION READY
The system now supports real LLM providers for production deployment:
- Environment-Based Selection:
  # Development/CI: Uses mock provider (free, fast)
  cargo test
  # Production: Uses real OpenAI provider
  export OPENAI_API_KEY="sk-your-key"
  cargo test
- Provider Factory Pattern:
  - Automatically detects OPENAI_API_KEY environment variable
  - Falls back to smart mock if no API key present
  - No code changes needed between dev and prod
- Quality Validation:
  - Real LLM: 20 entities → 12 unique nodes (40% deduplication)
  - Mock LLM: 9 entities → 6 unique nodes (33% deduplication)
  - Real LLM extracts 2-3x more entities with better quality
- Documentation:
  - Complete guide: docs/production-llm-integration.md (900+ lines)
  - Production readiness: docs/PRODUCTION_READY.md
  - Working example: examples/production_pipeline.rs
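The deduplication percentages quoted under Quality Validation appear consistent with (extracted − unique) / extracted. The sketch below reproduces that arithmetic; `dedup_percent` is a hypothetical helper, and the formula is inferred from the two figures rather than taken from the codebase.

```rust
/// Share of extracted entities removed by merging duplicates, as a whole-number
/// percentage (rounded down). Returns None when nothing was extracted or when
/// the counts are inconsistent (more unique nodes than extracted entities).
fn dedup_percent(extracted: u32, unique_nodes: u32) -> Option<u32> {
    if extracted == 0 || unique_nodes > extracted {
        return None;
    }
    Some((extracted - unique_nodes) * 100 / extracted)
}
```

For the figures above, 20 entities merged into 12 nodes gives 40, and 9 entities merged into 6 nodes gives 33.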
Commit & Pull Request Guidelines
- Use concise, imperative commit subjects (e.g., Fix entity normalization)
- PRs should include summary, operational impact, and linked issues
- Verify cargo clippy, cargo test, and cargo fmt --check pass
- For UI changes, ensure bun test passes
- Document any new environment variables in .env.example
Security & Configuration Tips
- Never commit API keys or secrets
- Use environment variables for configuration (OPENAI_API_KEY, DATABASE_URL, etc.)
- Copy .env.example to .env for local development
- PostgreSQL connections should use connection pooling
- Rate limit API calls to LLM providers
- Monitor costs and usage for production deployments
Automation & Agent Workflow
- Use absolute paths for file operations
- Prefer cargo test over manual rustc invocations
- Run cargo clippy before suggesting code changes
- For LLM testing, check for OPENAI_API_KEY environment variable
- Validate changes by running relevant test suite
- Keep generated code idiomatic Rust (use Result, avoid unwrap() in production)
- Follow the LightRAG entity extraction algorithm for consistency
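To illustrate the guidance on returning Result instead of calling unwrap(), here is a small sketch that parses a port value. `PortError` and `parse_port` are hypothetical and exist only for this example; the point is that failures travel to the caller through `?` rather than panicking.

```rust
use std::fmt;
use std::num::ParseIntError;

/// Reasons a port string can be rejected. (Hypothetical error type.)
#[derive(Debug)]
enum PortError {
    Invalid(ParseIntError),
    Zero,
}

impl fmt::Display for PortError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            PortError::Invalid(err) => write!(f, "invalid port: {err}"),
            PortError::Zero => write!(f, "port must be non-zero"),
        }
    }
}

impl std::error::Error for PortError {}

/// Parses a TCP port, propagating errors with `?` instead of panicking.
fn parse_port(raw: &str) -> Result<u16, PortError> {
    let port: u16 = raw.trim().parse().map_err(PortError::Invalid)?;
    if port == 0 {
        return Err(PortError::Zero);
    }
    Ok(port)
}
```

The caller decides whether to log, retry, or exit, which is harder to arrange once a library function has already panicked.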
Claude Skills
This repository includes reusable SKILL definitions in .github/skills/ for common development workflows:
Available Skills
| Skill | Location | Purpose |
|---|---|---|
| makefile-dev-workflow | .github/skills/makefile-dev-workflow/SKILL.md | Unified development workflow using Makefile commands. Use for starting services, running E2E tests, and managing the full development stack (database, backend, frontend). Start here for dev setup. |
| doc-traceability-validator | .github/skills/doc-traceability-validator/SKILL.md | Validate FEAT/BR/UC traceability chain (224 features, 100% coverage). Detect undocumented features, duplicate IDs, namespace violations, broken references. Distinguishes cross-cutting duplicates (OK) from true collisions (FIX). Use for documentation audits. |
| pdf-markdown-validator | .github/skills/pdf-markdown-validator/SKILL.md | Validate PDF to Markdown conversion quality using multi-dimensional metrics (table accuracy, style preservation, robustness, performance). Use when measuring conversion fidelity and tracking improvements. |
| playwright-ux-ui-capture | .github/skills/playwright-ux-ui-capture/SKILL.md | Capture EdgeQuake WebUI routes with Playwright and write artifacts (screenshots + request JSON). Use when automating UI screenshot collection or updating E2E capture specs. |
| reverse-documentation | .github/skills/reverse-documentation/SKILL.md | Automatically generate comprehensive documentation for Rust and TypeScript codebases by analyzing code structure, patterns, and relationships. Supports trait-based patterns, async operations, and React components. |
| ux-ui-analyze-single-page | .github/skills/ux-ui-analyze-single-page/SKILL.md | Analyze individual pages with Playwright for UX/UI improvements. Use when evaluating specific routes or components. |
| ux-ui-map-page-by-page | .github/skills/ux-ui-map-page-by-page/SKILL.md | Map entire application UI across all pages with Playwright. Use when auditing complete application UX/UI. |
| copilotkit-nextjs-integration | .github/skills/copilotkit-nextjs-integration/SKILL.md | Integrate CopilotKit AI components into Next.js frontend. Use when adding AI-powered UI features. |
Quick reference for common tasks
Getting started with development:
make dev # Start full stack (database + backend + frontend)
make status # Check service health
make stop # Stop all services
See: makefile-dev-workflow SKILL
Validating documentation traceability:
# Validate FEAT IDs in code match docs/features.md
python3 .github/skills/doc-traceability-validator/scripts/validate_features.py \
--code-dir edgequake_webui/src \
--docs-file docs/features.md \
--verbose
# Check namespace violations (wrong team IDs)
python3 .github/skills/doc-traceability-validator/scripts/check_namespace.py \
--code-dir edgequake_webui/src
# Generate missing feature entries from code
python3 .github/skills/doc-traceability-validator/scripts/generate_registry.py \
--code-dir edgequake_webui/src \
--existing docs/features.md
See: doc-traceability-validator SKILL
Running E2E tests:
cd edgequake_webui && pnpm exec playwright test markdown-test.spec.ts
See: makefile-dev-workflow SKILL → E2E Testing section
Validating PDF → Markdown conversions:
python3 .github/skills/pdf-markdown-validator/scripts/validate.py \
--pdf-dir edgequake/crates/edgequake-pdf/test-data \
--gold-dir edgequake/crates/edgequake-pdf/test-data \
--verbose
See: pdf-markdown-validator SKILL
Capturing UI screenshots:
cd edgequake_webui && npx playwright test e2e/<spec>.spec.ts
See: playwright-ux-ui-capture SKILL
Use SRP and DRY principles when developing new features or fixing bugs. For example, if you find yourself copying and pasting code, consider refactoring it into a reusable function or module. This not only reduces code duplication but also makes maintenance easier in the long run. Always aim for clean, modular code that adheres to the project's coding standards and conventions.