RAG

March 6, 2026

🏥 Quick Return to Emergency Room

You are at a specialist desk. Think of this page as a sub-room.
For full triage, doctors on duty, consultation, and prescriptions, go back to the Emergency Room lobby.

A focused hub for Retrieval Augmented Generation failures.
Use this folder when answers exist in the corpus but retrieval or evaluation drifts.
Each page gives guardrails, measurable targets, and direct links to structural fixes. No infra change required.


Orientation: what each page solves

| Page | What it fixes | Typical symptom |
| --- | --- | --- |
| retrieval_drift.md | Keeps retrieve → rerank → reason aligned | Correct facts exist but never show up in the top k |
| hallucination_rag.md | Blocks free text invention inside RAG | Citations look right but answer adds content not in source |
| citation_break.md | Enforces cite then explain schema | Links point to the wrong snippet or disappear on retry |
| hybrid_failure.md | Makes BM25 + ANN + reranker agree | Hybrid worse than a single retriever |
| index_skew.md | Recovers broken or stale indexes | Index looks healthy yet recall is low |
| context_drift.md | Stabilizes header order and prompt state | Answers flip between runs with only header changes |
| entropy_collapse.md | Caps chain growth and noise in long flows | Steps balloon, chain never lands |
| eval_drift.md | Makes eval runs deterministic | Metrics vary across identical replays |

When to use this folder

  • Correct facts exist in the corpus but never appear in answers
  • Citations break, hallucinations creep in, or snippets drift
  • Hybrid retrievers perform worse than single retrievers
  • Index looks healthy but coverage remains low
  • Evaluation metrics vary across identical runs

Acceptance targets

  • ΔS(question, retrieved) ≤ 0.45
  • Coverage of target section ≥ 0.70
  • λ_observe convergent across 3 paraphrases and 2 seeds
  • Eval variance ≤ 0.05 across 5 replays
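The targets above can be turned into a small pass/fail gate. The following is a minimal sketch, not part of the WFGY tooling: it assumes you already measure ΔS, coverage, and the λ state yourself, and that "eval variance" means the population variance of a scalar eval score. `ProbeRun`, `check_targets`, and the `"convergent"` label are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from statistics import pvariance

# Thresholds copied from the acceptance targets.
MAX_DELTA_S = 0.45
MIN_COVERAGE = 0.70
MAX_EVAL_VARIANCE = 0.05
REQUIRED_PROBE_RUNS = 3 * 2   # 3 paraphrases x 2 seeds
REQUIRED_REPLAYS = 5


@dataclass
class ProbeRun:
    """One probe run (hypothetical container for values you measure yourself)."""
    delta_s: float      # ΔS(question, retrieved)
    coverage: float     # fraction of the target section that was retrieved
    lambda_state: str   # observed λ state; the label vocabulary is an assumption


def check_targets(runs, eval_scores):
    """Return a dict mapping each acceptance target to True (met) or False."""
    return {
        "delta_s": all(r.delta_s <= MAX_DELTA_S for r in runs),
        "coverage": all(r.coverage >= MIN_COVERAGE for r in runs),
        "lambda": len(runs) >= REQUIRED_PROBE_RUNS
        and all(r.lambda_state == "convergent" for r in runs),
        # Assumption: variance is the population variance of the replay scores.
        "eval_variance": len(eval_scores) >= REQUIRED_REPLAYS
        and pvariance(eval_scores) <= MAX_EVAL_VARIANCE,
    }


if __name__ == "__main__":
    runs = [ProbeRun(0.40, 0.75, "convergent") for _ in range(6)]
    print(check_targets(runs, [0.80, 0.81, 0.79, 0.80, 0.80]))
```

If your pipeline defines variance differently (for example as max minus min across replays), swap the `eval_variance` line accordingly.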

Symptoms → exact fixes

| Symptom | Likely cause | Open this |
| --- | --- | --- |
| High similarity yet wrong meaning | metric or analyzer mismatch | Vectorstore Fragmentation · Embedding ≠ Semantic |
| Correct section never retrieved | fragmented store or missing anchors | retrieval_drift.md · citation_break.md |
| Hybrid worse than single | query split or mis weighted rerank | hybrid_failure.md |
| Citations unstable or missing | schema not enforced | citation_break.md |
| Answers flip between runs | prompt header reordering or λ variance | context_drift.md |
| Index “healthy” but recall low | stale build, analyzer mismatch | index_skew.md |
| Eval scores noisy across replays | non deterministic eval path | eval_drift.md |
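Before opening a fix page, it helps to confirm the symptom with numbers. The sketch below is one possible way to check the "Hybrid worse than single" and "recall low" rows against a small gold set. It assumes each retriever returns a ranked list of snippet ids; the function names and the sample ids are hypothetical.

```python
def recall_at_k(retrieved_ids, gold_ids, k):
    """Fraction of gold ids that appear in the first k retrieved ids."""
    gold = set(gold_ids)
    if not gold:
        raise ValueError("gold_ids must not be empty")
    return len(set(retrieved_ids[:k]) & gold) / len(gold)


def compare_retrievers(results, gold_ids, k=5):
    """results maps a retriever name to its ranked snippet ids; returns recall@k per name."""
    return {name: recall_at_k(ids, gold_ids, k) for name, ids in results.items()}


def hybrid_is_worse(scores, hybrid_name="hybrid"):
    """True when the hybrid retriever scores below the best single retriever."""
    singles = [score for name, score in scores.items() if name != hybrid_name]
    return bool(singles) and scores[hybrid_name] < max(singles)


if __name__ == "__main__":
    gold = ["s3", "s7"]
    results = {
        "bm25": ["s3", "s1", "s7"],
        "ann": ["s7", "s2", "s9"],
        "hybrid": ["s1", "s2", "s9"],
    }
    scores = compare_retrievers(results, gold, k=3)
    print(scores)                   # {'bm25': 1.0, 'ann': 0.5, 'hybrid': 0.0}
    print(hybrid_is_worse(scores))  # True
```

A single question is only a smoke test; averaging recall@k over the whole gold set gives a steadier signal.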

60 second fix checklist

  1. Lock metrics and analyzers
    One embedding family per field. One distance metric. Same analyzer on write and read.
    Use: Vector DBs & Stores

  2. Enforce the snippet contract
    Required: snippet_id, section_id, source_url, offsets, tokens.
    Use: Retrieval Traceability · Data Contracts

  3. Measure ΔS and λ
    Three paraphrases, two seeds. Alert when ΔS ≥ 0.60 or λ flips.
    Use: Context Drift

  4. Add a deterministic reranker
    Keep BM25 and ANN candidate lists. Detect query split and resolve.
    Use: hybrid_failure.md

  5. Rebuild where needed
    Follow the rebuild order with a small gold set.
    Use: Retrieval Playbook
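Steps 2 and 4 of the checklist can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the repository's implementation: the field types (offsets as a start/end pair, tokens as a count) are guesses, and reciprocal rank fusion is a stand-in for whatever reranker you use, chosen here only because it is easy to make deterministic. Query split detection is not covered.

```python
from dataclasses import dataclass

# Field names come from the snippet contract in step 2.
REQUIRED_FIELDS = ("snippet_id", "section_id", "source_url", "offsets", "tokens")


@dataclass(frozen=True)
class Snippet:
    snippet_id: str
    section_id: str
    source_url: str
    offsets: tuple  # (start, end) character offsets; this shape is an assumption
    tokens: int     # token count; this type is an assumption


def validate_snippet(record):
    """Reject any retrieved record that does not satisfy the snippet contract."""
    missing = [f for f in REQUIRED_FIELDS if record.get(f) in (None, "")]
    if missing:
        raise ValueError(f"snippet contract violated, missing: {missing}")
    start, end = record["offsets"]
    if not 0 <= start < end:
        raise ValueError("offsets must satisfy 0 <= start < end")
    return Snippet(record["snippet_id"], record["section_id"],
                   record["source_url"], (start, end), int(record["tokens"]))


def fuse_deterministic(bm25_ids, ann_ids, k=60):
    """Reciprocal rank fusion over both candidate lists.

    Ties are broken by snippet id, so identical inputs always give the same order.
    """
    scores = {}
    for ranked in (bm25_ids, ann_ids):
        for rank, sid in enumerate(ranked, start=1):
            scores[sid] = scores.get(sid, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda sid: (-scores[sid], sid))


if __name__ == "__main__":
    snippet = validate_snippet({
        "snippet_id": "s1", "section_id": "sec-2",
        "source_url": "https://example.com/doc",
        "offsets": [10, 42], "tokens": 8,
    })
    print(snippet.offsets)                                       # (10, 42)
    print(fuse_deterministic(["a", "b", "c"], ["b", "a", "d"]))  # ['a', 'b', 'c', 'd']
```

Both candidate lists stay intact as inputs, so you can still log them separately when the fused order looks wrong.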


Vector DBs — jump if store specific


Minimal probe pack you can paste

Context: TXT OS and WFGY pages are loaded.

Task:
- For question Q, log ΔS(Q, retrieved) and λ across 3 paraphrases and 2 seeds.
- Enforce cite-then-explain with the traceability schema.
- If ΔS ≥ 0.60 or λ flips, return the smallest structural change that
  pushes ΔS ≤ 0.45 and coverage ≥ 0.70.
- Use BBMC, BBCR, BBPF, BBAM when relevant.

Return JSON only:
{ "citations": [...], "ΔS": 0.xx, "λ_state": "<>", "coverage": 0.xx, "next_fix": "..." }
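If you run the probe pack from a script, the JSON reply can be checked mechanically. The sketch below assumes the model returns exactly the keys listed above with numeric ΔS and coverage values; `parse_probe_response`, `lambda_flipped`, and `needs_fix` are hypothetical helper names, and the sample reply is made up for illustration.

```python
import json

# Keys taken from the "Return JSON only" template.
REQUIRED_KEYS = {"citations", "ΔS", "λ_state", "coverage", "next_fix"}


def parse_probe_response(raw):
    """Parse one probe reply and fail loudly if a required key is missing."""
    data = json.loads(raw)
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"probe response missing keys: {sorted(missing)}")
    return data


def lambda_flipped(responses):
    """True when λ_state is not identical across the paraphrase/seed runs."""
    return len({r["λ_state"] for r in responses}) > 1


def needs_fix(responses):
    """Alert rule from the task: any ΔS ≥ 0.60, or λ flips between runs."""
    return any(r["ΔS"] >= 0.60 for r in responses) or lambda_flipped(responses)


if __name__ == "__main__":
    raw = ('{"citations": ["s1"], "ΔS": 0.62, "λ_state": "divergent", '
           '"coverage": 0.55, "next_fix": "rebuild index"}')
    reply = parse_probe_response(raw)
    print(needs_fix([reply]))  # True
```

Collect one parsed reply per paraphrase and seed, then pass the full list to `needs_fix` so the λ comparison sees every run.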

🔗 Quick-Start Downloads (60 sec)

| Tool | Link | 3-Step Setup |
| --- | --- | --- |
| WFGY 1.0 PDF | Engine Paper | 1️⃣ Download · 2️⃣ Upload to your LLM · 3️⃣ Ask “Answer using WFGY + |
| TXT OS (plain-text OS) | TXTOS.txt | 1️⃣ Download · 2️⃣ Paste into any LLM chat · 3️⃣ Type “hello world” and the OS boots instantly |

Explore More

| Layer | Page | What it’s for |
| --- | --- | --- |
| ⭐ Proof | WFGY Recognition Map | External citations, integrations, and ecosystem proof |
| ⚙️ Engine | WFGY 1.0 | Original PDF tension engine and early logic sketch (legacy reference) |
| ⚙️ Engine | WFGY 2.0 | Production tension kernel for RAG and agent systems |
| ⚙️ Engine | WFGY 3.0 | TXT based Singularity tension engine (131 S class set) |
| 🗺️ Map | Problem Map 1.0 | Flagship 16 problem RAG failure taxonomy and fix map |
| 🗺️ Map | Problem Map 2.0 | Global Debug Card for RAG and agent pipeline diagnosis |
| 🗺️ Map | Problem Map 3.0 | Global AI troubleshooting atlas and failure pattern map |
| 🧰 App | TXT OS | .txt semantic OS with fast bootstrap |
| 🧰 App | Blah Blah Blah | Abstract and paradox Q&A built on TXT OS |
| 🧰 App | Blur Blur Blur | Text to image generation with semantic control |
| 🏡 Onboarding | Starter Village | Guided entry point for new users |

If this repository helped, starring it improves discovery so more builders can find the docs and tools.