Retrieval

March 6, 2026

🏥 Quick Return to Emergency Room

You are at a specialist desk.
For full triage and doctors on duty, return here:

WFGY Global Fix Map — main Emergency Room, 300+ structured fixes

WFGY Problem Map 1.0 — 16 reproducible failure modes

Think of this page as a sub-room.
If you want full consultation and prescriptions, go back to the Emergency Room lobby.

Evaluation disclaimer (retrieval)
All retrieval scores and examples in this section come from controlled setups with chosen corpora and prompts.
They help you compare retrieval strategies locally but are not universal rankings of models or vector stores.

A compact hub to stabilize retrieval quality across stacks, models, and stores.
Use this page to route symptoms to the exact structural fix and verify with measurable targets. No infra change required.

Orientation: what each page does

Page	What it solves	Typical symptom
Retrieval Playbook	End-to-end rebuild order and knobs	You fixed one thing and another breaks
Retrieval Traceability	Cite-then-explain schema with required fields	Citations miss the exact section or cannot be verified
Rerankers	Deterministic reranking across BM25 + ANN	Hybrid worse than single retriever
Query Parsing Split	One query, two meanings; detect and route	Answers jump between two unrelated sections
Chunk Alignment	Chunking aligned with the model’s semantic window	Snippets cut mid-thought; anchors missing
ΔS Probes	Quick health check using ΔS and λ_observe	Looks fine by eye but flips across runs
Retrieval Eval Recipes	Deterministic, SDK-free evaluation	No stable way to tell if “better” shipped
Store-Agnostic Guardrails	Locks for metrics, analyzers, versions	Index “healthy” but recall still low

When to use this folder

High similarity but wrong meaning.
Correct facts exist in the corpus but never show up.
Citations inconsistent or missing across steps.
Hybrid retrieval underperforms a single retriever.
Index looks healthy while coverage remains low.

Acceptance targets

ΔS(question, retrieved) ≤ 0.45
Coverage of target section ≥ 0.70
λ_observe convergent across 3 paraphrases and 2 seeds
E_resonance flat on long windows
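
As a rough illustration, the first three targets can be turned into a pass/fail gate. This is a minimal sketch under stated assumptions, not the project's reference implementation: `compute_delta_s`, `ProbeRun`, `check_targets`, and the `"convergent"` label are hypothetical names, and ΔS is assumed here to be 1 minus cosine similarity between two embedding vectors, which this page does not state. E_resonance is left out because no numeric threshold is given for it.

```python
import math
from dataclasses import dataclass
from typing import Sequence

DELTA_S_MAX = 0.45     # ΔS(question, retrieved) target from this page
COVERAGE_MIN = 0.70    # coverage target from this page
REQUIRED_RUNS = 3 * 2  # 3 paraphrases x 2 seeds


def compute_delta_s(question_vec: Sequence[float], retrieved_vec: Sequence[float]) -> float:
    """ASSUMPTION: ΔS = 1 - cosine similarity of the two embeddings."""
    if len(question_vec) != len(retrieved_vec):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(q * r for q, r in zip(question_vec, retrieved_vec))
    norm = math.hypot(*question_vec) * math.hypot(*retrieved_vec)
    if norm == 0.0:
        raise ValueError("ΔS is undefined for a zero-length embedding")
    return 1.0 - dot / norm


@dataclass(frozen=True)
class ProbeRun:
    delta_s: float   # ΔS(question, retrieved) measured for this run
    coverage: float  # fraction of the target section that was retrieved
    lam_state: str   # λ_observe label; the string encoding is hypothetical


def check_targets(runs: Sequence[ProbeRun], convergent_label: str = "convergent") -> dict:
    """Return one pass/fail flag per target; every flag fails if runs are missing."""
    enough = len(runs) >= REQUIRED_RUNS
    return {
        "enough_runs": enough,
        "delta_s_ok": enough and all(r.delta_s <= DELTA_S_MAX for r in runs),
        "coverage_ok": enough and all(r.coverage >= COVERAGE_MIN for r in runs),
        "lambda_ok": enough and all(r.lam_state == convergent_label for r in runs),
    }
```

Requiring every run to pass, rather than averaging, is a deliberate choice in this sketch: a single paraphrase that drifts is exactly the instability the targets are meant to catch.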

Symptoms → exact fixes

Symptom	Likely cause	Open this
High similarity yet wrong answer	Metric or analyzer mismatch	Embedding ≠ Semantic
Correct fact never retrieved	Fragmentation or missing anchors	Vectorstore Fragmentation · Chunking Checklist
Hybrid worse than single	Query parsing split or mis-weighted rerank	Query Parsing Split · Rerankers
Citations missing or unstable	Schema not enforced	Retrieval Traceability · Data Contracts
Answers flip between runs	Prompt header reordering or λ variance	Context Drift · Rerankers

60-second fix checklist

1. Lock metrics and analyzers
   One embedding model per field. One distance metric. Same analyzer for write and read.
   Guide: Store-Agnostic Guardrails
2. Enforce the snippet contract
   Require snippet_id, section_id, source_url, offsets, tokens.
   Guide: Data Contracts
3. Measure ΔS and λ
   Run three paraphrases and two seeds.
   Guide: ΔS Probes
4. Sweep k and rerankers
   Try k in {5, 10, 20}. Keep BM25 and ANN candidate lists.
   Guide: Rerankers
5. Rebuild where needed
   Follow the sequence in the playbook and re-test coverage.
   Guide: Retrieval Playbook
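
The snippet contract from the checklist can be enforced with a small validator. The sketch below is illustrative only: the five field names come from this page, but the shapes assumed for `offsets` (a `[start, end]` pair of integers) and `tokens` (a positive integer count) are assumptions, and `contract_violations` is a hypothetical helper name. The Data Contracts guide is the authority on the real schema.

```python
REQUIRED_FIELDS = ("snippet_id", "section_id", "source_url", "offsets", "tokens")


def contract_violations(snippet: dict) -> list:
    """Return human-readable problems; an empty list means the snippet passes."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in snippet]

    if "offsets" in snippet:
        offsets = snippet["offsets"]
        # ASSUMPTION: offsets is a [start, end] pair of integers with start < end.
        well_formed = (
            isinstance(offsets, (list, tuple))
            and len(offsets) == 2
            and all(isinstance(v, int) for v in offsets)
            and 0 <= offsets[0] < offsets[1]
        )
        if not well_formed:
            problems.append("offsets must be [start, end] with 0 <= start < end")

    if "tokens" in snippet:
        tokens = snippet["tokens"]
        # ASSUMPTION: tokens is the snippet's token count.
        if not (isinstance(tokens, int) and tokens > 0):
            problems.append("tokens must be a positive integer")

    return problems
```

Running this on every retrieved snippet before it reaches the prompt means a missing `section_id` fails loudly at retrieval time instead of surfacing later as an unverifiable citation.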

Checklists — copy before deploy

Checklist	Scope	Link
Retrieval Readiness	Pre-flight: embeddings, analyzers, index, gold set	retrieval_readiness.md
Reranker Sanity	Hybrid reranking health and overlap checks	reranker_sanity.md
Traceability Gate	Contract enforcement for cite-then-explain	traceability_gate.md
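
One way to picture the overlap check named in the Reranker Sanity row, combined with the k values from the 60-second checklist, is to compare the kept BM25 and ANN candidate lists at each k. This is a hedged sketch rather than the contents of reranker_sanity.md: Jaccard overlap is a metric chosen here for illustration, and `overlap_at_k` and `sweep_overlap` are hypothetical names.

```python
from typing import Dict, Sequence


def overlap_at_k(bm25_ids: Sequence[str], ann_ids: Sequence[str], k: int) -> float:
    """Jaccard overlap of the top-k ids from each retriever (0.0 if both are empty)."""
    top_bm25, top_ann = set(bm25_ids[:k]), set(ann_ids[:k])
    union = top_bm25 | top_ann
    return len(top_bm25 & top_ann) / len(union) if union else 0.0


def sweep_overlap(
    bm25_ids: Sequence[str],
    ann_ids: Sequence[str],
    ks: Sequence[int] = (5, 10, 20),
) -> Dict[int, float]:
    """Compute the overlap at every k in the sweep, keeping both lists untouched."""
    return {k: overlap_at_k(bm25_ids, ann_ids, k) for k in ks}
```

Very low overlap at every k suggests the two retrievers are answering different readings of the query, which points toward the Query Parsing Split page. Overlap near 1.0 suggests the hybrid adds little over a single retriever.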

Vector DBs — jump if store specific

Family index: Vector DBs & Stores
Direct store guides: FAISS · Chroma · Qdrant · Weaviate · Milvus · pgvector · Redis · Elasticsearch · Pinecone · Typesense · Vespa

Minimal probe pack you can paste

Context: I loaded TXT OS and the WFGY pages.

Task:
- Given question "Q", log ΔS(Q, retrieved) and λ across 3 paraphrases.
- Enforce cite-then-explain with the traceability schema.
- If ΔS ≥ 0.60 or λ flips, return the smallest structural change to push ΔS ≤ 0.45 and coverage ≥ 0.70.
- Use BBMC, BBCR, BBPF, BBAM when relevant.

Return JSON:
{ "citations": [...], "ΔS": 0.xx, "λ_state": "<>", "coverage": 0.xx, "next_fix": "..." }
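
If you collect the model's filled-in reply programmatically, a small triage step can apply the thresholds from the task above. This is a sketch under assumptions: it expects valid JSON with real numbers in place of the `0.xx` placeholders, `triage` is a hypothetical helper, and the `"watch"` label for results that fall between the two thresholds is introduced here for illustration. Whether λ flipped has to be decided by comparing several runs, so it is passed in as a flag.

```python
import json

EXPECTED_KEYS = {"citations", "ΔS", "λ_state", "coverage", "next_fix"}


def triage(raw_reply: str, lam_flipped: bool = False) -> str:
    """Classify one probe reply as 'fix', 'pass', or 'watch'."""
    reply = json.loads(raw_reply)
    if not isinstance(reply, dict):
        raise ValueError("probe reply must be a JSON object")
    missing = EXPECTED_KEYS - reply.keys()
    if missing:
        raise ValueError(f"probe reply is missing keys: {sorted(missing)}")

    # Trigger from the task text: ΔS >= 0.60 or a λ flip calls for a structural fix.
    if reply["ΔS"] >= 0.60 or lam_flipped:
        return "fix"
    # Targets from the task text: ΔS <= 0.45 and coverage >= 0.70.
    if reply["ΔS"] <= 0.45 and reply["coverage"] >= 0.70:
        return "pass"
    # ASSUMPTION: anything in between is flagged for a closer look.
    return "watch"
```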

🔗 Quick-Start Downloads (60 sec)

Tool	Link	3-Step Setup
WFGY 1.0 PDF	Engine Paper	1️⃣ Download · 2️⃣ Upload to your LLM · 3️⃣ Ask “Answer using WFGY + <your question>”
TXT OS (plain-text OS)	TXTOS.txt	1️⃣ Download · 2️⃣ Paste into any LLM chat · 3️⃣ Type “hello world” — OS boots instantly

Explore More

Layer	Page	What it’s for
⭐ Proof	WFGY Recognition Map	External citations, integrations, and ecosystem proof
⚙️ Engine	WFGY 1.0	Original PDF tension engine and early logic sketch (legacy reference)
⚙️ Engine	WFGY 2.0	Production tension kernel for RAG and agent systems
⚙️ Engine	WFGY 3.0	TXT-based Singularity tension engine (131 S class set)
🗺️ Map	Problem Map 1.0	Flagship 16-problem RAG failure taxonomy and fix map
🗺️ Map	Problem Map 2.0	Global Debug Card for RAG and agent pipeline diagnosis
🗺️ Map	Problem Map 3.0	Global AI troubleshooting atlas and failure pattern map
🧰 App	TXT OS	.txt semantic OS with fast bootstrap
🧰 App	Blah Blah Blah	Abstract and paradox Q&A built on TXT OS
🧰 App	Blur Blur Blur	Text to image generation with semantic control
🏡 Onboarding	Starter Village	Guided entry point for new users

If this repository helped, starring it improves discovery so more builders can find the docs and tools.