Reranker Sanity Checklist

March 6, 2026

Scope: validate a hybrid pipeline where BM25 and ANN feed a deterministic reranker.


Pre-checks

  • Persist BM25 and ANN candidate lists before reranking.
  • Fix a random seed and record model version for the reranker.
  • Store per-candidate features needed by the reranker.
  • Ensure the HyDE prompt and the server use identical tokenization.
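
The first three pre-checks can be covered by writing one replayable record per query. The following is a minimal sketch, not the repository's own tooling: the function name `snapshot_candidates`, the record layout, and the one-JSON-file-per-query convention are all assumptions made for illustration.

```python
import hashlib
import json
from pathlib import Path


def snapshot_candidates(query_id, bm25, ann, features, model_version, seed, out_dir):
    """Write one JSON record per query so a rerank run can be replayed exactly.

    Hypothetical helper: the field names below are illustrative, not a fixed schema.
    """
    record = {
        "query_id": query_id,
        "reranker_model_version": model_version,
        "seed": seed,
        "bm25": bm25,          # ordered list of doc ids from BM25
        "ann": ann,            # ordered list of doc ids from ANN
        "features": features,  # doc id -> feature dict used by the reranker
    }
    # sort_keys yields a canonical serialization, so the hash is reproducible.
    payload = json.dumps(record, sort_keys=True, ensure_ascii=False)
    record_hash = hashlib.sha256(payload.encode("utf-8")).hexdigest()

    path = Path(out_dir) / f"{query_id}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(payload, encoding="utf-8")
    return path, record_hash
```

Comparing the returned hash across runs is a cheap way to confirm that the reranker saw the same inputs before investigating any difference in its output.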

Refs:
Rerankers · Query parsing split


Baseline health

  • BM25 top-k recall on the gold set ≥ 0.60 at k = 50.
  • ANN top-k recall on the gold set ≥ 0.60 at k = 50.
  • Jaccard overlap between the BM25 and ANN top-k lists logged at k = 50 to detect a query split.

If overlap < 0.20 and both recalls meet the ≥ 0.60 threshold, a query split is likely: the two retrievers are surfacing largely different candidates.
Route using the split detector, or run both paths and merge the results.
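
The baseline checks above can be sketched per query as follows. This is an illustrative sketch only: the function names are hypothetical, and the thresholds in the checklist are probably meant as aggregates over the gold set, so in practice the recall values would be averaged across queries before being compared.

```python
def recall_at_k(ranked, gold, k):
    """Fraction of gold doc ids that appear in the top k of a ranked list."""
    gold_set = set(gold)
    if not gold_set:
        return 0.0
    return len(set(ranked[:k]) & gold_set) / len(gold_set)


def jaccard_at_k(a, b, k):
    """Jaccard overlap between the top-k sets of two ranked lists."""
    sa, sb = set(a[:k]), set(b[:k])
    union = sa | sb
    return len(sa & sb) / len(union) if union else 1.0


def baseline_health(bm25, ann, gold, k=50, min_recall=0.60, min_overlap=0.20):
    """Apply the checklist thresholds and flag a likely query split."""
    r_bm25 = recall_at_k(bm25, gold, k)
    r_ann = recall_at_k(ann, gold, k)
    overlap = jaccard_at_k(bm25, ann, k)
    return {
        "bm25_recall": r_bm25,
        "ann_recall": r_ann,
        "overlap": overlap,
        # Low overlap with healthy recall on both sides suggests a split.
        "likely_split": overlap < min_overlap
        and r_bm25 >= min_recall
        and r_ann >= min_recall,
    }
```

Logging the full dictionary rather than only the boolean keeps the raw numbers available when the thresholds are later tuned.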


Reranker behavior

  • Recall at k = 10 improves by at least 5 percent over the better of the two baselines.
  • Ranking stable across 3 paraphrases of the same query: Kendall tau ≥ 0.60.
  • Tie breaking rule fixed and documented.
  • Deterministic output given the same candidates and seed.
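
A minimal sketch of these behavior checks is given below, under several assumptions that the checklist does not settle: ties are broken by ascending doc id, Kendall tau is the tau-a variant computed over the items both rankings share, every paraphrase pair must clear the threshold, and "5 percent" is read as 0.05 absolute recall (a relative reading is equally plausible). All function names are hypothetical.

```python
from itertools import combinations


def rank_with_ties(scores):
    """Sort by score descending; break ties by doc id ascending (a fixed rule)."""
    return [doc for doc, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))]


def kendall_tau(rank_a, rank_b):
    """Kendall tau-a over the doc ids present in both rankings (no duplicates)."""
    pos_b = {doc: i for i, doc in enumerate(rank_b)}
    common = [doc for doc in rank_a if doc in pos_b]
    n = len(common)
    if n < 2:
        return 1.0
    concordant = discordant = 0
    for x, y in combinations(common, 2):
        # x precedes y in rank_a by construction of `common`.
        if pos_b[x] < pos_b[y]:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


def stable_across_paraphrases(rankings, min_tau=0.60):
    """True if every pair of paraphrase rankings reaches the tau threshold."""
    return all(kendall_tau(a, b) >= min_tau for a, b in combinations(rankings, 2))


def uplift_ok(rerank_recall, bm25_recall, ann_recall, min_uplift=0.05):
    """Absolute recall gain over the better baseline; epsilon guards float noise."""
    return rerank_recall - max(bm25_recall, ann_recall) >= min_uplift - 1e-9
```

Determinism can then be checked by running the reranker twice on the same persisted candidates and seed and asserting that the two output lists are equal.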

Acceptance targets

  • Coverage ≥ 0.70 on the gold set after reranking at k = 10 or 20.
  • ΔS(question, top1) ≤ 0.45 for at least 70 percent of queries.
  • λ_observe convergent across 3 paraphrases and 2 seeds.
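
The first two acceptance targets reduce to a threshold check and a per-query fraction. The sketch below assumes that coverage and the per-query ΔS(question, top1) values have already been computed elsewhere; it does not define either metric, and it leaves the λ_observe check out. The function name and return layout are hypothetical.

```python
def acceptance_report(coverage, delta_s_top1,
                      min_coverage=0.70, max_delta_s=0.45, min_fraction=0.70):
    """Gate a run on the coverage and ΔS targets from the checklist.

    coverage: precomputed coverage on the gold set after reranking.
    delta_s_top1: precomputed ΔS(question, top1) values, one per query.
    """
    if not delta_s_top1:
        raise ValueError("need at least one query")
    within = sum(1 for d in delta_s_top1 if d <= max_delta_s)
    fraction = within / len(delta_s_top1)
    return {
        "coverage_ok": coverage >= min_coverage,
        "delta_s_fraction": fraction,
        "delta_s_ok": fraction >= min_fraction,
    }
```

For example, four queries with ΔS values 0.1, 0.2, 0.3, and 0.5 give a fraction of 0.75, which clears the 70 percent target.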

Debug sequence

  1. Log both candidate lists and measure overlap.
  2. If a split is detected, enable the split route or increase k per branch.
  3. Try a simpler deterministic reranker first.
  4. Re-measure uplift and stability.
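
For step 3, the checklist does not name a specific simpler reranker. One common deterministic option is reciprocal rank fusion (RRF) over the two persisted candidate lists, sketched here as an assumption rather than as the method this checklist prescribes. The function name, the constant 60, and the doc-id tie-break are illustrative choices.

```python
def rrf_rerank(bm25, ann, k_rrf=60, top_n=10):
    """Fuse two ranked lists with reciprocal rank fusion.

    Each list contributes 1 / (k_rrf + rank) per doc, with rank starting at 1.
    Ties are broken by doc id so the output is fully deterministic.
    """
    scores = {}
    for ranked in (bm25, ann):
        for rank, doc in enumerate(ranked, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k_rrf + rank)
    ordered = sorted(scores, key=lambda d: (-scores[d], d))
    return ordered[:top_n]
```

Because RRF uses no model and no randomness, any instability that remains after swapping it in points at the candidate lists rather than at the reranker.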

Refs:
Retrieval playbook · ΔS probes


