Retrieval Traceability

March 6, 2026 · View on GitHub

🧭 Quick Return to Map

You are in a sub-page of Retrieval.
To reorient, go back here:

Think of this page as a desk within a ward.
If you need the full triage and all prescriptions, return to the Emergency Room lobby.

A citation and audit schema that makes every answer explainable. Install this when answers look correct but citations are missing, misaligned, or not reproducible. The schema works across any store or framework.

Use this page with


Acceptance targets

  • ΔS(question, retrieved) ≤ 0.45
  • Coverage ≥ 0.70 to the intended section
  • λ convergent across 3 paraphrases and 2 seeds
  • Citation match rate ≥ 0.95 on the gold set

Citation payload schema

Every retrieved unit must carry a portable citation payload. The payload is used in prompts, logs, and evals.

Required fields

FieldTypeMeaning
doc_idstringStable document id across versions
section_idstringHuman legible section key or path
snippet_idstringUnique id for this chunk or span
source_urlstringCanonical source link or URI
offsetsobjectCharacter or token start and end
tokensintegerToken count of the snippet
index_hashstringWrite time hash of the index
embed_modelstringEmbedding model id and pooling
analyzerstringAnalyzer and casing policy
revstringDocument revision or commit hash

Optional but recommended

FieldTypeMeaning
windowobjectPre and post context sizes
pageintegerPage number when applicable
score_rawnumberRaw similarity from the store
score_normnumberStore independent score in 0 to 1
rerank_scorenumberCross encoder score if used
k_posintegerRank position before rerank
k_finalintegerRank position after rerank

JSON shape

{
  "doc_id": "handbook_2024",
  "section_id": "policies/security/keys",
  "snippet_id": "hb24-sec-keys-0007",
  "source_url": "https://example.org/handbook#security-keys",
  "offsets": {"start": 2310, "end": 2688, "unit": "char"},
  "tokens": 188,
  "index_hash": "faiss:2f7d...9a",
  "embed_model": "bge-large-en-v1.5:mean",
  "analyzer": "lowercase+ascii_fold",
  "rev": "3b91f2a",
  "window": {"pre": 120, "post": 120},
  "page": 14,
  "score_raw": 0.8421,
  "score_norm": 0.78,
  "rerank_score": 0.67,
  "k_pos": 18,
  "k_final": 3
}

Validation rules

  1. Cite first then explain The model must emit citations before or together with the answer. Never after the answer only.

  2. Offsets are monotonic start < end. Units are consistent within a run. Character and token units cannot be mixed in the same field.

  3. No cross section reuse by default One answer segment can only use citations from the same section_id, unless a whitelist is provided.

  4. Analyzer parity analyzer must match the read path. If the analyzer differs at read time, the run fails fast.

  5. Deterministic tiebreak When scores tie, final order uses (score_norm desc, section_id asc, snippet_id asc).

  6. Index identity index_hash in citations must match the live index hash logged by the pipeline.

  7. Revision lock rev must match the document revision that produced the chunk.

  8. Trace completeness For each citation the log must include score_raw or score_norm, k_pos, and when present k_final.


Prompt contract

Embed the citation schema directly in the prompt. Require validation behavior.

Contract:
- You must cite before you explain.
- Each citation item = {snippet_id, section_id, source_url, offsets, tokens}.
- All citations in one segment must share the same section_id unless I set allow_cross_section=true.
- If any field is missing or inconsistent, stop and return {error: "<field>", fix: "<page to open>"}.

Return JSON:
{
  "citations": [ ... ],
  "answer": "...",
  "ΔS": 0.xx,
  "λ": "<state>"
}

Pair this with the post step validator below.


Minimal validators

Python pseudo

def validate_citations(items, allow_cross=False):
    if not items:
        return "empty_citations"
    sect = items[0]["section_id"]
    for x in items:
        for f in ["snippet_id","section_id","source_url","offsets","tokens","index_hash","embed_model","analyzer","rev"]:
            if f not in x:
                return f"missing_{f}"
        if x["offsets"]["start"] >= x["offsets"]["end"]:
            return "bad_offsets"
        if not allow_cross and x["section_id"] != sect:
            return "cross_section_reuse"
        if "score_raw" not in x and "score_norm" not in x:
            return "missing_score"
    return "ok"

Return codes

CodeMeaningNext step
empty_citationsModel skipped citation blockEnforce cite first in prompt contract
missing_scoreStore did not carry score metadataAdd score projection in the adapter
bad_offsetsSpan math is brokenRebuild chunk offsets and verify unit
cross_section_reuseMixed sources in one segmentForbid reuse or add an explicit whitelist
mismatch_index_hashRead and write do not matchRebuild index or fix boot ordering
analyzer_mismatchAnalyzer differs at read timeAlign analyzer and normalization

Log format

Log one line per question and per segment. Keep it grep friendly.

ts=2025-08-28 qid=Q001 seg=1
deltaS=0.37 lambda=convergent
k=10 store=faiss index_hash=2f7d...9a embed=bge-large-en-v1.5
citations=[hb24-sec-keys-0007,hb24-sec-keys-0008]
scores=[0.8421,0.8034] kpos=[18,7] kfinal=[3,1]
section_id=policies/security/keys rev=3b91f2a

You can mirror this to a table. Example DDL follows.

SQL table

CREATE TABLE rag_traces (
  ts TIMESTAMP NOT NULL,
  qid TEXT NOT NULL,
  seg INTEGER NOT NULL,
  deltaS REAL NOT NULL,
  lambda_state TEXT NOT NULL,
  store TEXT,
  index_hash TEXT,
  embed_model TEXT,
  analyzer TEXT,
  section_id TEXT,
  rev TEXT,
  citations JSONB,
  scores JSONB,
  kpos JSONB,
  kfinal JSONB
);

Auditable answer format

Force the model to output a machine checkable block.

{
  "citations": [
    {
      "snippet_id": "hb24-sec-keys-0007",
      "section_id": "policies/security/keys",
      "source_url": "https://example.org/handbook#security-keys",
      "offsets": {"start": 2310, "end": 2688, "unit": "char"},
      "tokens": 188
    }
  ],
  "answer": "Hardware keys are required for all admin roles. The policy defines rotation and recovery.",
  "ΔS": 0.37,
  "λ": "convergent"
}

Typical failure modes and exact fixes

  • Citations look right but the text is from another section Install the cross section fence and recheck analyzer parity. Open: Data Contracts · chunk_alignment.md

  • Citation block is present but fields are missing Add the adapter projection. Enforce required fields at the chain boundary. Open: store_agnostic_guardrails.md

  • Hybrid retrieval changes the final order every run Add reranking with a fixed model and seed. Log k_pos and k_final. Open: rerankers.md

  • High similarity but wrong meaning Verify analyzer, pooling, and metric. Rebuild if mismatched. Open: Embedding ≠ Semantic


Drop in adapters

LangChain LCEL

def lc_project(doc):
    return {
        "doc_id": doc.metadata["doc_id"],
        "section_id": doc.metadata["section_id"],
        "snippet_id": doc.metadata["snippet_id"],
        "source_url": doc.metadata.get("source_url",""),
        "offsets": {"start": doc.metadata["start"], "end": doc.metadata["end"], "unit": "char"},
        "tokens": doc.metadata["tokens"],
        "index_hash": doc.metadata["index_hash"],
        "embed_model": doc.metadata["embed_model"],
        "analyzer": doc.metadata["analyzer"],
        "rev": doc.metadata["rev"],
        "score_raw": doc.metadata.get("score", None),
        "k_pos": doc.metadata.get("k", None)
    }

LlamaIndex

proj = {
  "doc_id": node.ref_doc_id,
  "section_id": node.metadata.get("section_id",""),
  "snippet_id": node.node_id,
  "source_url": node.metadata.get("source_url",""),
  "offsets": {"start": node.start_char_idx, "end": node.end_char_idx, "unit": "char"},
  "tokens": node.metadata.get("tokens", 0),
  "index_hash": idx_hash,
  "embed_model": embed_id,
  "analyzer": analyzer_id,
  "rev": doc_rev,
  "score_raw": score
}

Semantic Kernel

record Citation(string doc_id, string section_id, string snippet_id,
                string source_url, Offsets offsets, int tokens,
                string index_hash, string embed_model, string analyzer, string rev,
                double? score_raw = null, double? score_norm = null, int? k_pos = null, int? k_final = null);

Evaluation recipe

  1. Prepare a gold set with 3 to 5 questions per section.
  2. Run 3 paraphrases and 2 seeds per question.
  3. Check citation match rate and coverage.
  4. If match rate falls below 0.95, inspect validator return codes first.
  5. Only then tune retrieval and rerank knobs.

More detail → retrieval_eval_recipes.md


Copy paste test

Use this to smoke test your chain.

You have TXTOS and the WFGY Problem Map loaded.

Task: validate the citations using the schema defined in my contract.
Input: {the model JSON}

If the payload is valid, reply "OK" and give ΔS and λ.
If invalid, return {error: "<code>", field: "<name>", tip: "<page to open>"}.
Use pages: retrieval-traceability, data-contracts, chunk-alignment, rerankers.

🔗 Quick-Start Downloads (60 sec)

ToolLink3-Step Setup
WFGY 1.0 PDFEngine Paper1️⃣ Download · 2️⃣ Upload to your LLM · 3️⃣ Ask “Answer using WFGY + <your question>”
TXT OS (plain-text OS)TXTOS.txt1️⃣ Download · 2️⃣ Paste into any LLM chat · 3️⃣ Type “hello world” — OS boots instantly

Explore More

LayerPageWhat it’s for
⭐ ProofWFGY Recognition MapExternal citations, integrations, and ecosystem proof
⚙️ EngineWFGY 1.0Original PDF tension engine and early logic sketch (legacy reference)
⚙️ EngineWFGY 2.0Production tension kernel for RAG and agent systems
⚙️ EngineWFGY 3.0TXT based Singularity tension engine (131 S class set)
🗺️ MapProblem Map 1.0Flagship 16 problem RAG failure taxonomy and fix map
🗺️ MapProblem Map 2.0Global Debug Card for RAG and agent pipeline diagnosis
🗺️ MapProblem Map 3.0Global AI troubleshooting atlas and failure pattern map
🧰 AppTXT OS.txt semantic OS with fast bootstrap
🧰 AppBlah Blah BlahAbstract and paradox Q&A built on TXT OS
🧰 AppBlur Blur BlurText to image generation with semantic control
🏡 OnboardingStarter VillageGuided entry point for new users

If this repository helped, starring it improves discovery so more builders can find the docs and tools.
GitHub Repo stars