Deployment checklist

March 6, 2026 · View on GitHub

Purpose: a short, rigorous checklist to verify your environment and reduce bootstrap/dependency issues during deployment.


Before you deploy (pre-flight)

1) Environment & prerequisites

  • Kubernetes cluster accessible; kubectl points to correct context.
    kubectl config current-context
    kubectl get nodes
    
    
    
  • Ensure cluster resources: CPU / memory / ephemeral storage for vectorstore. Confirm quotas.
  • Secrets: API keys (LLM), db credentials, vectorstore creds in k8s Secret or vault.
  • Helm chart / manifests: reviewed and values set for production (replicas, resources, liveness/readiness).

2) Configuration sanity

  • values.yaml contains:

    • resources.requests and limits for retriever/generator.
    • replicaCount >= 2 for critical services (if expected load > small).
    • readinessProbe and livenessProbe configured.
  • Vector store sizing: index_shards, disk IOPS, memory (embedding index memory).

  • Network egress rules for model API (if external LLM).

3) Observability & alarms

  • Prometheus scraping configured for app metrics endpoints (/metrics).
  • Default dashboards in Grafana (latency, error-rate, retriever QPS, CHR).
  • Alerts configured (see live_monitoring_rag.md for suggested alerts).

Deploy steps

  1. Create namespace & secrets:

    kubectl create ns rag-prod || true
    kubectl -n rag-prod apply -f k8s/secrets.yaml
    
  2. Install/upgrade Helm chart:

    helm upgrade --install rag . -n rag-prod -f values.prod.yaml
    
  3. Wait for pods to be ready (watch):

    kubectl -n rag-prod rollout status deploy/rag-api -w
    kubectl -n rag-prod get pods -o wide
    
  4. Smoke tests (simple requests):

    curl -fsS http://<ingress>/healthz
    curl -fsS -X POST http://<ingress>/api/qa -d '{"qid":"smoke-1","q":"Who is the CEO of WFGY?" }' | jq
    

Post-deploy checks (first 15 minutes)

  • Confirm retriever returns docs for 10 sample queries:

    • Use your retrieval debug endpoint to inspect retrieved_ids.
  • Confirm p95 E2E latency ≤ target (by env). Collect from Grafana or kubectl logs.

  • Confirm CHR on 10 smoke items ≥ expected baseline (manually assert correctness).

  • Check for error spikes in logs:

    kubectl -n rag-prod logs -l app=rag --since=10m | egrep "ERROR|WARN" | head -n 200
    

Common config gotchas (double-check)

  • Vectorstore read-only mode accidentally set? (affects writes)
  • LLM rate-limiting / auth errors (wrong key or quota).
  • Wrong index/namespace names between chunker and retriever (off-by-one).
  • Probes misconfigured — containers get restarted continuously.

Rollback criteria

Rollback if any of:

  • P95 > target and sustained for 10m.
  • Error rate > 3× baseline and not transient.
  • Retrieval failures (empty pool) > 1% of requests.

Rollback command example:

helm rollback rag <previous_revision> -n rag-prod

Quick checklist (copy/paste)

  • Namespace created, secrets applied
  • Helm values validated (resources, probes)
  • Prometheus/Grafana dashboards in place
  • Smoke tests passed (health & basic QA)
  • Alerts deployed
  • Canary traffic small → monitor 10–30 min


🔗 Quick-Start Downloads (60 sec)

ToolLink3-Step Setup
WFGY 1.0 PDFEngine Paper1️⃣ Download · 2️⃣ Upload to your LLM · 3️⃣ Ask “Answer using WFGY + <your question>”
TXT OS (plain-text OS)TXTOS.txt1️⃣ Download · 2️⃣ Paste into any LLM chat · 3️⃣ Type “hello world” — OS boots instantly

Explore More

LayerPageWhat it’s for
⭐ ProofWFGY Recognition MapExternal citations, integrations, and ecosystem proof
⚙️ EngineWFGY 1.0Original PDF tension engine and early logic sketch (legacy reference)
⚙️ EngineWFGY 2.0Production tension kernel for RAG and agent systems
⚙️ EngineWFGY 3.0TXT based Singularity tension engine (131 S class set)
🗺️ MapProblem Map 1.0Flagship 16 problem RAG failure taxonomy and fix map
🗺️ MapProblem Map 2.0Global Debug Card for RAG and agent pipeline diagnosis
🗺️ MapProblem Map 3.0Global AI troubleshooting atlas and failure pattern map
🧰 AppTXT OS.txt semantic OS with fast bootstrap
🧰 AppBlah Blah BlahAbstract and paradox Q&A built on TXT OS
🧰 AppBlur Blur BlurText to image generation with semantic control
🏡 OnboardingStarter VillageGuided entry point for new users

If this repository helped, starring it improves discovery so more builders can find the docs and tools.
GitHub Repo stars