Production Operations Guide

April 13, 2026

This guide is the canonical runbook for operating @fetchkit/ffetch in production.

1. Pre-Deployment Checklist

Core configuration

Set timeout per dependency SLA (avoid using one global value for everything).
Set retries conservatively (1-3 in most systems).
Decide throwOnHttpError policy (true for strict exception flow, false for response-driven handling).
Use a runtime-appropriate fetchHandler for SSR/edge/custom environments.
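
One way to keep these settings per dependency is a small policy table that is checked at startup. The sketch below is illustrative only: DependencyPolicy, policies, and validatePolicy are hypothetical names, the dependency names and values are placeholders, and the retry ceiling of 3 simply mirrors the guidance above. The fields reuse the timeout, retries, and throwOnHttpError option names from this guide.

```typescript
// Hypothetical per-dependency settings; names and values are placeholders.
interface DependencyPolicy {
  timeout: number;           // milliseconds, derived from the dependency's SLA
  retries: number;           // retry budget for this dependency
  throwOnHttpError: boolean; // true = exception flow, false = inspect the response
}

const policies: Record<string, DependencyPolicy> = {
  billing: { timeout: 10_000, retries: 2, throwOnHttpError: true },
  search: { timeout: 5_000, retries: 1, throwOnHttpError: false },
};

// Returns a list of human-readable problems; an empty list means the policy passes.
function validatePolicy(name: string, p: DependencyPolicy): string[] {
  const problems: string[] = [];
  if (!Number.isFinite(p.timeout) || p.timeout <= 0) {
    problems.push(`${name}: timeout must be a positive number of milliseconds`);
  }
  if (!Number.isInteger(p.retries) || p.retries < 0) {
    problems.push(`${name}: retries must be a non-negative integer`);
  } else if (p.retries > 3) {
    problems.push(`${name}: retries above 3 should be justified explicitly`);
  }
  return problems;
}

// Collect problems across all dependencies, for example to fail a startup check.
const allProblems = Object.entries(policies).flatMap(([name, p]) => validatePolicy(name, p));
```

Each entry that passes can then supply the options for that dependency's own createClient call, so no single global timeout leaks across dependencies.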

Resilience plugins

Enable circuitPlugin for external dependencies.
Enable bulkheadPlugin for dependencies that can saturate under load.
Enable dedupePlugin on bursty read-heavy endpoints.
Enable hedgePlugin only for safe methods and latency-sensitive paths.
Validate plugin order assumptions when composing multiple plugins.

Observability and hooks

Instrument before/after/onError hooks for logs and metrics.
Track request latency (p50/p95/p99) and error rate by endpoint.
Track resilience signals: circuit opens, bulkhead queue depth, retry counts.
Add request correlation IDs in transformRequest.
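
For the correlation ID item, a minimal sketch using only the standard Fetch API Request and Headers types. It assumes transformRequest receives a Request and returns a Request; confirm the exact hook signature in api.md before wiring it in. The header name, withCorrelationId, and defaultId are hypothetical.

```typescript
// Hypothetical header name; use whatever your tracing stack expects.
const CORRELATION_HEADER = "x-correlation-id";

// Simple ID generator for illustration; swap in a UUID or trace ID in real use.
function defaultId(): string {
  return `${Date.now().toString(36)}-${Math.random().toString(36).slice(2, 10)}`;
}

// Returns a request that carries a correlation ID.
// An ID already set by an upstream caller is preserved so the trace stays connected.
function withCorrelationId(req: Request, makeId: () => string = defaultId): Request {
  if (req.headers.has(CORRELATION_HEADER)) return req;
  const headers = new Headers(req.headers);
  headers.set(CORRELATION_HEADER, makeId());
  // Copy the original request, overriding only its headers.
  return new Request(req, { headers });
}
```

Logging the same header value in the before/after/onError hooks lets a single request be followed across logs and metrics.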

2. Operational Metrics

Minimum metrics to collect:

Request count by endpoint/method/status
Latency histogram (p50, p95, p99)
Error count by error class (TimeoutError, CircuitOpenError, NetworkError, RetryLimitError, etc.)
Circuit breaker open events and duration
Bulkhead activeCount, queueDepth, rejection count
Retry attempts and eventual success-after-retry rate
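
As a rough illustration of the latency and error-class metrics above, the sketch below computes nearest-rank percentiles from raw samples and counts errors by class name. Both helpers are hypothetical, and the error counter assumes each error's name property matches its class name (TimeoutError, CircuitOpenError, and so on), which should be checked against errorhandling.md.

```typescript
// Nearest-rank percentile over raw latency samples (milliseconds).
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]");
  const sorted = [...samples].sort((a, b) => a - b);
  // Rank is 1-based: the smallest value covering at least p percent of samples.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// Counts errors by class name so they can be exported as a labeled counter.
function countByErrorClass(errors: Error[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const err of errors) {
    counts.set(err.name, (counts.get(err.name) ?? 0) + 1);
  }
  return counts;
}
```

In a real deployment a metrics library's histogram type would normally replace the raw-sample percentile, since keeping every sample in memory does not scale.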

3. Alerting Baseline

Alert on sustained high error rate for a dependency.
Alert when circuit remains open beyond expected recovery windows.
Alert when bulkhead queue remains near capacity.
Alert when p99 latency regresses significantly from baseline.
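
"Sustained" can be made concrete by requiring the condition to hold across several consecutive evaluation windows, so a single spike does not page anyone. The sketch below shows that idea for error rate; WindowStats and isSustainedHighErrorRate are hypothetical names, and the threshold and window count are inputs you would choose per dependency.

```typescript
// One evaluation window of request outcomes for a single dependency.
interface WindowStats {
  total: number;
  errors: number;
}

// True only when the error rate exceeds `threshold` in each of the last
// `sustainedWindows` windows. Windows with no traffic count as healthy.
function isSustainedHighErrorRate(
  windows: WindowStats[],
  threshold: number,
  sustainedWindows: number,
): boolean {
  if (sustainedWindows <= 0 || windows.length < sustainedWindows) return false;
  return windows
    .slice(-sustainedWindows)
    .every((w) => w.total > 0 && w.errors / w.total > threshold);
}
```

The same shape works for the other alerts in this list by swapping the per-window condition, for example queue depth relative to maxQueue.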

4. Incident Playbook

Circuit open incidents

Check downstream dependency health first.
Confirm threshold/reset values are aligned with failure patterns.
Use fallback/degraded responses at app level while circuit is open.
Avoid releasing queued traffic all at once during recovery.
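
One way to avoid a thundering herd on recovery is to ramp allowed concurrency up gradually after the circuit closes. The sketch below is an app-level calculation only; it does not assume that bulkheadPlugin can be resized at runtime, and allowedConcurrency and its parameters are hypothetical.

```typescript
// Linear ramp from 1 in-flight request up to `maxConcurrent` over `rampMs`
// milliseconds after the circuit closes.
function allowedConcurrency(maxConcurrent: number, elapsedMs: number, rampMs: number): number {
  if (maxConcurrent < 1) throw new RangeError("maxConcurrent must be at least 1");
  // No ramp configured, or the ramp period is over: allow full concurrency.
  if (rampMs <= 0 || elapsedMs >= rampMs) return maxConcurrent;
  const fraction = Math.max(elapsedMs, 0) / rampMs;
  // Always allow at least one request so recovery can be observed.
  return Math.max(1, Math.floor(maxConcurrent * fraction));
}
```

With maxConcurrent of 20 and a 30-second ramp, this admits 1 request at a time immediately after recovery, 10 at the halfway point, and the full 20 once the ramp completes.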

High latency incidents

Check bulkhead saturation (activeCount, queueDepth).
Check retry inflation (too many retries amplifying load).
If using hedging, verify that delay and maxHedges are tuned for the current latency distribution.
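
Retry inflation is easier to reason about with a worst-case multiplier. The sketch below assumes every retry attempt can itself be hedged, which gives an upper bound rather than confirmed library behavior; worstCaseAttempts and worstCaseRate are hypothetical helpers.

```typescript
// Upper bound on attempts sent downstream for one logical request, assuming
// every retry attempt can also be hedged (an assumption, not confirmed behavior).
function worstCaseAttempts(retries: number, maxHedges = 0): number {
  return (1 + retries) * (1 + maxHedges);
}

// Worst-case downstream request rate if every request fails and is retried.
function worstCaseRate(requestsPerSecond: number, retries: number, maxHedges = 0): number {
  return requestsPerSecond * worstCaseAttempts(retries, maxHedges);
}
```

Under this bound, retries: 2 without hedging can triple downstream load during a full outage, and retries: 1 with maxHedges: 1 can quadruple it, which is why both values are worth revisiting during a latency incident.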

Rate-limit incidents (429)

Ensure the retry strategy honors the Retry-After response header.
Reduce concurrency (bulkhead) and hedge aggressiveness.
Add app-level backpressure or queueing upstream.
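
If a custom retry delay is needed, the Retry-After value has to be parsed first. The header carries either a whole number of seconds or an HTTP-date. The helper below is a hypothetical sketch of that parsing; check advanced.md for whether the built-in retry behavior already covers this before adding your own.

```typescript
// Parses a Retry-After header value into a delay in milliseconds.
// The header is either a non-negative integer of seconds or an HTTP-date.
// Returns null when the value is missing or unparseable.
function parseRetryAfterMs(value: string | null, nowMs: number = Date.now()): number | null {
  if (value === null) return null;
  const trimmed = value.trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return null;
  return Math.max(0, dateMs - nowMs); // a date in the past means "retry now"
}
```

A caller would typically pass response.headers.get("retry-after") and fall back to its normal backoff when the result is null.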

5. Recommended Baseline Configs

Internal service-to-service client

createClient({
  timeout: 10_000,
  retries: 2,
  plugins: [
    circuitPlugin({ threshold: 5, reset: 30_000 }),
    bulkheadPlugin({ maxConcurrent: 20, maxQueue: 100 }),
    dedupePlugin({ ttl: 30_000 }),
  ],
})

Latency-sensitive read path

createClient({
  timeout: 5_000,
  retries: 1,
  plugins: [
    dedupePlugin({ ttl: 10_000 }),
    hedgePlugin({ delay: 75, maxHedges: 1 }),
  ],
})

See also:

api.md for full option and plugin reference
plugins.md for plugin lifecycle and ordering semantics
advanced.md for retry, circuit, and operational patterns
errorhandling.md for exact error behavior