Claude Code v2.1.104: System Prompt Cache 94% Increase — Diagnosis & Fix (#47528)

April 17, 2026

GitHub Issue #47528 reports that system prompt cache overhead increased 94% between v2.1.98 and v2.1.104:

  • v2.1.98: 49,726 tokens (cold cache)
  • v2.1.104: 96,508 tokens (cold cache)

This is a separate issue from the cache_creation inflation in #46917 (which adds ~20K to cache_creation per request). The two compound, so v2.1.104 users pay both penalties.
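The 94% figure follows directly from the two cold-cache counts above. A minimal sketch of the arithmetic in POSIX shell; the helper name `pct_change` is hypothetical, and it assumes the new value is not smaller than the old one:

```shell
#!/bin/sh
# Cold-cache token counts reported in issue #47528.
old=49726   # v2.1.98
new=96508   # v2.1.104

# Integer percent change, rounded to nearest: (new - old) / old * 100.
# Hypothetical helper; assumes $2 >= $1 > 0.
pct_change() {
  echo $(( (($2 - $1) * 100 + $1 / 2) / $1 ))
}

delta=$(( $new - $old ))
echo "extra tokens per cold start: $delta"        # 46782
echo "increase: $(pct_change "$old" "$new")%"     # 94%
```

Every cold start on v2.1.104 therefore carries 46,782 more tokens than on v2.1.98, before any effect from #46917.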
To check whether you are affected, run:
/cost

Look at cache_creation_input_tokens on your first turn:

  • Under 55K: You're on an older version or unaffected
  • 70-100K: You're likely affected by one or both issues
  • Over 100K: Both #47528 and #46917 are compounding

| Metric | v2.1.98 | v2.1.104 | Change |
|--------|---------|----------|--------|
| Cold cache tokens | 49,726 | 96,508 | +94% |
| Session startup cost | Baseline | ~2× | Doubled |
| Short session penalty | Moderate | Severe | Worse |

Short sessions (under 30 minutes) are hit hardest because startup cost dominates. Longer sessions amortize the overhead.

Pin to v2.1.98:
npm i -g @anthropic-ai/claude-code@2.1.98

This is the last version before both the system prompt overhead increase (#47528) and the cache_creation inflation (#46917).

Note: Pinning the version also means you stop receiving security patches. Update when Anthropic releases a fix.
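The first-turn ranges from the /cost check can be sketched as a small shell helper, for example to re-check a session after pinning. The function name `classify_first_turn` is hypothetical, the value is read manually from your /cost output (the helper does not parse it), and two boundary choices are assumptions: exactly 100,000 is counted in the 70-100K range, and 55K-70K is reported as inconclusive because the article gives no guidance for it.

```shell
#!/bin/sh
# Classify a first-turn cache_creation_input_tokens value using the
# ranges listed in this article. Hypothetical helper, not part of
# Claude Code.
classify_first_turn() {
  if [ "$1" -lt 55000 ]; then
    echo "older version or unaffected"
  elif [ "$1" -lt 70000 ]; then
    # Assumption: the article lists no range for 55K-70K.
    echo "inconclusive"
  elif [ "$1" -le 100000 ]; then
    echo "likely affected by one or both issues"
  else
    echo "both #47528 and #46917 compounding"
  fi
}

classify_first_turn 49726   # older version or unaffected
classify_first_turn 96508   # likely affected by one or both issues
```

Pass the number as a plain integer without thousands separators, for example `classify_first_turn 120000`.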

  • #47528 — System prompt cache 94% increase (this issue)
  • #46917 — cache_creation +20K inflation (92+ reactions)
  • #46829 — Cache TTL silently changed 1h→5m (194+ reactions)
  • #42796 — Token consumption mega-thread (1,693+ reactions)
  • Cache Health Checker — Paste your /cost output to diagnose cache issues instantly
  • Token Checkup — 5-question diagnostic for your overall token consumption pattern

These issues are symptoms of a broader problem. If you want to systematically reduce token consumption (CLAUDE.md optimization, hook-based guards, workflow design), the Token Book covers the full approach with 800 hours of measured data and copy-paste templates.

Opus 4.7 became the default model on April 16. Token consumption is up to 4x higher, and the safety classifier broke (#49618) — 20+ data loss incidents in 3 days. One-command fix: npx cc-safe-setup --opus47