Claude Code Token Troubleshooting Guide
April 17, 2026 · View on GitHub
Claude Code Token Troubleshooting Guide — fix quota drain, cache bugs, 1M context trap (Apr 2026)
NEW: Version Health Check — enter your CC version to see which known issues affect you.
When your Claude Code quota disappears too fast, check these symptoms in order.
Symptom 1: Cache Destruction
Signs: Token usage spikes mid-session even though your workflow has not changed.
Common causes:
- Edited CLAUDE.md during a session
- Changed settings.json or hooks
- Connected/disconnected MCP servers
- git status changed (every file edit breaks cache — #47107)
Fix: Don't edit CLAUDE.md or settings.json during a session. Start a new session instead.
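One way to catch an accidental mid-session edit is to fingerprint the relevant files at session start and compare again later. The sketch below is only an illustration: the helper names `snapshot` and `changed_files` are hypothetical and not part of Claude Code, and the list of files you watch is your own choice.

```python
import hashlib
from pathlib import Path

def snapshot(paths):
    """Map each path to the SHA-256 of its contents (None if the file is missing)."""
    result = {}
    for p in paths:
        path = Path(p)
        if path.is_file():
            result[str(p)] = hashlib.sha256(path.read_bytes()).hexdigest()
        else:
            result[str(p)] = None
    return result

def changed_files(before, after):
    """Return the paths whose fingerprint differs between two snapshots."""
    return sorted(p for p in before if before[p] != after.get(p))
```

Take one snapshot of, for example, CLAUDE.md and your settings.json when the session starts, take another when usage spikes, and pass both to `changed_files`. A non-empty result suggests the cache prefix was invalidated by an edit.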
Symptom 2: 1M Context Window Trap (Official, Apr 2026)
Signs: Max plan quota gone in 15-19 minutes. Two questions consume 66% of quota.
Root cause: Claude Code uses 1M token context by default. Near the compact threshold (~960K tokens), each API call sends ~960K tokens. A cache miss at this scale is catastrophic.
Official response: Anthropic team pinned #45756 and is considering reducing the default to 400K.
Fix:
- If you left a session for 1+ hour, don't resume; start fresh with /clear
- Don't keep multiple terminal sessions open (idle sessions consume quota silently)
- After /compact, start working immediately (don't leave and come back)
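To see why a cache miss near the compact threshold hurts so much, it helps to put rough numbers on it. The sketch below assumes the price multipliers for API prompt caching (cache reads at 0.1x the base input price, 5-minute cache writes at 1.25x). How subscription quota is actually metered is not stated in this guide, so treat the result as an order-of-magnitude illustration. The function name `input_cost_units` is hypothetical.

```python
# Assumed price multipliers relative to the base input-token price.
CACHE_READ_MULTIPLIER = 0.1    # prefix served from cache
CACHE_WRITE_MULTIPLIER = 1.25  # prefix re-written to a 5-minute cache after a miss

def input_cost_units(context_tokens, cache_hit):
    """Cost of sending one call's prefix, in base-input-token equivalents."""
    multiplier = CACHE_READ_MULTIPLIER if cache_hit else CACHE_WRITE_MULTIPLIER
    return context_tokens * multiplier

hit = input_cost_units(960_000, cache_hit=True)
miss = input_cost_units(960_000, cache_hit=False)
print(f"hit: {hit:,.0f}  miss: {miss:,.0f}  ratio: {miss / hit:.1f}x")
```

Under these assumptions, a single miss at ~960K tokens costs about 12.5 times as much as the same call served from cache, which is why resuming a stale session is so expensive.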
Symptom 3: Cache TTL Silent Regression
Signs: Costs increased 20-32% since March 2026 without any workflow changes.
Root cause: Anthropic silently reduced prompt cache TTL from 1 hour to 5 minutes (#46829).
Fix: Keep sessions active. If you take a 10-minute break, your cache is gone. Consider shorter, focused sessions instead of long ones with breaks.
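The break rule above reduces to a simple time comparison. This is a minimal sketch, assuming the 5-minute TTL reported in #46829 and assuming the timer restarts with each request. `cache_likely_expired` is a hypothetical helper, not a Claude Code feature.

```python
from datetime import datetime, timedelta

CACHE_TTL = timedelta(minutes=5)  # value reported in #46829; treat as an assumption

def cache_likely_expired(last_request_at, now, ttl=CACHE_TTL):
    """True if the gap since the last request exceeds the cache TTL."""
    return now - last_request_at > ttl

# Example: a 10-minute break versus a 3-minute pause.
start = datetime(2026, 4, 17, 9, 0)
long_break = cache_likely_expired(start, start + timedelta(minutes=10))
short_pause = cache_likely_expired(start, start + timedelta(minutes=3))
```

Here `long_break` is True and `short_pause` is False, matching the guidance that a 10-minute break loses the cache.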
Symptom 4: cache_creation Inflation (v2.1.100+)
Signs: ~20K extra tokens per turn compared to v2.1.98.
Details: #46917 (60+ reactions). Same payload produces significantly more cache_creation tokens on newer versions.
Fix: Monitor with /cost. If cache_creation looks high for an unchanged payload, you are likely hitting this known server-side issue, which is being tracked in #46917.
Symptom 5: System Prompt Cache 94% Increase (v2.1.104+)
Signs: First-turn cache_creation jumps to ~96K tokens (was ~50K on v2.1.98).
Details: #47528. System prompt overhead nearly doubled. Compounds with Symptom 4. Short sessions hit hardest.
Fix: Pin to v2.1.98: npm i -g @anthropic-ai/claude-code@2.1.98. Full diagnosis.
Symptom 6: Subagent File Conflicts
Signs: 100K+ tokens consumed in retry loops.
Root cause: Multiple subagents editing the same file → "File modified since read" → retry loop (#46968).
Fix: Give each subagent a separate file scope. Run same-file operations sequentially, not in parallel.
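The "separate file scope" rule can be sketched as a small planning step: group pending edits by target file, run different files in parallel, and keep edits to the same file in order. The function `plan_file_queues` and the task tuples are hypothetical. How you dispatch subagents depends on your own setup.

```python
from collections import defaultdict

def plan_file_queues(tasks):
    """Group (task_name, file_path) pairs into one ordered queue per file.

    Queues for different files can run in parallel. Tasks inside a queue
    must run one after another so no two writers touch the same file at once.
    """
    queues = defaultdict(list)
    for task_name, file_path in tasks:
        queues[file_path].append(task_name)
    return dict(queues)

tasks = [
    ("rename-function", "src/api.py"),
    ("update-docs", "README.md"),
    ("add-logging", "src/api.py"),
]
queues = plan_file_queues(tasks)
```

In this example `queues` holds two entries: `src/api.py` gets `rename-function` then `add-logging` in sequence, while `README.md` can be handled by a separate subagent at the same time.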
Quick Diagnosis
| Symptom | First check | Action |
|---|---|---|
| Sudden quota drop | /cost cache rate | If < 80%, cache is broken → new session |
| Gradual increase | Compare v2.1.98 vs current | cache_creation inflation → track #46917 |
| High first-turn cost | First /cost after session start | 96K+ = #47528 system prompt inflation → pin v2.1.98 |
| Post-break spike | How long was the break? | > 5 min = cache expired (TTL regression) |
| Two questions, 66% gone | Context window size | 1M trap → /clear and restart |
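The first row's 80% check can be sketched as follows. The field names match the `usage` object returned by the Anthropic Messages API. Whether /cost computes its cache rate with exactly this formula is an assumption, and both function names are hypothetical.

```python
def cache_read_rate(usage):
    """Fraction of prompt tokens served from cache for one API response."""
    read = usage.get("cache_read_input_tokens", 0)
    created = usage.get("cache_creation_input_tokens", 0)
    fresh = usage.get("input_tokens", 0)
    total = read + created + fresh
    return read / total if total else 0.0

def needs_new_session(usage, threshold=0.80):
    """Apply the table's rule of thumb: below 80% suggests a broken cache."""
    return cache_read_rate(usage) < threshold

healthy = {"cache_read_input_tokens": 900, "cache_creation_input_tokens": 50, "input_tokens": 50}
broken = {"cache_read_input_tokens": 500, "cache_creation_input_tokens": 500, "input_tokens": 0}
```

With these sample values, `healthy` has a read rate of 0.9 and passes, while `broken` sits at 0.5 and would trigger the "new session" action.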
Free Diagnostic Tools
- Token Checkup — Diagnose consumption patterns
- Cache Health Checker — Check cache read rate
- Security Checkup — Check for deny rules bypass
Full Guide
For comprehensive token optimization (CLAUDE.md patterns, hook-based automation, workflow design, real before/after data, copy-paste templates):
Token Book — Cut Your Claude Code Token Usage in Half (¥2,500 / ~$17, 10 chapters, 44,000 words)
From cc-safe-setup — 667 hooks, 9,200+ tests. Install: npx cc-safe-setup