Claude Code Token Troubleshooting Guide

April 17, 2026

Fix quota drain, cache bugs, and the 1M context trap (Apr 2026).

NEW: Version Health Check — enter your CC version to see which known issues affect you.

When your Claude Code quota disappears too fast, check these symptoms in order.

Symptom 1: Cache Destruction

Signs: Token usage spikes mid-session even though your workflow has not changed.

Common causes:

Editing CLAUDE.md during a session
Changing settings.json or hooks
Connecting or disconnecting MCP servers
git status output changing (every file edit breaks the cache, see #47107)

Fix: Don't edit CLAUDE.md or settings.json during a session. If you need to change them, make the change and then start a new session.
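To notice when one of these files has changed under a running session, one option is to hash them at session start and compare later. The following is a minimal sketch, not part of Claude Code: the helper names `snapshot` and `changed_since` are hypothetical, and the file paths are whatever you pass in.

```python
import hashlib
from pathlib import Path


def snapshot(paths):
    """Map each path to the SHA-256 digest of its contents (None if missing)."""
    digests = {}
    for p in map(Path, paths):
        if p.is_file():
            digests[str(p)] = hashlib.sha256(p.read_bytes()).hexdigest()
        else:
            digests[str(p)] = None
    return digests


def changed_since(before, paths):
    """Return the paths whose contents differ from the earlier snapshot."""
    after = snapshot(paths)
    return sorted(path for path in after if after[path] != before.get(path))
```

Call `snapshot(["CLAUDE.md"])` when the session starts and `changed_since(...)` with the same list before a long task. A non-empty result suggests the cache has probably been invalidated and a new session may be cheaper.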

Symptom 2: 1M Context Window Trap (Official, Apr 2026)

Signs: Max plan quota gone in 15-19 minutes. Two questions consume 66% of quota.

Root cause: Claude Code uses a 1M-token context window by default. Near the compact threshold (~960K tokens), each API call sends ~960K tokens. A cache miss at this scale is catastrophic, because the whole context is processed as uncached input instead of a cheap cache read.
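The scale of the problem can be illustrated with some rough arithmetic. The sketch below compares one ~960K-token call on a cache hit and on a cache miss. The two multipliers are assumptions (cache reads at 0.1x and cache writes at 1.25x the base input price); this guide does not state them, and how they map onto plan quota is not known here.

```python
# Assumed price multipliers relative to the base input-token price.
# They are illustrative assumptions, not figures taken from this guide.
CACHE_READ_MULTIPLIER = 0.1
CACHE_WRITE_MULTIPLIER = 1.25


def input_cost_units(context_tokens, cache_hit):
    """Cost of sending the context once, in base-input-token equivalents."""
    multiplier = CACHE_READ_MULTIPLIER if cache_hit else CACHE_WRITE_MULTIPLIER
    return context_tokens * multiplier


context = 960_000  # approximate compact threshold named above
hit = input_cost_units(context, cache_hit=True)
miss = input_cost_units(context, cache_hit=False)
print(f"hit:  {hit:,.0f} token-equivalents")
print(f"miss: {miss:,.0f} token-equivalents ({miss / hit:.1f}x)")
```

Under these assumptions a single miss costs about as much as a dozen cached turns, which is consistent with two questions consuming a large share of quota.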

Official response: The Anthropic team pinned #45756 and is considering reducing the default to 400K.

Fix:

If you left a session idle for an hour or more, don't resume it; start fresh with /clear
Don't keep multiple terminal sessions open (idle sessions consume quota silently)
After /compact, start working immediately (don't leave and come back)

Symptom 3: Cache TTL Silent Regression

Signs: Costs have increased 20-32% since March 2026 without any workflow changes.

Root cause: Anthropic silently reduced prompt cache TTL from 1 hour to 5 minutes (#46829).

Fix: Keep sessions active. If you take a 10-minute break, your cache is gone. Consider shorter, focused sessions instead of long ones with breaks.
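As a rough check, the rule above can be written as a comparison between the idle time and the TTL. This is a minimal sketch: the 5-minute value comes from #46829 as reported here, and the function name `cache_likely_expired` is hypothetical.

```python
from datetime import datetime, timedelta

CACHE_TTL = timedelta(minutes=5)  # TTL reported in #46829


def cache_likely_expired(last_request, now, ttl=CACHE_TTL):
    """True if more than `ttl` has passed since the last request."""
    return now - last_request > ttl


last = datetime(2026, 4, 17, 12, 0)
print(cache_likely_expired(last, last + timedelta(minutes=3)))   # short pause
print(cache_likely_expired(last, last + timedelta(minutes=10)))  # coffee break
```

If the check returns True, the next turn will probably rebuild the cache, so a fresh, smaller session may be the cheaper choice.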

Symptom 4: cache_creation Inflation (v2.1.100+)

Signs: ~20K extra tokens per turn compared to v2.1.98.

Details: #46917 (60+ reactions). Same payload produces significantly more cache_creation tokens on newer versions.

Fix: Monitor with /cost. If cache_creation seems high, you are likely hitting this known server-side issue, which is being tracked in #46917.

Symptom 5: System Prompt Cache 94% Increase (v2.1.104+)

Signs: First-turn cache_creation jumps to ~96K tokens (was ~50K on v2.1.98).

Details: #47528. System prompt overhead nearly doubled. Compounds with Symptom 4. Short sessions hit hardest.

Fix: Pin to v2.1.98 with `npm i -g @anthropic-ai/claude-code@2.1.98`.
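To see which of the version-specific symptoms may apply to an installed version, the thresholds from Symptoms 4 and 5 can be encoded as a small lookup. This is only a sketch: `KNOWN_ISSUES`, `parse_version`, and `issues_for` are hypothetical names, the function expects a bare version string such as `2.1.104`, and the table covers only the two thresholds named in this guide.

```python
# First affected version for each issue, taken from the symptom headings.
KNOWN_ISSUES = [
    ((2, 1, 100), "#46917 cache_creation inflation (Symptom 4)"),
    ((2, 1, 104), "#47528 system prompt cache increase (Symptom 5)"),
]


def parse_version(text):
    """Turn '2.1.98' or 'v2.1.98' into a comparable tuple (2, 1, 98)."""
    return tuple(int(part) for part in text.strip().lstrip("v").split("."))


def issues_for(version_text):
    """List the known issues whose first affected version is <= this one."""
    version = parse_version(version_text)
    return [label for first_affected, label in KNOWN_ISSUES
            if version >= first_affected]


for v in ("2.1.98", "2.1.100", "2.1.104"):
    print(v, issues_for(v))
```

Comparing tuples rather than strings matters here, since `"2.1.98" > "2.1.100"` as text even though 98 is the older release.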

Symptom 6: Subagent File Conflicts

Signs: 100K+ tokens consumed in retry loops.

Root cause: Multiple subagents editing the same file → "File modified since read" → retry loop (#46968).

Fix: Give each subagent a separate file scope. Run same-file operations sequentially, not in parallel.
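One way to plan this is to group tasks by target file and release at most one task per file at a time. The sketch below only illustrates the scheduling idea; it does not call any Claude Code API, and `plan_batches` and the task tuples are hypothetical.

```python
from collections import defaultdict


def plan_batches(tasks):
    """Split (task_name, file_path) pairs into batches.

    Tasks in the same batch touch different files, so they could run in
    parallel. Tasks on the same file land in successive batches, in order.
    """
    queues = defaultdict(list)
    for name, path in tasks:
        queues[path].append(name)

    batches = []
    while any(queues.values()):
        # Take the next pending task from every file that still has one.
        batch = [queue.pop(0) for queue in queues.values() if queue]
        batches.append(batch)
    return batches


tasks = [
    ("rename-helper", "utils.py"),
    ("add-docstrings", "utils.py"),
    ("fix-route", "api.py"),
]
for i, batch in enumerate(plan_batches(tasks), start=1):
    print(f"batch {i}: {batch}")
```

Here the two `utils.py` tasks end up in different batches, so neither subagent would see the file change underneath it.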

Quick Diagnosis

Symptom	First check	Action
Sudden quota drop	`/cost` cache rate	If < 80%, cache is broken → new session
Gradual increase	Compare v2.1.98 vs current	cache_creation inflation → track #46917
High first-turn cost	First `/cost` after session start	96K+ = #47528 system prompt inflation → pin v2.1.98
Post-break spike	How long was the break?	> 5 min = cache expired (TTL regression)
Two questions, 66% gone	Context window size	1M trap → `/clear` and restart
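The first row's 80% threshold can be checked numerically if you have the per-request token counts. A minimal sketch follows, assuming the cache rate is cache reads divided by all input-side tokens. The field names follow the usage object of the Anthropic Messages API; whether /cost reports the same breakdown, and whether it defines the rate this way, are assumptions.

```python
def cache_read_rate(usage):
    """Share of input-side tokens that were served from cache (0.0 to 1.0)."""
    cached = usage.get("cache_read_input_tokens", 0)
    total = (
        cached
        + usage.get("cache_creation_input_tokens", 0)
        + usage.get("input_tokens", 0)
    )
    return cached / total if total else 0.0


def cache_looks_broken(usage, threshold=0.80):
    """Apply the table's rule: a rate below the threshold suggests a new session."""
    return cache_read_rate(usage) < threshold


healthy = {"input_tokens": 1_000, "cache_creation_input_tokens": 4_000,
           "cache_read_input_tokens": 95_000}
broken = {"input_tokens": 1_000, "cache_creation_input_tokens": 90_000,
          "cache_read_input_tokens": 9_000}
print(cache_looks_broken(healthy), cache_looks_broken(broken))
```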

Free Diagnostic Tools

Token Checkup — Diagnose consumption patterns
Cache Health Checker — Check cache read rate
Security Checkup — Check for deny rules bypass

Full Guide

For comprehensive token optimization (CLAUDE.md patterns, hook-based automation, workflow design, real before/after data, copy-paste templates):

Token Book — Cut Your Claude Code Token Usage in Half (¥2,500 / ~$17, 10 chapters, 44,000 words)

From cc-safe-setup — 667 hooks, 9,200+ tests. Install: npx cc-safe-setup