prove-it

May 9, 2026

A Claude Code skill that challenges tests to actually catch bugs.

Install

In Claude Code, run:

/plugin marketplace add josharsh/prove-it

Or manually:

mkdir -p ~/.claude/skills/prove-it
curl -sL https://raw.githubusercontent.com/josharsh/prove-it/main/skills/prove-it/SKILL.md \
  -o ~/.claude/skills/prove-it/SKILL.md

AI-generated tests look great on paper. High coverage, all green, clean output. Then you actually read them and find tests that never call the function they're testing, assertions so loose they'd pass on any input, and expected values computed with the same logic as the implementation.

Coverage was high. Confidence was zero.

/prove-it makes Claude skeptical about its own tests. Every test gets a verdict: PASSES, WEAK, or THEATER.

Demo

You: /prove-it check

## Test Review: cart.test.ts

### "calculates total correctly"
- **Verdict:** THEATER
- **Issue:** Test computes its own total with reduce() but never calls calculateTotal()
- **Fix:** Replace manual reduce with `expect(calculateTotal(items)).toBe(35)`

### "fetches user by id"
- **Verdict:** WEAK
- **Issue:** toBeDefined() passes even if getUser returns { name: "WRONG" }
- **Fix:** Assert on specific fields: `expect(user.name).toBe('Alice')`

### "validates email format"
- **Verdict:** PASSES
- **Issue:** None — checks both valid and invalid inputs with specific values

### Summary
- 3 tests reviewed
- 1 solid, 1 weak, 1 theater
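
To make the THEATER verdict concrete, here is a minimal, framework-free sketch of the first finding. The `Item` shape, the sample items, and the deliberately buggy `calculateTotal` are assumptions made up for illustration; none of them come from the prove-it repository. Plain boolean functions stand in for `expect(...)` so the snippet runs without a test runner.

```typescript
// Hypothetical cart item; the real cart.test.ts is not shown in this README.
interface Item {
  price: number;
  qty: number;
}

// Hypothetical implementation with a deliberate bug: it ignores quantity.
function calculateTotal(items: Item[]): number {
  return items.reduce((sum, item) => sum + item.price, 0);
}

const items: Item[] = [
  { price: 10, qty: 2 },
  { price: 15, qty: 1 },
];

// THEATER: recomputes the total inside the test and never calls calculateTotal().
function theaterTest(): boolean {
  const total = items.reduce((sum, item) => sum + item.price * item.qty, 0);
  return total === 35;
}

// Fixed: calls the code under test and compares against a hardcoded value,
// the plain-boolean analogue of expect(calculateTotal(items)).toBe(35).
function fixedTest(): boolean {
  return calculateTotal(items) === 35;
}

console.log("theater test passes:", theaterTest()); // true, although the code is broken
console.log("fixed test passes:", fixedTest()); // false, the bug is caught
```

The theater test stays green however `calculateTotal` behaves, because it never touches that function. The fixed version fails as soon as the implementation drifts from the hardcoded 35.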

The Six Checks

1. Does it call the actual code? Tests that construct and assert on their own data are theater.
2. Would it fail on a wrong value? toBeDefined() and not.toThrow() are almost always too loose.
3. Is the expected value hardcoded? If the test recomputes the answer with the same logic, both can be wrong together.
4. Are edge cases covered? Empty input, null, boundaries, error paths: at least 2 per function.
5. Does the mock match reality? Mocks that only return happy-path data hide real failures.
6. Is the test name honest? A test named "validates email" should actually check validation results.
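
One rough way to apply the "would it fail on a wrong value?" check by hand is to run the same test against a correct implementation and a deliberately broken one. The sketch below assumes a hypothetical `getUser` signature and `User` shape, and it illustrates the idea only, not how the skill works internally. Tests are written as plain boolean functions so the snippet runs without a test framework.

```typescript
// Hypothetical types; not taken from the prove-it repository.
interface User {
  id: number;
  name: string;
}
type GetUser = (id: number) => User | undefined;

// A correct implementation: only user 1 exists.
const correctGetUser: GetUser = (id) =>
  id === 1 ? { id: 1, name: "Alice" } : undefined;

// A broken implementation: always returns a user, with the wrong name.
const brokenGetUser: GetUser = (id) => ({ id, name: "WRONG" });

// WEAK: the plain-boolean analogue of expect(user).toBeDefined().
const weakTest = (getUser: GetUser): boolean => getUser(1) !== undefined;

// Stronger: asserts a specific field and one edge case (unknown id).
const strongTest = (getUser: GetUser): boolean =>
  getUser(1)?.name === "Alice" && getUser(999) === undefined;

console.log("weak vs correct:", weakTest(correctGetUser)); // true
console.log("weak vs broken:", weakTest(brokenGetUser)); // true, the bug slips through
console.log("strong vs correct:", strongTest(correctGetUser)); // true
console.log("strong vs broken:", strongTest(brokenGetUser)); // false, the bug is caught
```

A test that gives the same result for both implementations cannot tell them apart, which is what the WEAK verdict flags.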

How It Works

Two modes:

Automatic: after writing any test, Claude reviews it against the six checks before showing it to you. Bad tests get fixed before you see them.
On demand: say "prove it", "review my tests", or run /prove-it check to review existing tests.

Commands

Command	What it does
`/prove-it`	Activate. Reviews tests automatically as you write them
`/prove-it check`	Review the most recent test file right now

Testing

Tested with skillmother:

skillmother test skills/prove-it/

Uninstalling

rm -rf ~/.claude/skills/prove-it

License

MIT