Local CLI: tools/checkyourself.py

May 29, 2026

CheckYourself is still a folder-first audit system, but the CLI is now the deterministic engine an agent can drive.

It uses only the Python standard library, sends nothing over the network, and never prints secret values. The AI still supplies judgment. The CLI supplies repeatable receipts: discovery, schemas, coverage checks, scoring, backlog ranking, validation, and the thin MCP wrapper.

Fast Start

# Backward-compatible scan path
python3 tools/checkyourself.py /path/to/your/project

# Explicit scan subcommand
python3 tools/checkyourself.py scan /path/to/your/project

# Diagnostic alias for agent workflows that use that word
python3 tools/checkyourself.py diagnostic /path/to/your/project

# Machine-readable scan
python3 tools/checkyourself.py scan . --format json --no-write

# Discover every command and schema
python3 tools/checkyourself.py describe --format json

Command Map

Command	Purpose
`describe`	Emits the full machine-readable capability manifest.
`scan`	Detects stack signals and deterministic local findings.
`diagnostic`	Alias for `scan`, kept so docs and agents can use the natural workflow word.
`scan --deep`	Adds slower validation checks for detected surfaces: mutable GitHub Action refs, missing dependency-update automation, and missing sensitive-file `.gitignore` patterns.
`coverage --emit`	Writes the 20-surface coverage skeleton for an agent to fill. Use `--format json` for stdout.
`coverage --check FILE`	Checks a filled coverage artifact for completeness.
`score --findings FILE [--coverage FILE]`	Computes the deterministic Production Reality Score or a low-confidence scan-derived estimate.
`backlog --findings FILE`	Ranks the complete remediation backlog.
`next --findings FILE`	Returns the next safest unresolved approval batch.
`validate --kind KIND FILE`	Validates JSON against bundled schema contracts.
`schema NAME`	Prints a bundled JSON schema.
`init [PROJECT]`	Creates starter generated context and coverage files.
`mcp`	Runs the stdio MCP server over the same functions.

Typical Agent Pipeline

python3 tools/checkyourself.py describe --format json > CHECKYOURSELF_CAPABILITIES.generated.json
python3 tools/checkyourself.py scan . --format json --no-write > CHECKYOURSELF_SCAN.generated.json
python3 tools/checkyourself.py coverage --emit

The agent then fills CHECKYOURSELF_COVERAGE.generated.json with evidence from the full diagnostic and can run:

python3 tools/checkyourself.py coverage --check CHECKYOURSELF_COVERAGE.generated.json
python3 tools/checkyourself.py score --findings CHECKYOURSELF_SCAN.generated.json --coverage CHECKYOURSELF_COVERAGE.generated.json --format json
python3 tools/checkyourself.py backlog --findings CHECKYOURSELF_SCAN.generated.json --format json
python3 tools/checkyourself.py next --findings CHECKYOURSELF_SCAN.generated.json --format json

That makes the score and first batch reproducible. Same evidence, same score. No vibes with a clipboard.
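For agents that prefer to drive the pipeline from Python instead of a shell, the same commands can be expressed as argument lists. This is a minimal sketch: the helper names (`pre_coverage_steps`, `post_coverage_steps`, `run_steps`) are hypothetical, and the only flags used are the ones shown above. It stops at the first non-zero exit code.

```python
import subprocess
import sys
from pathlib import Path

CLI = "tools/checkyourself.py"
SCAN = "CHECKYOURSELF_SCAN.generated.json"
COVERAGE = "CHECKYOURSELF_COVERAGE.generated.json"


def pre_coverage_steps(project="."):
    """Steps to run before the agent fills the coverage file.

    Each step is (cli_args, stdout_file); stdout_file mirrors the shell `>` redirect.
    """
    return [
        (["describe", "--format", "json"], "CHECKYOURSELF_CAPABILITIES.generated.json"),
        (["scan", project, "--format", "json", "--no-write"], SCAN),
        (["coverage", "--emit"], None),
    ]


def post_coverage_steps():
    """Steps to run once the coverage file has been filled with evidence."""
    return [
        (["coverage", "--check", COVERAGE], None),
        (["score", "--findings", SCAN, "--coverage", COVERAGE, "--format", "json"], None),
        (["backlog", "--findings", SCAN, "--format", "json"], None),
        (["next", "--findings", SCAN, "--format", "json"], None),
    ]


def run_steps(steps):
    """Run steps in order; return the first non-zero exit code, or 0 if all succeed."""
    for args, stdout_file in steps:
        done = subprocess.run([sys.executable, CLI, *args], capture_output=True, text=True)
        if done.returncode != 0:
            return done.returncode
        if stdout_file:
            Path(stdout_file).write_text(done.stdout)
    return 0
```

Call `run_steps(pre_coverage_steps("."))`, let the agent fill the coverage file, then call `run_steps(post_coverage_steps())`.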

See the field notes behind this remediation in docs/postmortems/checkyourself-field-postmortem-2026-05-29.md.

Scan

python3 tools/checkyourself.py scan /path/to/project
python3 tools/checkyourself.py scan . --json
python3 tools/checkyourself.py scan . --json -
python3 tools/checkyourself.py scan . --format json --no-write
python3 tools/checkyourself.py scan . --ci
python3 tools/checkyourself.py scan . --deep --format json --no-write

scan detects stack signals, dependencies, scripts, env files, tests, CI, risk-surface path hints, and obvious deterministic risks:

high-confidence credential-shaped values;
lower-confidence secret-like assignments without known credential shapes;
real .env files that may be committed;
missing .env.example;
missing tests;
missing CI;
payments dependencies without tests.

Package scripts are redacted before they appear in JSON or Markdown output. If a script contains a credential-shaped value, the value is replaced with [REDACTED].

Evidence includes file, line, matched pattern type, confidence, and redacted context. Known credential shapes can still create P0 findings. Name-only or assignment-only signals are lower severity so schema fields like feedbackToken do not wreck the score just for having an unfortunate name. Env example variants such as .env.dogfood.example, commented placeholders, and obvious placeholder values are treated as setup documentation instead of real local secrets.
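To make the redaction idea concrete, here is a minimal sketch of how a credential-shaped value could be replaced before it reaches any output. The two patterns and the evidence field names are assumptions for illustration; the CLI's actual pattern list and JSON keys are not documented in this section.

```python
import re

# Hypothetical credential shapes, for illustration only.
CREDENTIAL_SHAPES = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github-token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
}


def redact(line):
    """Return (redacted_line, matched_pattern_types) for one line of text."""
    matched = []
    for name, pattern in CREDENTIAL_SHAPES.items():
        if pattern.search(line):
            matched.append(name)
            line = pattern.sub("[REDACTED]", line)
    return line, matched


def evidence(file, line_no, line):
    """Build an evidence record with redacted context, or None if nothing matched."""
    redacted, matched = redact(line)
    if not matched:
        return None
    return {
        "file": file,
        "line": line_no,
        "pattern": matched[0],     # assumed key name
        "confidence": "high",      # known shape, so high confidence
        "context": redacted.strip(),
    }
```

A name-only signal such as `feedbackToken: string` matches neither shape, so this sketch produces no high-confidence evidence for it.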

Projects can suppress reviewed false positives with .checkyourself.yml:

version: 1
suppress:
  - id: CY-001
    reason: "feedbackToken is a UUID reference, not a credential"
    files: ["src/dispatcher/tool-registry.ts"]
    reviewed_by: simon
    reviewed_at: "2026-05-29"

Suppressed findings remain in JSON with status: "suppressed" and a suppression note, but they do not count toward severity totals or score caps.
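The suppression behavior can be sketched as follows, assuming the rules have already been parsed into dictionaries and that findings carry `id`, `file`, and `severity` keys. Those key names and the `suppression_note` field are assumptions, not the CLI's confirmed schema.

```python
from collections import Counter


def apply_suppressions(findings, rules):
    """Mark findings matched by a reviewed rule as suppressed; never drop them."""
    result = []
    for finding in findings:
        finding = dict(finding)  # copy so the caller's data is untouched
        for rule in rules:
            files = rule.get("files")
            same_id = rule["id"] == finding["id"]
            file_ok = not files or finding.get("file") in files
            if same_id and file_ok:
                finding["status"] = "suppressed"
                finding["suppression_note"] = rule["reason"]  # assumed key name
                break
        result.append(finding)
    return result


def severity_totals(findings):
    """Count findings per severity, skipping suppressed ones."""
    return Counter(f["severity"] for f in findings if f.get("status") != "suppressed")
```

Because suppressed findings stay in the list, a reviewer can still audit every suppression later; only the totals ignore them.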

scan --deep is still intentionally conservative. It validates a few detected surfaces instead of pretending to be a full SAST platform: mutable GitHub Action refs, missing dependency update automation, and missing sensitive-file .gitignore patterns.
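As an illustration of the first deep check, the sketch below flags `uses:` references that are not pinned to a full 40-character commit SHA. It is a line-based approximation written for this page, not the CLI's implementation; it skips local actions and `docker://` images, which have no git ref.

```python
import re

USES_LINE = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s#]+)")
FULL_SHA = re.compile(r"^[0-9a-f]{40}$")


def mutable_action_refs(workflow_text):
    """Return (line_number, target) for each action not pinned to a full commit SHA."""
    hits = []
    for line_no, line in enumerate(workflow_text.splitlines(), start=1):
        match = USES_LINE.match(line)
        if not match:
            continue
        target = match.group(1).strip("'\"")
        if target.startswith("./") or target.startswith("docker://"):
            continue  # local actions and container images carry no git ref
        _, _, ref = target.partition("@")
        # Tags, branches, and a missing ref can all move, so they count as mutable.
        if not FULL_SHA.match(ref):
            hits.append((line_no, target))
    return hits
```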

The scan is not a clean bill of health. It is cheap evidence for the full CheckYourself diagnostic.

Coverage

python3 tools/checkyourself.py coverage --emit
python3 tools/checkyourself.py coverage --emit --format json > CHECKYOURSELF_COVERAGE.generated.json
python3 tools/checkyourself.py coverage --check CHECKYOURSELF_COVERAGE.generated.json

In text mode, coverage --emit writes CHECKYOURSELF_COVERAGE.generated.json in the current directory. Use --out PATH to choose a path, or --format json when another tool wants stdout.

Coverage has 20 surfaces. Each surface must be marked:

Pass;
Finding;
Unknown;
NotApplicable.

Pass needs evidence. Unknown needs a note on what evidence is missing. NotApplicable needs a reason.
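The completeness rules above can be sketched as a small check. The entry keys (`surface`, `status`, `evidence`, `missing_evidence`, `reason`) are assumptions for illustration; the bundled coverage schema is the real contract. No extra field is required for Finding here because this section does not state one.

```python
# Field each status must carry; key names are assumed, not the bundled schema.
REQUIRED_FIELD = {
    "Pass": "evidence",
    "Finding": None,
    "Unknown": "missing_evidence",
    "NotApplicable": "reason",
}
EXPECTED_SURFACES = 20


def check_coverage(surfaces):
    """Return a list of human-readable problems; an empty list means complete."""
    problems = []
    if len(surfaces) != EXPECTED_SURFACES:
        problems.append(f"expected {EXPECTED_SURFACES} surfaces, got {len(surfaces)}")
    for entry in surfaces:
        name = entry.get("surface", "<unnamed>")
        status = entry.get("status")
        if status not in REQUIRED_FIELD:
            problems.append(f"{name}: status must be one of {sorted(REQUIRED_FIELD)}")
            continue
        field = REQUIRED_FIELD[status]
        if field and not entry.get(field):
            problems.append(f"{name}: {status} requires '{field}'")
    return problems
```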

Scoring

python3 tools/checkyourself.py score --findings findings.json --coverage coverage.json --format json
python3 tools/checkyourself.py score --findings CHECKYOURSELF_SCAN.generated.json --format json

The score uses the weights and caps from 02_RUN_DIAGNOSTIC/scoring-method.md:

unresolved P0 caps the score at 49;
unresolved P1 caps the score at 74;
missing critical evidence caps at 84;
scores above 90 require evidence for tests, secrets, deploy/rollback, observability, auth, and data boundaries.
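The cap rules above reduce to taking the minimum of the raw score and every cap that applies. The sketch below assumes the raw score has already been computed from the category weights, which are defined in the scoring method file and not reproduced here. The cap names and area identifiers are hypothetical, and reading the last rule as a cap of 90 is this sketch's interpretation.

```python
# Areas that must have evidence before a score may exceed 90 (identifiers assumed).
HIGH_SCORE_EVIDENCE = (
    "tests", "secrets", "deploy_rollback", "observability", "auth", "data_boundaries",
)


def apply_caps(raw_score, unresolved_severities, critical_evidence_missing, evidenced_areas):
    """Return (capped_score, names of caps whose limit is below the raw score)."""
    caps = []
    if "P0" in unresolved_severities:
        caps.append(("unresolved-p0", 49))
    if "P1" in unresolved_severities:
        caps.append(("unresolved-p1", 74))
    if critical_evidence_missing:
        caps.append(("missing-critical-evidence", 84))
    if not set(HIGH_SCORE_EVIDENCE) <= set(evidenced_areas):
        caps.append(("high-score-evidence", 90))
    score = min([raw_score] + [limit for _, limit in caps])
    applied = [name for name, limit in caps if limit < raw_score]
    return score, applied
```

Since every input is explicit and the function has no hidden state, the same evidence always yields the same capped score.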

The result includes per_category penalties, caps applied, confidence, and the finding IDs scored.

With --coverage, the result is score_mode: "coverage-backed" and the normal coverage caps apply. Without --coverage, the CLI produces a scan-derived-estimate when the findings file is scan JSON. It uses only the evidence the scan actually observed, keeps confidence low, and returns manual_evidence_needed so nobody mistakes the estimate for launch permission.

Every score appends a receipt to .checkyourself-score-history.json beside the findings file by default. Use --history PATH to choose a file, --note to add context, or --no-history for disposable runs.
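A receipt log of this kind can be sketched in a few lines. The on-disk layout (a JSON array of receipts) and the receipt keys are assumptions made for this example; only the default file name and its location beside the findings file come from the text above.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

DEFAULT_HISTORY_NAME = ".checkyourself-score-history.json"


def append_receipt(findings_path, result, history_path=None, note=None):
    """Append one score receipt and return the history file path."""
    if history_path:
        path = Path(history_path)
    else:
        # Default: next to the findings file.
        path = Path(findings_path).parent / DEFAULT_HISTORY_NAME
    history = json.loads(path.read_text()) if path.exists() else []
    receipt = {
        "recorded_at": datetime.now(timezone.utc).isoformat(),  # assumed key
        "findings_file": str(findings_path),                    # assumed key
        "score": result["score"],
    }
    if note:
        receipt["note"] = note
    history.append(receipt)
    path.write_text(json.dumps(history, indent=2) + "\n")
    return path
```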

Validation

python3 tools/checkyourself.py schema scan
python3 tools/checkyourself.py validate --kind scan CHECKYOURSELF_SCAN.generated.json
python3 tools/checkyourself.py validate --kind score CHECKYOURSELF_SCORE.generated.json

Supported schema kinds:

capabilities;
scan;
coverage;
score;
backlog;
next;
report;
dashboard;
dashboard-data;
dashboard-html;
learning-plan.

Validation uses a small standard-library JSON Schema subset: required, type, enum, minimum, maximum, properties, and items.
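A validator for that keyword subset fits in one recursive function. The following is an illustrative sketch and not the CLI's own code: it handles only the seven keywords listed above, supports a single string `type` (not a list of types), and reports errors with a simple `$`-rooted path.

```python
JSON_TYPES = {
    "object": dict, "array": list, "string": str, "boolean": bool,
    "integer": int, "number": (int, float), "null": type(None),
}


def validate(value, schema, path="$"):
    """Return a list of error strings; an empty list means the value is valid."""
    expected = schema.get("type")
    if expected is not None:
        ok = isinstance(value, JSON_TYPES[expected])
        # bool is a subclass of int in Python, but JSON booleans are not numbers.
        if expected in ("integer", "number") and isinstance(value, bool):
            ok = False
        if not ok:
            return [f"{path}: expected {expected}"]
    errors = []
    if "enum" in schema and value not in schema["enum"]:
        errors.append(f"{path}: not one of {schema['enum']}")
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        if "minimum" in schema and value < schema["minimum"]:
            errors.append(f"{path}: below minimum {schema['minimum']}")
        if "maximum" in schema and value > schema["maximum"]:
            errors.append(f"{path}: above maximum {schema['maximum']}")
    if isinstance(value, dict):
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}: missing required '{key}'")
        for key, sub_schema in schema.get("properties", {}).items():
            if key in value:
                errors.extend(validate(value[key], sub_schema, f"{path}.{key}"))
    if isinstance(value, list) and "items" in schema:
        for index, item in enumerate(value):
            errors.extend(validate(item, schema["items"], f"{path}[{index}]"))
    return errors
```

Keeping the subset this small is what lets validation stay inside the standard library.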

Exit Codes

Code	Meaning
`0`	Success; no gating condition.
`1`	Gating condition: `--ci` P0, invalid artifact, or incomplete coverage.
`2`	Usage/input error.
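A calling script can branch on these codes instead of parsing output. The sketch below is one hedged way to do that; the labels and the helper names (`classify`, `run_ci_scan`) are hypothetical, and any code outside the table is treated as unexpected.

```python
import subprocess
import sys

# Labels are illustrative; the numeric codes come from the table above.
EXIT_MEANINGS = {0: "success", 1: "gated", 2: "usage-error"}


def classify(returncode):
    """Map a CLI exit code to a short label."""
    return EXIT_MEANINGS.get(returncode, "unexpected")


def run_ci_scan(project="."):
    """Run `scan --ci` on a project and return the label for its exit code."""
    completed = subprocess.run(
        [sys.executable, "tools/checkyourself.py", "scan", project, "--ci"],
        check=False,  # a non-zero exit is data here, not an exception
    )
    return classify(completed.returncode)
```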

GitHub Action

This repo includes a composite action for projects that vendor or reference CheckYourself in CI:

name: Production Readiness Check
on: [pull_request]
jobs:
  checkyourself:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@<pinned-sha>
      - uses: KyaniteLabs/checkyourself/.github/actions/checkyourself@main
        with:
          fail-on-p0: "true"
          deep: "true"

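The action writes scan JSON, validates it, and fails the job when fail-on-p0 is enabled and unresolved P0 findings remain. The example references @main for brevity; for production use, pin the action ref to a commit SHA, since a mutable ref is exactly what scan --deep flags.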

MCP

The MCP wrapper is local stdio and thin by design:

python3 tools/checkyourself.py mcp

It exposes native tools for describe, scan, coverage_emit, coverage_check, score, backlog, next, validate, and schema.

See mcp.md.

API Decision

There is no hosted API in this repo.

The CLI is the canonical engine. MCP is a local convenience wrapper over that engine. A hosted API only makes sense if CheckYourself becomes a SaaS/team product with accounts, hosted runs, shared history, billing, or browser-only usage.