pskoett-ai-skills

May 26, 2026

A collection of skills for AI agents. Follows the Agent Skills specification. This repository is my personal skill testing ground.

Philosophy

Every skill in this collection is built around a philosophy: a principle that addresses a specific failure mode in how agents work today. plan-interview is about collaborative planning: before codebase exploration starts, user and agent run a structured interview to align on constraints, scope, risk, and success criteria, and to surface whether a preparatory refactor should come before the main change. intent-framed-agent makes execution intent explicit so scope drift becomes visible. context-surfing monitors context quality and exits cleanly before degradation corrupts output. verify-gate runs compile, test, and lint checks so that you never have to tell the agent its output was wrong when a test could have told it first. self-healing turns mid-task failures into verified, reusable artifacts instead of swept-under-the-rug retries. simplify-and-harden uses the peak context at end-of-task for a focused quality and security review. self-improvement turns repeated mistakes into durable rules that persist across sessions.

The common thread: agents have peak context at specific moments — after planning, mid-execution, at completion, after learning — and these skills are designed to exploit those peaks. Each skill encodes a philosophy that agents struggle to internalize on their own, turning it into a structured workflow they can follow reliably.

If you want to improve agent output over time, you need two loops, not one. The inner loop catches failures during a running session: the agent detects a problem, verifies its work against machine signals, and (with self-healing) recovers, files the verified fix as a reusable artifact, and continues, without you touching anything. The outer loop closes gaps across sessions: you capture where the agent failed, figure out what knowledge was missing, and encode it somewhere the agent can reach next time. learning-aggregator reads accumulated learnings across sessions and surfaces patterns. harness-updater encodes those patterns as permanent rules in project instruction files. eval-creator turns promoted rules into regression tests. pre-flight-check surfaces all of this at the start of the next session, closing the loop. The knowledge gaps shrink with every cycle because the encoded knowledge compounds.

skill-pipeline ties these pieces together by classifying the task and routing it through the right combination at the right depth.

Install

Install as a Claude Code plugin from this repo's marketplace. Run each command from inside Claude Code:

  1. Register this repo as a plugin marketplace:
    /plugin marketplace add pskoett/pskoett-skills
    
  2. Install the plugin from that marketplace:
    /plugin install pskoett-ai-skills@pskoett-skills
    
  3. Reload so skills, agents, and hooks register:
    /reload-plugins
    

This installs the full bundle: skills, audit agents, and hooks.

Codex

The same bundle now ships as a repo-local Codex plugin from plugin/.

  1. Open this repository in Codex.
  2. Restart Codex after pulling the latest repo state so it reloads repo marketplaces.
  3. Open the plugin directory, choose the pskoett skills marketplace, and install pskoett-ai-skills.

Codex reads the marketplace from .agents/plugins/marketplace.json and the plugin manifest from plugin/.codex-plugin/plugin.json.

GitHub Copilot CLI

The same bundle ships as a Copilot CLI plugin nested under plugin/.copilot-plugin/, reusing the shared plugin/skills/, plugin/agents/, and plugin/hooks/ content:

copilot plugin marketplace add pskoett/pskoett-skills
copilot plugin install pskoett-ai-skills

Copilot reads the marketplace from .github/plugin/marketplace.json and the plugin manifest from plugin/.copilot-plugin/plugin.json.

Individual skills via GitHub CLI (gh skill)

GitHub CLI now supports Agent Skills via gh skill. Requires GitHub CLI v2.90.0 or later.

# Browse this repo's skills interactively
gh skill install pskoett/pskoett-skills

# Install specific skills directly
gh skill install pskoett/pskoett-skills verify-gate
gh skill install pskoett/pskoett-skills self-healing
gh skill install pskoett/pskoett-skills simplify-and-harden
gh skill install pskoett/pskoett-skills self-improvement

# Target a specific host and scope when needed
gh skill install pskoett/pskoett-skills verify-gate --agent codex --scope user

gh skill installs to the correct skill directory for the selected host, including GitHub Copilot, Claude Code, Codex, Cursor, and Gemini CLI.

Individual skills via the Agent Skills CLI

If you only want specific skills and not the full plugin bundle:

npx skills add pskoett/pskoett-skills/skills/verify-gate
npx skills add pskoett/pskoett-skills/skills/self-healing
npx skills add pskoett/pskoett-skills/skills/simplify-and-harden
npx skills add pskoett/pskoett-skills/skills/self-improvement

Works with any agent following the Agent Skills specification.

Manual install

Clone and copy (or symlink) the skill directories you want:

git clone https://github.com/pskoett/pskoett-skills.git
cp -r pskoett-skills/skills/verify-gate ~/.claude/skills/

Structure

skills/
  skill-name/
    SKILL.md         # Required - skill definition with YAML frontmatter
    scripts/         # Optional - executable code
    references/      # Optional - documentation loaded on demand
    assets/          # Optional - templates, images, data files
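
The layout above can be checked mechanically. The following is a minimal sketch, not part of this repository: the function name `check_skill_dir` is hypothetical, and it assumes the YAML frontmatter in SKILL.md is delimited by a pair of `---` lines, which is the common convention but is not stated in the text above.

```python
from pathlib import Path

# Optional subdirectories named in the structure listing.
OPTIONAL_DIRS = ("scripts", "references", "assets")


def check_skill_dir(skill_dir: Path) -> list[str]:
    """Return a list of problems found in one skill directory (empty if none)."""
    problems = []
    skill_md = skill_dir / "SKILL.md"
    if not skill_md.is_file():
        problems.append("missing SKILL.md")
    else:
        lines = [line.strip() for line in skill_md.read_text(encoding="utf-8").splitlines()]
        # Assumption: frontmatter opens on the first line with '---' and closes with another '---'.
        if not lines or lines[0] != "---" or "---" not in lines[1:]:
            problems.append("SKILL.md has no YAML frontmatter block")
    for name in OPTIONAL_DIRS:
        path = skill_dir / name
        if path.exists() and not path.is_dir():
            problems.append(f"{name} exists but is not a directory")
    return problems
```

Calling `check_skill_dir(Path("skills/verify-gate"))` returns an empty list when the directory matches the layout, which makes it easy to run over every entry under skills/ before publishing.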

Skills

| Skill | Description |
| --- | --- |
| agent-teams-simplify-and-harden | Implementation + audit loop using parallel agent teams with structured simplify, harden, and document passes |
| context-surfing | Monitors context window health and rides peak context quality for maximum output fidelity during multi-step execution |
| intent-framed-agent | Captures a lightweight intent contract at execution start and monitors coding-task drift until resolution |
| plan-interview | Runs a structured interview before planning non-trivial implementations |
| self-healing | Active runtime recovery: diagnose, patch, verify, file the verified fix when a command, test, helper, env, or external service fails mid-task |
| self-improvement | Captures learnings and errors with hook-based activation and automatic skill extraction |
| skill-pipeline | Pipeline orchestrator that classifies tasks and routes them through the right skill combination at the right depth |
| simplify-and-harden | Post-completion self-review that runs simplify, harden, and micro-documentation passes before signaling done |
| verify-gate | Machine verification gate (compile, test, lint) between implementation and quality review with fix loop |
| learning-aggregator | Cross-session analysis of .learnings/ files: finds patterns, ranks promotion candidates |
| pre-flight-check | Session-start scan that surfaces relevant learnings, errors, and eval status before work begins |
| eval-creator | Creates permanent eval cases from promoted learnings and runs regression checks |

CI Skills (gh-aw) (beta)

Headless CI variants for GitHub Agentic Workflows. Each mirrors an interactive skill but runs without human interaction — scanning, reporting, and optionally gating PRs.

| Skill | Description |
| --- | --- |
| self-healing-ci | CI-only self-healing workflow: diagnoses failed PR checks, proposes verified patches as PR comments or label-gated commits, files HEAL entries to .learnings/HEALS.md |
| self-improvement-ci | CI-only self-improvement workflow for recurring failure-pattern capture using gh-aw |
| simplify-and-harden-ci | CI-only simplify/harden workflow for pull requests using gh-aw with headless scan/report gates |
| learning-aggregator-ci | CI-only cross-session learning aggregation: scheduled pattern detection and gap reporting using gh-aw |
| eval-creator-ci | CI-only eval regression runner: per-PR eval checks and scheduled eval creation from promoted patterns using gh-aw |

Two Loops

The skills implement two feedback loops that improve agent output over time.

Inner loop (within a session): detect → verify → recover

Outer loop (across sessions): inspect → encode → regress-test

Each skill prevents a distinct failure mode:

| Skill | Loop | Failure it prevents |
| --- | --- | --- |
| plan-interview |  | Building the wrong thing |
| intent-framed-agent | Inner (detect) | Scope creep during execution |
| context-surfing | Inner (detect + recover) | Degraded-context corruption |
| verify-gate | Inner (verify + recover) | Shipping code that doesn't compile or pass tests |
| self-healing | Inner (recover) | Mid-task failures becoming silent recurrences instead of verified, reusable fixes |
| simplify-and-harden | Inner (detect) | Shipping rough/insecure code |
| self-improvement | Bridge (capture) | Repeating the same mistakes |
| pre-flight-check | Bridge (surface) | Starting work blind to known patterns |
| learning-aggregator | Outer (inspect) | Accumulated learnings nobody reads |
| harness-updater | Outer (encode) | Patterns that never become rules |
| eval-creator | Outer (regress-test) | Fixed issues that silently regress |

Inner Loop Lifecycle

[plan-interview] → [intent-framed-agent] ⟂ [context-surfing] → [verify-gate] → [simplify-and-harden] → [self-improvement]
                                          ↑   concurrent    ↑    ↳ [self-healing] (on failure: diagnose → patch → verify → file HEAL)
                                                                  ↻ fix loop

Stage 1 — Planning (manual gate): plan-interview runs a structured interview and produces a plan file in docs/plans/. This is the only skill that requires explicit invocation (/plan-interview). Downstream skills activate automatically when present, but each works independently if earlier stages are skipped.

Stage 2 — Execution (concurrent monitoring): intent-framed-agent captures the intent frame and monitors scope drift. context-surfing monitors context quality drift. Both run simultaneously. If both fire at once, context-surfing's exit takes precedence — degraded context makes scope checks unreliable.
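
The precedence rule in Stage 2 can be written down as a tiny decision function. This is an illustrative sketch only: the `Signal` enum and `resolve` function are hypothetical names, and the skills themselves are prompt-driven workflows, not Python code.

```python
from enum import Enum


class Signal(Enum):
    NONE = "none"
    SCOPE_DRIFT = "scope_drift"      # would be raised by intent-framed-agent
    CONTEXT_EXIT = "context_exit"    # would be raised by context-surfing


def resolve(scope_drift: bool, context_degraded: bool) -> Signal:
    """Pick the signal to act on when the two monitors may fire at once."""
    # Degraded context makes scope checks unreliable, so the context exit wins.
    if context_degraded:
        return Signal.CONTEXT_EXIT
    if scope_drift:
        return Signal.SCOPE_DRIFT
    return Signal.NONE
```

Checking `context_degraded` first is the whole point: a scope-drift verdict produced under degraded context is not trusted, so it is never allowed to mask the exit.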

Stage 3 — Verification (machine gate): verify-gate runs the project's compile, test, and lint commands. If any fail, self-healing takes the diagnosis loop: identify root cause, write the fix (script, env tweak, alt command), verify by re-running, and file a HEAL- entry to .learnings/HEALS.md. Verify-gate then re-checks. Up to 3 attempts per phase before abandoning. Only when all checks pass does work proceed to the quality review.
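
The Stage 3 fix loop can be sketched as a bounded retry around each check. This is a sketch under stated assumptions, not the skill's implementation: `run_phase`, `verify_gate`, and the `check`/`heal` callables are hypothetical, and it assumes "up to 3 attempts" counts heal attempts after the first failed check.

```python
from typing import Callable, Optional

MAX_ATTEMPTS = 3  # "up to 3 attempts per phase" in the text above

Check = Callable[[], bool]    # hypothetical: runs compile, test, or lint
Heal = Callable[[int], None]  # hypothetical: diagnose, patch, file a HEAL entry


def run_phase(check: Check, heal: Heal, max_attempts: int = MAX_ATTEMPTS) -> bool:
    """Return True once check() passes; False if it still fails after max_attempts heals."""
    if check():
        return True
    for attempt in range(1, max_attempts + 1):
        heal(attempt)   # attempt a fix
        if check():     # verify by re-running the same check
            return True
    return False


def verify_gate(phases: list[tuple[str, Check, Heal]]) -> Optional[str]:
    """Run phases in order; return the first abandoned phase name, or None if all pass."""
    for name, check, heal in phases:
        if not run_phase(check, heal):
            return name
    return None
```

A return value of `None` corresponds to "all checks pass", the only condition under which work proceeds to the quality review.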

Stage 4 — Review (post-completion): simplify-and-harden runs three passes (simplify, harden, document) on the completed work.

Stage 5 — Learning (automatic): self-improvement captures recurring patterns from the session to .learnings/.

Outer Loop Lifecycle

.learnings/ → [learning-aggregator] → [harness-updater] → [eval-creator]

                                              [pre-flight-check] → next session

Inspect: learning-aggregator reads all .learnings/ files, groups by pattern, and ranks promotion candidates.
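
The group-and-rank step can be illustrated with a short sketch. The entry shape (a dict with a `pattern` key) and the `min_count` threshold are assumptions made for illustration; the actual format of .learnings/ entries and the ranking criteria are not specified here.

```python
from collections import Counter


def rank_promotion_candidates(entries: list[dict], min_count: int = 2) -> list[tuple[str, int]]:
    """Group entries by their 'pattern' key and rank recurring patterns by frequency."""
    counts = Counter(entry["pattern"] for entry in entries)
    # Assumption: a pattern must recur at least min_count times to be a candidate.
    recurring = [(pattern, n) for pattern, n in counts.items() if n >= min_count]
    # Most frequent first; ties broken alphabetically so the ranking is stable.
    return sorted(recurring, key=lambda item: (-item[1], item[0]))


entries = [
    {"pattern": "missing-env-var"},
    {"pattern": "flaky-test"},
    {"pattern": "missing-env-var"},
    {"pattern": "flaky-test"},
    {"pattern": "missing-env-var"},
    {"pattern": "one-off"},
]
print(rank_promotion_candidates(entries))
# → [('missing-env-var', 3), ('flaky-test', 2)]
```

The one-off entry is dropped because a single occurrence is treated as noise in this sketch, not as a pattern worth promoting to a rule.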

Encode: harness-updater agent takes promotion candidates and applies them as rules in CLAUDE.md, AGENTS.md, and copilot-instructions.md.

Regress-test: eval-creator turns promoted patterns into permanent test cases in .evals/ and runs regression checks.

Bridge: pre-flight-check surfaces accumulated learnings and eval status at session start, feeding outer loop improvements back into the inner loop.

Artifacts at each stage

| Stage | Artifact | Location |
| --- | --- | --- |
| Planning | Plan file | docs/plans/plan-NNN-<slug>.md |
| Execution | Intent frame | Emitted in session output |
| Execution | Handoff file (on drift exit) | .context-surfing/handoff-<slug>-<timestamp>.md |
| Verification | Pass/fail signal | Emitted in session output |
| Recovery | Verified heal entry + (lazy) artifacts | .learnings/HEALS.md, .learnings/heals/<HEAL-ID>/ |
| Review | Structured YAML summary | Appended to task output |
| Learning | Learning entries | .learnings/LEARNINGS.md, ERRORS.md, FEATURE_REQUESTS.md |
| Aggregation | Gap report | Emitted by learning-aggregator |
| Encoding | Updated rules | CLAUDE.md, AGENTS.md |
| Regression | Eval cases + results | .evals/EVAL_INDEX.md, .evals/cases/ |

Pipeline depth

Every skill works standalone. The pipeline is the recommended combination, not a hard dependency — each skill silently adapts when upstream artifacts are absent.

Match depth to complexity:

| Task | Skills |
| --- | --- |
| Trivial (typo fix, rename) | None |
| Small (isolated bug fix) | verify-gate + self-healing (on failure) + simplify-and-harden |
| Medium (feature, multi-file) | intent-framed-agent + verify-gate + self-healing + simplify-and-harden |
| Large (refactor, new architecture) | Full inner loop pipeline |
| Long-running (multi-session) | Full inner loop (context-surfing is critical) |
| Periodic (weekly, sprint boundary) | Outer loop: learning-aggregator → harness-updater → eval-creator |
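
The depth table above can be read as a lookup from task class to skill list. The sketch below is only an illustration of that mapping: the class labels, the `skills_for` function, and the exact ordering of the full inner loop (taken from the lifecycle diagram, with self-healing placed after verify-gate) are assumptions, and skill-pipeline itself classifies tasks through its own instructions, not through this code.

```python
# Assumed ordering, based on the Inner Loop Lifecycle diagram.
FULL_INNER_LOOP = [
    "plan-interview",
    "intent-framed-agent",
    "context-surfing",
    "verify-gate",
    "self-healing",
    "simplify-and-harden",
    "self-improvement",
]

PIPELINE_DEPTH = {
    "trivial": [],
    "small": ["verify-gate", "self-healing", "simplify-and-harden"],
    "medium": ["intent-framed-agent", "verify-gate", "self-healing", "simplify-and-harden"],
    "large": FULL_INNER_LOOP,
    "long-running": FULL_INNER_LOOP,
    "periodic": ["learning-aggregator", "harness-updater", "eval-creator"],
}


def skills_for(task_class: str) -> list[str]:
    """Return the recommended skills for a task class (hypothetical labels)."""
    try:
        # Return a copy so callers cannot mutate the shared table.
        return list(PIPELINE_DEPTH[task_class])
    except KeyError:
        raise ValueError(f"unknown task class: {task_class!r}") from None
```

For example, `skills_for("small")` yields the three-skill combination from the table, while an unrecognized label raises `ValueError` so that a misclassified task does not fall through silently.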

Usage

To use a skill, add it to your agent's configuration or reference it directly.

Hook Setup

Skills with hooks register them via SKILL.md frontmatter when installed as a plugin. For standalone use, add to .claude/settings.json:

{
  "hooks": {
    "UserPromptSubmit": [{
      "matcher": "",
      "hooks": [
        {
          "type": "command",
          "command": "./skills/self-improvement/scripts/activator.sh"
        }
      ]
    }],
    "SessionStart": [{
      "matcher": "",
      "hooks": [
        {
          "type": "command",
          "command": "./skills/context-surfing/scripts/handoff-checker.sh"
        },
        {
          "type": "command",
          "command": "./skills/pre-flight-check/scripts/pre-flight.sh"
        }
      ]
    }],
    "PostToolUse": [{
      "matcher": "Bash",
      "hooks": [
        {
          "type": "command",
          "command": "./skills/self-improvement/scripts/error-detector.sh"
        }
      ]
    }]
  }
}

| Hook | Script | Skill | Purpose |
| --- | --- | --- | --- |
| UserPromptSubmit | activator.sh | self-improvement | Reminds to evaluate learnings after tasks |
| SessionStart | handoff-checker.sh | context-surfing | Detects unread handoff files from previous context exits |
| SessionStart | pre-flight.sh | pre-flight-check | Surfaces accumulated learnings, errors, and eval status |
| PostToolUse (Bash) | error-detector.sh | self-improvement | Detects command failures for automatic error logging |

All hooks are lightweight (~50-200 tokens) and output nothing when no signals exist.
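
The "output nothing when no signals exist" behavior can be sketched as follows. This is not the repository's handoff-checker.sh, which is a shell script whose logic is not shown here. It is a hypothetical Python equivalent that assumes any file matching `.context-surfing/handoff-*.md` counts as an unread handoff.

```python
from pathlib import Path


def handoff_notice(project_root: Path) -> str:
    """Return a one-line notice if handoff files exist, otherwise an empty string."""
    handoff_dir = project_root / ".context-surfing"
    files = sorted(handoff_dir.glob("handoff-*.md")) if handoff_dir.is_dir() else []
    if not files:
        return ""  # no signal: the hook stays silent and costs no tokens
    names = ", ".join(f.name for f in files)
    return f"Unread handoff file(s) from a previous context exit: {names}"


if __name__ == "__main__":
    notice = handoff_notice(Path.cwd())
    if notice:
        print(notice)
```

Keeping the silent path as the default is what lets a hook run on every session start without adding noise to the agent's context.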

Contributing

Feel free to submit PRs with new skills or improvements to existing ones.