Skills - progressive-disclosure capability packs
May 21, 2026
Status: active (April 2026). Applies to: all CLI adapters spawned by Bernstein.
Motivation
Loading the full templates/roles/<role>/system_prompt.md into every
agent spawn pays the token bill on every retry and fork, whether or
not the agent exercises the deep guidance. Across 17 roles the bodies
average ~40 lines each.
A capability pack is a directory with SKILL.md (YAML frontmatter +
markdown body) and optional references/, scripts/, assets/
siblings. The resolver injects only an index (name + description
per skill) into the system prompt; agents pull the full body on demand
via load_skill when they decide a capability is relevant.
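The index injection described above can be sketched as follows. This is a minimal illustration, not Bernstein's actual resolver code: `SkillSummary`, `render_index`, the header wording, and the `qa` description are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkillSummary:
    """Hypothetical record holding only what the index needs."""
    name: str
    description: str


def render_index(skills: list[SkillSummary]) -> str:
    """Render one line per skill, sorted by name so the prompt is stable."""
    # The header text is an assumption; the real wording is not documented here.
    lines = ["Available skills (call load_skill(name) for the full body):"]
    for skill in sorted(skills, key=lambda s: s.name):
        lines.append(f"- {skill.name}: {skill.description}")
    return "\n".join(lines)


index = render_index([
    SkillSummary("qa", "Test strategy and edge-case hunting."),  # made-up description
    SkillSummary("backend", "Python server code, APIs, async, strict typing."),
])
print(index)
```

The index costs one short line per skill, which is why only it ships with every spawn while the bodies stay on disk until an agent asks for them.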
Directory layout
templates/
  roles/                 # legacy bodies (kept for backwards compat)
    backend/
      system_prompt.md
      task_prompt.md
      config.yaml
    …
  skills/                # new skill packs
    backend/
      SKILL.md
      references/
        python-conventions.md
        test-patterns.md
        error-handling.md
      scripts/
        lint.sh
    qa/
      SKILL.md
      references/
        test-strategy.md
        edge-cases.md
    …
Empty buckets (references/, scripts/, assets/) are omitted -
the manifest's corresponding field is just an empty list.
SKILL.md format
---
name: backend
description: Python server code, APIs, async, strict typing.
trigger_keywords: [python, backend, async, pyright]
references:
- python-conventions.md
- test-patterns.md
- error-handling.md
scripts:
- lint.sh
---
# Backend Engineering Skill
You are a backend engineer…
Descriptions stay terse (one line) because the full index ships in every spawn's system prompt. Every byte multiplies by the number of agents Bernstein launches.
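To make the file format concrete, here is a minimal sketch that separates the frontmatter from the markdown body using only the standard library. `split_skill_md` is a hypothetical helper, not part of Bernstein; a real loader would presumably pass the frontmatter text to a YAML parser afterwards.

```python
def split_skill_md(text: str) -> tuple[str, str]:
    """Split SKILL.md text into (frontmatter, body).

    Assumes the file starts with a '---' line and that the frontmatter
    ends at the next '---' line.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with a '---' frontmatter delimiter")
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            frontmatter = "\n".join(lines[1:i])
            body = "\n".join(lines[i + 1:]).strip()
            return frontmatter, body
    raise ValueError("unterminated frontmatter: missing closing '---'")


sample = """---
name: backend
description: Python server code, APIs, async, strict typing.
---
# Backend Engineering Skill
You are a backend engineer…
"""
frontmatter, body = split_skill_md(sample)
```

Only the `description` line from the frontmatter reaches the index; everything in `body` stays out of the prompt until it is loaded.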
Frontmatter schema
Defined by :class:bernstein.core.skills.SkillManifest (Pydantic,
extra="forbid" so typos fail loudly):
| field | type | notes |
|---|---|---|
| name | str | matches ^[a-z][a-z0-9-]*$ |
| description | str | 20-500 chars, shown in the index |
| trigger_keywords | list[str] | optional keyword hints |
| references | list[str] | files under <skill>/references/ |
| scripts | list[str] | files under <skill>/scripts/ |
| assets | list[str] | files under <skill>/assets/ |
| version | str | defaults to "1.0.0" |
| author | str \| None | optional |
Validation failures raise
:class:bernstein.core.skills.SkillManifestError with the originating
path baked into the message.
Resolution flow
bernstein.core.planning.role_resolver.resolve_role_prompt is called
once per spawn. It tries three things in order:
- Skill pack - templates/skills/<role>/SKILL.md exists → inject the compact index plus the matched skill body.
- Legacy role template - no skill pack, but templates/roles/<role>/system_prompt.md exists → render via the existing Jinja-style engine and inject that.
- Fallback stub - neither path found → "You are a <role> specialist."
The resolver is cached per (templates_dir, skills/ mtime) so dev
reloads pick up edits but production spawns do not re-parse 17 manifests
on every tick.
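The three-step fallback can be sketched like this. `resolve_role_prompt_sketch` is a hypothetical simplification of the real resolver: it skips template rendering, frontmatter stripping, and caching, and it takes the already-rendered index as a parameter.

```python
import tempfile
from pathlib import Path


def resolve_role_prompt_sketch(templates_dir: Path, role: str, index: str) -> str:
    """Return the prompt for a role, trying skill pack, legacy template, then stub."""
    skill_md = templates_dir / "skills" / role / "SKILL.md"
    if skill_md.exists():
        # 1. Skill pack: compact index plus the skill file
        #    (the real resolver would inject only the body, without frontmatter).
        return f"{index}\n\n{skill_md.read_text(encoding='utf-8')}"
    legacy = templates_dir / "roles" / role / "system_prompt.md"
    if legacy.exists():
        # 2. Legacy template: returned raw here; the real resolver renders it first.
        return legacy.read_text(encoding="utf-8")
    # 3. Fallback stub.
    return f"You are a {role} specialist."


# Walk through all three branches in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    stub = resolve_role_prompt_sketch(root, "qa", "INDEX")

    (root / "roles" / "qa").mkdir(parents=True)
    (root / "roles" / "qa" / "system_prompt.md").write_text("legacy", encoding="utf-8")
    legacy_prompt = resolve_role_prompt_sketch(root, "qa", "INDEX")

    (root / "skills" / "qa").mkdir(parents=True)
    (root / "skills" / "qa" / "SKILL.md").write_text("body", encoding="utf-8")
    skill_prompt = resolve_role_prompt_sketch(root, "qa", "INDEX")
```

Because the skill pack is checked first, adding `templates/skills/<role>/SKILL.md` is enough to take a role off its legacy template without deleting anything.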
load_skill MCP tool
Registered by :mod:bernstein.mcp.server under the name load_skill:
async def load_skill(
    name: str,
    reference: str | None = None,
    script: str | None = None,
) -> dict: ...
Returns JSON with:
- name - echoed back.
- body - SKILL.md body when reference and script are unset.
- available_references / available_scripts - always populated.
- reference_content - the requested reference's raw text (only when reference was passed).
- script_content - the requested script's raw text.
- error - populated when the skill / file could not be loaded.
Every invocation emits a skill_loaded WAL event (best-effort) with
name, reference, script, source, duration_s, and
error fields.
Sources
Skills are aggregated from multiple sources into a single
:class:SkillLoader. Name collisions abort startup with
:class:DuplicateSkillError - duplicate names across sources are never
silently shadowed.
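A minimal sketch of the collision check, assuming each source can report the skill names it provides. `merge_sources` and `DuplicateSkillErrorSketch` are hypothetical stand-ins for the loader and its exception.

```python
class DuplicateSkillErrorSketch(Exception):
    """Hypothetical stand-in for DuplicateSkillError."""


def merge_sources(sources: dict[str, list[str]]) -> dict[str, str]:
    """Map each skill name to the source that provides it; raise on any collision."""
    owners: dict[str, str] = {}
    for source_name, skill_names in sources.items():
        for skill in skill_names:
            if skill in owners:
                # Fail loudly instead of letting a later source shadow an earlier one.
                raise DuplicateSkillErrorSketch(
                    f"skill {skill!r} defined by both "
                    f"{owners[skill]!r} and {source_name!r}"
                )
            owners[skill] = source_name
    return owners


owners = merge_sources({
    "first-party": ["backend", "qa"],
    "plugin:my-data-pack": ["data-cleaning"],  # made-up skill name
})
```

Raising at startup means a plugin cannot quietly replace a first-party skill such as `backend`; the operator has to rename one of them.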
First-party
bernstein/templates/skills/ loaded by
:class:bernstein.core.skills.sources.LocalDirSkillSource.
Plugin packs
Register a zero-arg factory under bernstein.skill_sources:
[project.entry-points."bernstein.skill_sources"]
my-data-pack = "my_pack.skills:source"
Where my_pack/skills.py exposes:
from pathlib import Path

from bernstein.core.skills import SkillSource
from bernstein.core.skills.sources import LocalDirSkillSource


def source() -> SkillSource:
    return LocalDirSkillSource(
        Path(__file__).parent / "skills",
        source_name="plugin:my-data-pack",
    )
:func:bernstein.core.skills.sources.load_plugin_sources enumerates the
group at loader construction time. Broken factories log a warning and
are skipped rather than aborting startup - a noisy third-party bug
should not take down the orchestrator.
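The skip-on-failure behaviour can be sketched with the standard `importlib.metadata` API. `load_plugin_sources_sketch` is a hypothetical approximation of `load_plugin_sources`, and the `eps` parameter is an added seam for testing, not part of any documented signature.

```python
import logging
from importlib.metadata import entry_points

logger = logging.getLogger(__name__)


def load_plugin_sources_sketch(group: str = "bernstein.skill_sources", eps=None) -> list:
    """Call every zero-arg factory in the group; log and skip any that fail."""
    if eps is None:
        eps = entry_points(group=group)  # selection by group needs Python 3.10+
    sources = []
    for ep in eps:
        try:
            factory = ep.load()        # import the "module:attr" target
            sources.append(factory())  # zero-arg factory returns a skill source
        except Exception:
            # A broken third-party pack should not abort startup.
            logger.warning(
                "skill source %r failed to load; skipping", ep.name, exc_info=True
            )
    return sources


# Fake entry points standing in for installed plugins.
class _WorkingEntryPoint:
    name = "ok"

    def load(self):
        return lambda: "source-ok"


class _BrokenEntryPoint:
    name = "broken"

    def load(self):
        raise ImportError("boom")


loaded = load_plugin_sources_sketch(eps=[_BrokenEntryPoint(), _WorkingEntryPoint()])
```

Catching `Exception` around both `load()` and the factory call covers import errors as well as bugs inside the factory itself.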
CLI
bernstein skills list # compact table of every skill
bernstein skills show backend # print SKILL.md body
bernstein skills show backend --reference python-conventions.md
bernstein skills show backend --script lint.sh
Observability
Every load_skill invocation emits a structured
skill_loaded event; failed loads carry a populated error field. Hook it into the WAL
(src/bernstein/core/persistence/wal.py) or a Prometheus metric
(skill_load_total{name=..., source=...},
skill_load_duration_seconds{name=...}) to see:
- Which skills get exercised vs. sit dead.
- Whether agents converge on a small core set.
- Which references are worth keeping close and which can be retired.
Dead skills (zero loads in 30 days) become candidates for deprecation.
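The 30-day rule can be sketched as a set difference over recent events. `dead_skills` is a hypothetical helper, and the `timestamp` key is an assumption: the documented event fields do not include one, so it would have to come from the WAL entry or the metrics backend.

```python
from datetime import datetime, timedelta, timezone


def dead_skills(
    all_skills: set[str],
    events: list[dict],
    now: datetime,
    window_days: int = 30,
) -> list[str]:
    """Return skills with no skill_loaded event inside the window, sorted by name."""
    cutoff = now - timedelta(days=window_days)
    # 'timestamp' is an assumed field; see the note above.
    recently_loaded = {e["name"] for e in events if e["timestamp"] >= cutoff}
    return sorted(all_skills - recently_loaded)


now = datetime(2026, 5, 21, tzinfo=timezone.utc)
events = [
    {"name": "backend", "timestamp": now - timedelta(days=2)},
    {"name": "vp", "timestamp": now - timedelta(days=45)},
]
candidates = dead_skills({"backend", "qa", "vp"}, events, now)
```

Here `qa` was never loaded and `vp` was last loaded 45 days ago, so both come back as deprecation candidates while `backend` does not.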
Migration status
All 17 roles migrated to skill packs:
| role | references | notes |
|---|---|---|
| backend | python-conventions, test-patterns, error-handling + lint.sh | full split |
| qa | test-strategy, edge-cases | full split |
| security | owasp-top-10, auth-checklist, secrets-handling | full split |
| frontend | a11y, state-management | full split |
| devops | ci-patterns, docker-practices | full split |
| architect | adr-template, decomposition-principles | full split |
| docs | docstring-style, doc-structure + check-links.sh | full split |
| retrieval | hybrid-search, chunking | full split |
| ml-engineer | evaluation, reproducibility | full split |
| reviewer | review-rubric, feedback-tone | full split |
| manager | task-api, planning-rules | full split |
| vp | pivot-evaluation, cell-decomposition | full split |
| prompt-engineer | - | body small, no references |
| visionary | - | body is the output schema |
| analyst | - | body is the scoring rubric |
| resolver | - | single-purpose skill |
| ci-fixer | - | single-purpose skill |
Legacy templates/roles/<role>/system_prompt.md files remain on disk
for backwards compat.