Policy Quick Reference

February 26, 2026

Overview

Scan policies control all tuning knobs, detection thresholds, and rule enablement in Skill Scanner. Every setting has a sensible default; custom policies merge on top so you only specify what you want to change.
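
As a minimal sketch of that merge behaviour, a custom policy that changes a single setting could contain nothing else. The section and field names come from the file_limits reference further down; the value 250 is only an illustration.

```yaml
# custom.yaml: override one setting; every other setting keeps its default
file_limits:
  max_file_count: 250   # default is 100; illustrative value
```

Such a file would be passed with skill-scanner scan --policy /path/to/custom.yaml ./my-skill.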

This page is a compact reference. For full walkthroughs, see Custom Policy Configuration.

Presets

| Preset | Use case |
| --- | --- |
| balanced (default) | Good balance of detection and false-positive rate. Broad benign allowlists, demotion in docs, known installer domains trusted. |
| strict | Lowest thresholds, most sensitive. Scans all files (no inert extension skip), no known installer demotions, narrow allowlists. Best for untrusted/external skills and compliance audits. |
| permissive | Highest thresholds, fewer findings, broader allowlists. Best for trusted internal skills or high-FP workflows. |

skill-scanner scan --policy balanced ./my-skill
skill-scanner scan --policy strict ./my-skill
skill-scanner scan --policy /path/to/custom.yaml ./my-skill
skill-scanner generate-policy -o my_org_policy.yaml
skill-scanner configure-policy  # Interactive TUI

Use --preset strict|balanced|permissive with generate-policy to base a new file on a specific preset.

Most Common Tweaks

Copy-paste these into your policy YAML. You only need the sections you want to change.

CI strict mode

# Strict scanning for CI pipelines
analyzers:
  static: true
  bytecode: true
  pipeline: true
disabled_rules: []

Raise file limits for large projects

file_limits:
  max_file_count: 500
  max_file_size_bytes: 20971520  # 20 MB

Disable noisy rules

disabled_rules:
  - LAZY_LOAD_DEEP_NESTING
  - ARCHIVE_FILE_DETECTED
  - MANIFEST_DESCRIPTION_TOO_LONG

Override a rule severity

severity_overrides:
  - rule_id: BINARY_FILE_DETECTED
    severity: MEDIUM
    reason: "Our policy treats unknown binaries as medium risk"

Add custom benign dotfiles

hidden_files:
  benign_dotfiles:
    - ".bazelrc"
    - ".bazelversion"
    - ".terraform.lock.hcl"

Tune LLM context budgets

llm_analysis:
  max_instruction_body_chars: 40000   # double default
  max_code_file_chars: 30000
  max_total_prompt_chars: 200000
  meta_budget_multiplier: 2.0

Tighten detection thresholds

analysis_thresholds:
  zerowidth_threshold_with_decode: 30   # stricter (lower = more sensitive)
  zerowidth_threshold_alone: 150
  analyzability_low_risk: 95
  analyzability_medium_risk: 75

Section Reference

Each section below documents every field, its type, default, and what it affects.

file_limits — Numeric thresholds for file inventory and manifest checks

| Field | Type | Default | Affects |
| --- | --- | --- | --- |
| max_file_count | int | 100 | EXCESSIVE_FILE_COUNT |
| max_file_size_bytes | int | 5242880 (5 MB) | OVERSIZED_FILE |
| max_reference_depth | int | 5 | LAZY_LOAD_DEEP_NESTING |
| max_name_length | int | 64 | MANIFEST_INVALID_NAME |
| max_description_length | int | 1024 | MANIFEST_DESCRIPTION_TOO_LONG |
| min_description_length | int | 20 | SOCIAL_ENG_VAGUE_DESCRIPTION |

analysis_thresholds — Numeric thresholds for YARA and analyzability scoring

| Field | Type | Default | Affects |
| --- | --- | --- | --- |
| zerowidth_threshold_with_decode | int | 50 | Unicode steganography (with decode step) |
| zerowidth_threshold_alone | int | 200 | Unicode steganography (without decode) |
| analyzability_low_risk | int | 90 | LOW_ANALYZABILITY (score >= this = LOW risk) |
| analyzability_medium_risk | int | 70 | LOW_ANALYZABILITY (score >= this = MEDIUM risk) |
| min_dangerous_lines | int | 5 | HOMOGLYPH_ATTACK |
| min_confidence_pct | int | 80 | FILE_MAGIC_MISMATCH |
| exception_handler_context_lines | int | 20 | RESOURCE_ABUSE_INFINITE_LOOP |
| short_match_max_chars | int | 2 | Unicode steganography (short match filter) |
| cyrillic_cjk_min_chars | int | 10 | Unicode steganography (CJK suppression) |
| homoglyph_filter_math_context | bool | true | Suppress scientific/math contexts in HOMOGLYPH_ATTACK |
| homoglyph_math_aliases | list[str] | ["COMMON", "GREEK"] | Allowed confusable alias groups in math contexts |

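
To make the two analyzability thresholds concrete, here is a small worked sketch. It reads the table literally (a score at or above a threshold falls in that band); the sample scores are made up for illustration, and how the scanner treats scores below the medium threshold is not stated on this page.

```yaml
analysis_thresholds:
  analyzability_low_risk: 95      # default 90
  analyzability_medium_risk: 75   # default 70
# Illustrative scores, assuming "score >= threshold" as in the table:
#   score 97 -> LOW risk under the defaults and under these values
#   score 92 -> LOW risk by default (92 >= 90), MEDIUM risk here (92 < 95, 92 >= 75)
#   score 72 -> MEDIUM risk by default (72 >= 70), below the MEDIUM band here (72 < 75)
```
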
pipeline — Pipeline taint and tool-chaining analysis behaviour

| Field | Type | Default | Affects |
| --- | --- | --- | --- |
| known_installer_domains | set | various | URLs demoted to LOW when curl \| sh targets them |
| benign_pipe_targets | list | regex patterns | Benign pipe chains (e.g. cat \| grep) |
| doc_path_indicators | set | references, docs, etc. | Path segments marking documentation |
| demote_in_docs | bool | true | Demote findings in doc paths |
| demote_instructional | bool | true | Demote instructional patterns (e.g. SKILL.md) |
| check_known_installers | bool | true | Demote known installer URLs |
| dedupe_equivalent_pipelines | bool | true | Collapse equivalent pipeline detections from overlapping extraction passes |
| compound_fetch_require_download_intent | bool | true | Require explicit download intent for fetch+execute detection |
| compound_fetch_filter_api_requests | bool | true | Suppress API-request false positives in fetch+execute heuristics |
| compound_fetch_filter_shell_wrapped_fetch | bool | true | Suppress shell-wrapped fetch false positives |
| compound_fetch_exec_prefixes | list | wrapper commands | Allowed wrappers before execution sinks (for example sudo) |
| compound_fetch_exec_commands | list | execution sinks | Commands treated as execution sinks in fetch+execute detection |
| exfil_hints | list | send, upload, etc. | Hint words for exfiltration detection |
| api_doc_tokens | list | @app., app., etc. | Tokens suppressing tool-chaining FP |

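
As a hedged sketch of how the boolean switches above might be combined, the fragment below turns off three demotions so that pipeline findings in documentation and against known installer URLs keep their original severity. It uses only fields from the table; whether the strict preset sets exactly these values is not shown on this page.

```yaml
pipeline:
  demote_in_docs: false           # default true; keep severity for findings in doc paths
  demote_instructional: false     # default true; keep severity for instructional patterns
  check_known_installers: false   # default true; do not demote known installer URLs
```
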
file_classification — How file extensions are classified for analysis routing

| Field | Type | Default | Affects |
| --- | --- | --- | --- |
| inert_extensions | set | images, fonts, etc. | Skip binary checks on these |
| structured_extensions | set | svg, pdf, etc. | Not flagged as unknown binary |
| archive_extensions | set | zip, tar, etc. | Flagged as archives |
| code_extensions | set | py, sh, js, etc. | Code file detection |
| skip_inert_extensions | bool | true | Skip checks on inert files |
| allow_script_shebang_text_extensions | bool | true | Allow shebang headers for script-like text/code files |
| script_shebang_extensions | set | script extensions | Extensions treated as valid shebang script targets |

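
The strict preset is described as scanning all files with no inert extension skip. A custom policy could presumably get the same effect on top of another preset with a one-field override, sketched here using only the field documented above.

```yaml
file_classification:
  skip_inert_extensions: false   # default true; run checks on images, fonts, etc. as well
```
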
hidden_files — Dotfile/dotdir allowlists

Dotfiles and dotdirs not in these lists trigger HIDDEN_DATA_FILE / HIDDEN_DATA_DIR findings.

| Field | Type | Default | Affects |
| --- | --- | --- | --- |
| benign_dotfiles | set | preset-defined allowlist | HIDDEN_DATA_FILE |
| benign_dotdirs | set | preset-defined allowlist | HIDDEN_DATA_DIR |

rule_scoping — Restrict which rules fire on which file types

Reduces noise in doc-heavy skills.

| Field | Type | Default | Affects |
| --- | --- | --- | --- |
| skillmd_and_scripts_only | list | preset-defined set | Rules limited to SKILL.md + scripts |
| skip_in_docs | list | preset-defined set | Rules skipped in documentation directories |
| code_only | list | prompt_injection_unicode_steganography, sql_injection_generic | Rules only on code files |
| doc_path_indicators | set | references, docs, examples, etc. | Directory names marking "documentation" context |
| doc_filename_patterns | list | regex patterns | Filename patterns marking educational/example content |
| dedupe_reference_aliases | bool | true | De-dupes duplicate script references in SKILL.md parsing |
| dedupe_duplicate_findings | bool | true | De-dupes duplicate findings emitted across script/reference passes |
| asset_prompt_injection_skip_in_docs | bool | true | Suppresses ASSET_PROMPT_INJECTION findings in doc-style paths |

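
If documentation directories in your skills are not trusted, one possible adjustment is to stop suppressing asset prompt-injection findings there. This is a minimal sketch using a single field from the table; the list-valued fields are left at their preset values.

```yaml
rule_scoping:
  asset_prompt_injection_skip_in_docs: false   # default true; report ASSET_PROMPT_INJECTION in doc-style paths too
```
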
credentials — Suppress well-known test credentials and placeholders

| Field | Type | Default | Affects |
| --- | --- | --- | --- |
| known_test_values | set | Stripe test keys, JWT.io example, common placeholders | Exact-match suppression of credential findings |
| placeholder_markers | set | your-, example, placeholder, etc. | Substring match suppression of credential findings |

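
A sketch of adding organization-specific placeholder markers follows. Both marker strings are hypothetical, and this page does not say whether a list supplied here extends or replaces the preset set, so compare against a file produced by generate-policy before relying on it.

```yaml
credentials:
  placeholder_markers:
    - "changeme"       # hypothetical marker; matched as a substring
    - "<your-token>"   # hypothetical marker; matched as a substring
```
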
system_cleanup — Safe destructive cleanup targets

| Field | Type | Default | Affects |
| --- | --- | --- | --- |
| safe_rm_targets | set | dist, build, tmp, node_modules, etc. | DANGEROUS_CLEANUP finding suppression |

command_safety — Tiered command classification for code execution findings

| Field | Type | Default | Affects |
| --- | --- | --- | --- |
| safe_commands | set | read-only utilities (cat, ls, grep, etc.) | Commands always considered safe |
| caution_commands | set | cp, mv, find, git, npm, pip, etc. | Commands that need context to evaluate |
| risky_commands | set | rm, docker, ssh, kubectl, etc. | Commands flagged at MEDIUM severity |
| dangerous_commands | set | curl, wget, eval, exec, sudo, etc. | Commands flagged at HIGH/CRITICAL severity |
| dangerous_arg_patterns | list[regex] | 8 patterns (inline code exec, shell spawning, etc.) | Regex patterns that immediately classify a command as DANGEROUS |

sensitive_files — Regex patterns matching sensitive file paths

When a pipeline reads a matching file, the taint is upgraded to SENSITIVE_DATA.

| Field | Type | Default | Affects |
| --- | --- | --- | --- |
| patterns | list[regex] | /etc/passwd, ~/.ssh, .env, .pem, etc. | Pipeline taint upgrade to SENSITIVE_DATA |

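
The fragment below sketches what extra sensitive-path patterns could look like. Both patterns are hypothetical examples, the regex dialect is assumed to accept this basic syntax, and whether the list extends or replaces the defaults is not stated here.

```yaml
sensitive_files:
  patterns:
    # Single quotes keep the backslashes literal in YAML.
    - '\.kube/config$'       # hypothetical: Kubernetes client config
    - '\.aws/credentials$'   # hypothetical: AWS shared credentials file
```
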
llm_analysis — LLM context budget thresholds

Sets the context budgets for the LLM and meta analyzers. Content within budget is sent in full; content exceeding the budget is skipped entirely rather than truncated, and an LLM_CONTEXT_BUDGET_EXCEEDED finding is emitted at INFO severity.

| Field | Type | Default | Affects |
| --- | --- | --- | --- |
| max_instruction_body_chars | int | 20000 | Maximum character length for a single instruction body sent to the LLM |
| max_code_file_chars | int | 15000 | Maximum character length for a single code file sent to the LLM |
| max_referenced_file_chars | int | 10000 | Maximum character length for a single referenced file sent to the LLM |
| max_total_prompt_chars | int | 100000 | Maximum total characters across the entire LLM prompt |
| max_output_tokens | int | 8192 | Maximum output tokens for LLM responses (both LLM analyzer and meta-analyzer) |
| meta_budget_multiplier | float | 3.0 | Multiplier applied to all input limits above for the meta analyzer (e.g. 3x = 60K instruction, 45K/file, 300K total) |

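
The multiplier interacts with any limits you override, so it helps to work the arithmetic through once. The sketch below assumes the effective meta-analyzer budget is simply limit times multiplier, as the 3x example in the table suggests; the override values are illustrative.

```yaml
llm_analysis:
  max_instruction_body_chars: 40000
  max_code_file_chars: 30000
  max_total_prompt_chars: 200000
  meta_budget_multiplier: 2.0
# Effective meta-analyzer budgets under that assumption:
#   instruction body: 40000  * 2.0 = 80000
#   code file:        30000  * 2.0 = 60000
#   referenced file:  10000  * 2.0 = 20000   (default limit, not overridden)
#   total prompt:     200000 * 2.0 = 400000
```

Note that doubling the base limits while lowering the multiplier from 3.0 to 2.0 still raises the meta-analyzer budgets relative to the defaults (for example 80000 versus 60000 for the instruction body).
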
analyzers — Enable or disable built-in analysis passes

| Field | Type | Default | Affects |
| --- | --- | --- | --- |
| static | bool | true | Enable/disable YAML+YARA pattern analyzer |
| bytecode | bool | true | Enable/disable .pyc bytecode analyzer |
| pipeline | bool | true | Enable/disable shell pipeline taint analyzer |

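
For example, a skill set that ships no compiled Python might not need the bytecode pass. A minimal sketch that disables only that analyzer:

```yaml
analyzers:
  static: true      # default; YAML+YARA pattern analyzer stays on
  bytecode: false   # default true; skip the .pyc bytecode analyzer
  pipeline: true    # default; shell pipeline taint analyzer stays on
```
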
finding_output — Output normalization, dedupe behavior, and traceability metadata

| Field | Type | Default | Affects |
| --- | --- | --- | --- |
| dedupe_exact_findings | bool | true | Removes exact duplicates from overlapping analyzers |
| dedupe_same_issue_per_location | bool | true | Collapses same issue at same file/line/snippet/category across analyzers |
| same_issue_preferred_analyzers | list[str] | ["meta_analyzer", "llm_analyzer", ...] | Chooses which analyzer's details survive same-issue collapse |
| same_issue_collapse_within_analyzer | bool | true | If true, also collapses same-issue findings from one analyzer |
| annotate_same_path_rule_cooccurrence | bool | true | Adds same_path_other_rule_ids metadata for findings on the same path |
| attach_policy_fingerprint | bool | true | Adds policy name/version/fingerprint metadata to each finding |

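
When comparing what each analyzer reports for the same location, it can help to switch off same-issue collapsing while leaving exact-duplicate removal on. This is a sketch of one such debugging configuration, using only the fields above.

```yaml
finding_output:
  dedupe_exact_findings: true                  # default; still drop exact duplicates
  dedupe_same_issue_per_location: false        # default true; keep one finding per analyzer
  same_issue_collapse_within_analyzer: false   # default true; keep repeats from a single analyzer
```
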
severity_overrides — Raise or lower any rule's severity

| Field | Type | Default | Affects |
| --- | --- | --- | --- |
| severity_overrides | list[{rule_id, severity, reason}] | [] | Override finding severity per rule |

severity_overrides:
  - rule_id: BINARY_FILE_DETECTED
    severity: MEDIUM
    reason: "Our policy treats unknown binaries as medium risk"

disabled_rules — Completely suppress specific rule IDs

| Field | Type | Default | Affects |
| --- | --- | --- | --- |
| disabled_rules | list[str] | [] | Remove matching findings from results |

disabled_rules:
  - LAZY_LOAD_DEEP_NESTING
  - ARCHIVE_FILE_DETECTED