llm-sast-scanner

March 29, 2026 · View on GitHub

A general-purpose Static Application Security Testing (SAST) skill for LLM-based code vulnerability analysis. Designed to be loaded by AI coding agents (Claude Code, OpenAI Codex, etc.) to perform structured source-to-sink taint analysis across 34 vulnerability classes.

What It Does

This skill gives an LLM agent a structured, evidence-based workflow for finding security vulnerabilities in source code:

Load relevant vulnerability reference files for the target codebase
Map sources — identify all entry points where attacker-controlled data enters
Trace taint — follow data flow through transformations to potential sinks
Verify findings — apply a Judge step to eliminate false positives
Report — produce actionable findings with file path, line number, and remediation

Supports Java, Python, JavaScript/TypeScript, PHP, .NET with language-specific detection rules.

Installation

Git (recommended)

# Claude Code
git clone https://github.com/anthropic-lab/llm-sast-scanner.git
cp -r llm-sast-scanner/llm-sast-scanner/ ~/.claude/skills/

# OpenAI Codex
git clone https://github.com/anthropic-lab/llm-sast-scanner.git
cp -r llm-sast-scanner/llm-sast-scanner/ ~/.codex/skills/

Manual

Download and copy the llm-sast-scanner/ directory into your skills folder:

# Claude Code
cp -r llm-sast-scanner/ ~/.claude/skills/

# OpenAI Codex
cp -r llm-sast-scanner/ ~/.codex/skills/

Structure

llm-sast-scanner/              ← repo root
├── README.md
└── llm-sast-scanner/          ← skill directory (copy this)
    ├── SKILL.md               # 6-step workflow + Judge verification
    └── references/            # 34 vulnerability knowledge bases
        ├── xss.md
        ├── sql_injection.md
        ├── path_traversal_lfi_rfi.md
        └── ... (34 files total)

SKILL.md

The main entry point. Defines the detection workflow, taint propagation rules, and Judge verification protocol.

Advanced Usage Tips

Precompute call graph before scanning — improves cross-function reasoning and reduces missed paths
Run 2+ scanning rounds — increases recall and stabilizes findings via iterative refinement
Enforce per-finding validation — significantly reduces false positives through explicit verification

Vulnerability Coverage

34 reference files covering the following categories:

Injection

File	Vulnerability
`sql_injection.md`	SQL Injection (CWE-89)
`xss.md`	Cross-Site Scripting (CWE-79)
`ssti.md`	Server-Side Template Injection
`nosql_injection.md`	NoSQL Injection
`graphql_injection.md`	GraphQL Injection / Introspection Abuse
`xxe.md`	XML External Entity (CWE-611)
`rce.md`	Remote Code Execution / Command Injection
`expression_language_injection.md`	Expression Language Injection (SpEL, OGNL)

Access Control & Auth

File	Vulnerability
`idor.md`	Insecure Direct Object Reference
`privilege_escalation.md`	Privilege Escalation
`authentication_jwt.md`	JWT Vulnerabilities (alg:none, weak secret)
`default_credentials.md`	Hardcoded / Default Credentials
`brute_force.md`	Brute Force / Missing Rate Limiting
`business_logic.md`	Business Logic Flaws
`http_method_tamper.md`	HTTP Method Tampering
`verification_code_abuse.md`	Verification Code Abuse
`session_fixation.md`	Session Fixation (CWE-384)

Data Exposure & Crypto

File	Vulnerability
`weak_crypto_hash.md`	Weak Cryptography (CWE-327), Weak Hash (CWE-328), Weak Random (CWE-330)
`information_disclosure.md`	Sensitive Information Disclosure
`insecure_cookie.md`	Insecure Cookie Flags (CWE-614, CWE-1004)
`trust_boundary.md`	Trust Boundary Violation (CWE-501)

Server-Side Attacks

File	Vulnerability
`ssrf.md`	Server-Side Request Forgery
`path_traversal_lfi_rfi.md`	Path Traversal, LFI, RFI (CWE-22)
`insecure_deserialization.md`	Insecure Deserialization
`arbitrary_file_upload.md`	Arbitrary File Upload
`jndi_injection.md`	JNDI Injection (Log4Shell class)
`race_conditions.md`	Race Conditions / TOCTOU

Protocol & Infrastructure

File	Vulnerability
`csrf.md`	Cross-Site Request Forgery
`open_redirect.md`	Open Redirect
`smuggling_desync.md`	HTTP Request Smuggling / Desync
`denial_of_service.md`	Denial of Service / Resource Exhaustion
`cve_patterns.md`	Known CVE Patterns

Language / Platform

File	Vulnerability
`php_security.md`	PHP-specific security issues
`mobile_security.md`	Mobile security (Android / iOS)

Benchmark Results

Note: Scores are for reference only and may vary slightly depending on model compute adjustments.

Multi-Agent + Skill (Claude Opus 4.6 high, 2026-03-27)

4 Java benchmark projects scanned using Claude Opus 4.6 (high).

Scanned in parallel using 4 agents with the skill (full reference file loading + Judge verification). Blind scan — no ground truth access during analysis.

Project	Recall	Precision	F1	TP	FP
JavaSecLab	1.000	0.958	0.979	23	1
SecExample	1.000	1.000	1.000	9	0
VulnerableApp	1.000	1.000	1.000	10	0
verademo	1.000	1.000	1.000	14	0
Global	1.000	0.982	0.991	56	1

Multi-Agent + Skill (GPT-5.4 high, 2026-03-27)

Project	Recall	Precision	F1	TP	FN	FP
JavaSecLab	0.957	1.000	0.978	22	1	0
SecExample	0.889	1.000	0.941	8	1	0
VulnerableApp	0.900	0.900	0.900	9	1	1
verademo	0.929	1.000	0.963	13	1	0
Global	0.929	0.981	0.954	52	4	1