Agent Security Harness

May 24, 2026

PyPI version Python 3.10+ Apache 2.0 License Tests ClawScan Static Analysis VirusTotal

Even if an agent is properly authenticated and authorized, can it still be manipulated into unsafe or policy-violating behavior?

470 executable security tests across 32 modules. MCP + A2A + L402 + x402 wire-protocol testing. Decision-layer attack scenarios. One pip install away.

```
$ agent-security test mcp --url http://localhost:8080/mcp
Running MCP Protocol Security Tests v4.2...
 MCP-001: Tool List Integrity Check [PASS] (0.234s)
 MCP-002: Tool Registration via Call Injection [PASS] (0.412s)
 MCP-003: Capability Escalation via Initialize [FAIL] (0.156s)
...
Results: 8/10 passed (80% pass rate) - see report.json
```
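If you want to act on individual results in a script, the console output above can be parsed line by line. The following is a minimal sketch, not part of the harness: the line format is inferred only from the sample output shown here, and the names `RESULT_LINE`, `parse_results`, and `failed_ids` are hypothetical. The structure of `report.json` is not described in this README, so the sketch deliberately reads the console text instead.

```python
import re

# Matches result lines such as:
#   " MCP-003: Capability Escalation via Initialize [FAIL] (0.156s)"
# ASSUMPTION: this format is inferred from the sample output only.
RESULT_LINE = re.compile(
    r"^\s*(?P<test_id>[A-Z0-9]+-\d+):\s+(?P<name>.+?)\s+"
    r"\[(?P<status>PASS|FAIL)\]\s+\((?P<seconds>\d+(?:\.\d+)?)s\)\s*$"
)


def parse_results(output: str) -> list[dict]:
    """Return one dict per recognized result line; other lines are ignored."""
    results = []
    for line in output.splitlines():
        match = RESULT_LINE.match(line)
        if match:
            results.append(
                {
                    "test_id": match["test_id"],
                    "name": match["name"],
                    "passed": match["status"] == "PASS",
                    "seconds": float(match["seconds"]),
                }
            )
    return results


def failed_ids(results: list[dict]) -> list[str]:
    """Return the IDs of the tests that did not pass, in output order."""
    return [r["test_id"] for r in results if not r["passed"]]


SAMPLE = """\
Running MCP Protocol Security Tests v4.2...
 MCP-001: Tool List Integrity Check [PASS] (0.234s)
 MCP-002: Tool Registration via Call Injection [PASS] (0.412s)
 MCP-003: Capability Escalation via Initialize [FAIL] (0.156s)
"""

if __name__ == "__main__":
    print(failed_ids(parse_results(SAMPLE)))  # prints ['MCP-003']
```

Parsing console text is brittle across versions; if the report file's schema is documented for your release, reading that is likely the better choice.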

Quick Start

```bash
pip install agent-security-harness

# If 'agent-security' is not found, add ~/.local/bin to your PATH:
export PATH="$HOME/.local/bin:$PATH"

# See it work immediately (no server needed):
agent-security test mcp --simulate

# Then test your real MCP server:
agent-security test mcp --url http://localhost:8080/mcp

# Test an x402 payment endpoint
agent-security test x402 --url https://your-x402-endpoint.com
```

See docs/QUICKSTART.md for mock server setup, rate limiting, MCP server mode, and CI/CD integration.


Three Layers of Agent Decision Security

| Layer | What it covers | Example focus |
|---|---|---|
| Protocol Integrity | Prevent spoofing, replay, downgrade, diversion, and malformed protocol behavior | MCP, A2A, L402, x402 wire-level tests |
| Operational Governance | Validate session state, capability boundaries, platform actions, trust chains, and execution context | capability escalation, facilitator trust, provenance, session security |
| Decision Governance | Test whether an agent should act at all under its authority, confidence, scope, and policy constraints | autonomy scoring, scope creep, return-channel poisoning, normalization-of-deviance |
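To make the Protocol Integrity layer concrete, here is a small sketch of one kind of wire-level check: validating that a reply is a well-formed JSON-RPC 2.0 response (the message framing MCP uses). It illustrates the idea only and is not the harness's implementation; the function name `jsonrpc_response_problems` is hypothetical, and the rules encoded come from the JSON-RPC 2.0 specification rather than from this project.

```python
from typing import Any


def jsonrpc_response_problems(message: Any, expected_id: Any) -> list[str]:
    """Return a list of spec violations found in a JSON-RPC 2.0 response.

    An empty list means the envelope looks well formed. Only the envelope
    is checked; the contents of "result" are protocol specific.
    """
    if not isinstance(message, dict):
        return ["response is not a JSON object"]

    problems = []
    if message.get("jsonrpc") != "2.0":
        problems.append('"jsonrpc" must be exactly "2.0"')

    has_result = "result" in message
    has_error = "error" in message
    if has_result == has_error:
        problems.append('exactly one of "result" or "error" must be present')

    if has_error:
        error = message["error"]
        code = error.get("code") if isinstance(error, dict) else None
        text = error.get("message") if isinstance(error, dict) else None
        # bool is a subclass of int in Python, so exclude it explicitly.
        code_ok = isinstance(code, int) and not isinstance(code, bool)
        if not (code_ok and isinstance(text, str)):
            problems.append('"error" needs an integer "code" and a string "message"')

    if "id" not in message:
        problems.append('"id" is missing')
    elif type(message["id"]) is not type(expected_id) or message["id"] != expected_id:
        problems.append('"id" does not match the request')

    return problems


if __name__ == "__main__":
    good = {"jsonrpc": "2.0", "id": 1, "result": {"tools": []}}
    bad = {"jsonrpc": "2.0", "id": 2, "result": {}, "error": {"code": -32600, "message": "x"}}
    print(jsonrpc_response_problems(good, expected_id=1))  # prints []
    print(len(jsonrpc_response_problems(bad, expected_id=1)))  # prints 2
```

A mismatched `id` is worth flagging separately from malformed structure, since it can indicate a replayed or diverted response rather than a parsing bug.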

How This Differs From Other Projects

| Capability | Invariant MCP-Scan (2K stars) | Cisco MCP Scanner (865 stars) | Snyk Agent Scan (2K stars) | NVIDIA Garak (7K stars) | This framework |
|---|---|---|---|---|---|
| What it does | Scans installed MCP configs for tool poisoning | YARA + LLM-as-judge for malicious tools | Scans agent configs for MCP/skill security | LLM model vulnerability testing | Active protocol exploitation + decision governance |
| Approach | Static analysis | Static + LLM classification | Config scanning | Model-layer probing | Wire-protocol adversarial testing |
| MCP coverage | Tool descriptions, config files | Tool descriptions, YARA rules | Config files | - | 18 tests: real JSON-RPC 2.0 attacks |
| A2A coverage | - | - | - | - | 13 tests |
| L402/x402 coverage | - | - | - | - | 85 tests |
| Enterprise platforms | - | - | - | - | 25 cloud + 20 enterprise |
| APT simulation | - | - | - | - | GTG-1002 (17 tests) |
| Jailbreak/over-refusal | - | - | - | Yes | 50 tests (25 + 25 FPR) |
| AIUC-1 certification | - | - | - | - | Maps to 19 of 20 testable requirements |
| Research backing | - | Cisco blog | - | Papers | 5 DOIs + 3 NIST submissions |
| MCP server mode | - | - | - | - | Yes - invoke from any AI agent |
| Statistical testing | - | - | - | - | Wilson CIs, multi-trial |
| Total tests | Pattern matching | YARA rules | Config checks | Model probes | 470 active tests |

Use both. Scan with Invariant MCP-Scan or Cisco MCP Scanner for static analysis. Test with this framework for active exploitation. They're complementary layers.
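The "Wilson CIs, multi-trial" entry in the table refers to a standard way of putting an uncertainty range on a pass rate measured over repeated trials. As a self-contained illustration (a sketch of the textbook Wilson score interval, not the harness's own code; the function name `wilson_interval` is hypothetical):

```python
import math


def wilson_interval(
    successes: int, trials: int, z: float = 1.96
) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to an approximately 95% confidence level.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")

    p_hat = successes / trials
    z2 = z * z
    denominator = 1 + z2 / trials
    center = (p_hat + z2 / (2 * trials)) / denominator
    half_width = (
        z
        * math.sqrt(p_hat * (1 - p_hat) / trials + z2 / (4 * trials * trials))
        / denominator
    )
    # Clamp to [0, 1] to absorb floating-point rounding at the extremes.
    return max(0.0, center - half_width), min(1.0, center + half_width)


if __name__ == "__main__":
    low, high = wilson_interval(8, 10)
    print(f"8/10 passed: 95% interval {low:.3f} to {high:.3f}")
```

For 8 passes in 10 trials the interval runs from roughly 0.49 to 0.94, which shows why a single 80% pass rate from a small run says little on its own. Unlike the simpler normal approximation, the Wilson interval stays inside [0, 1] and behaves sensibly when the observed rate is 0% or 100%.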


Research

Five peer-reviewed preprints and three NIST submissions underpin the methodology:

| Publication | Summary | DOI |
|---|---|---|
| Constitutional Self-Governance for Autonomous AI Agents | 12 governance mechanisms, 77 days production data, 56 agents | 10.5281/zenodo.19162104 |
| Detecting Normalization of Deviance in Multi-Agent Systems | First empirical demonstration that automated harnesses detect behavioral drift | 10.5281/zenodo.19195516 |
| Decision Load Index (DLI): A Quantitative Framework for Agent Autonomy Risk | Measuring cognitive burden of AI agent oversight | 10.5281/zenodo.18217577 |
| Normalization of Deviance in Autonomous Agent Systems | Foundational research on behavioral drift patterns | 10.5281/zenodo.15105866 |
| Cognitive Style Governance for Multi-Agent Deployments | Governance mechanisms for managing cognitive style across multi-agent systems | 10.5281/zenodo.15106553 |

Constitutional Governance (WHY layer)

The constitutional-agent package provides the governance gates and hard constraints that complement this test harness: six gates, 12 hard constraints, and an amendment protocol, all enforced in code rather than in YAML policy files. Install it with pip install constitutional-agent.


Documentation

| Resource | Link |
|---|---|
| Expanded Quick Start | docs/QUICKSTART.md |
| Full Test Inventory (470 tests) | docs/TEST-INVENTORY.md |
| AIUC-1 Crosswalk | docs/AIUC1-CROSSWALK.md |
| Advanced Capabilities | docs/ADVANCED.md |
| MCP Server | docs/mcp-server.md |
| CI/CD GitHub Action | docs/github-action.md |
| Payment Attack Taxonomy | docs/PAYMENT-ATTACK-TAXONOMY.md |
| Comparison (detailed) | docs/COMPARISON.md |
| Privacy & Telemetry | docs/PRIVACY.md |

Roadmap

- v3.10 -- Prove It to Auditors ✅ Shipped.
- v4.1 -- Compliance Evidence ✅ Shipped.
- v4.2 -- Incident-Tested ✅ Shipped.
- v4.3 -- Supply Chain + Corpus ✅ Shipped.
- v4.4 -- Accuracy + Infrastructure ✅ Shipped.
- v4.4.2 -- Docs Hardening + Citations ✅ Shipped. 470 tests, 32 modules, SSP harness (8 tests), Decision Behavior Benchmark corpus (52 cases), HIDDEN_INSTRUCTION_PATTERN DRY extraction, dynamic test count, P0 bug fixes.
- v5.0 -- Lock the Category (H2 2026): benchmark corpus, schema standardization, longitudinal registry.

Full details in ROADMAP.md.


Used By

| Who | Use Case |
|---|---|
| FransDevelopment / Open Agent Trust Registry | OATR SDK v1.2.0 test fixtures (X4-021 through X4-030) -- Ed25519 attestation verification |

Using the harness? Open a PR to add yourself, or tag us in your project.


Contributing

See CONTRIBUTING.md for guidelines, SECURITY_POLICY.md for security policy, and CONTRIBUTION_REVIEW_CHECKLIST.md for the PR checklist.

Citation

If you cite this work in research:

Saleme, M. K. (2026). Agent Security Harness — multi-protocol agent security testing framework. ORCID: 0009-0003-6736-1900. https://github.com/msaleme/red-team-blue-team-agent-fabric

Related Zenodo preprints: DLI (10.5281/zenodo.18217577), CSG (10.5281/zenodo.19162104), NoD (10.5281/zenodo.19195516), Beyond Identity Governance (10.5281/zenodo.19343034), Community-Driven Security (10.5281/zenodo.19343108).


License

Apache License 2.0 -- see LICENSE.