Prior art and related work

May 31, 2026

This document anchors when each load-bearing Vaara concept first shipped in a tagged public release, and lists adjacent published work in the same lane. It exists so that a reader comparing Vaara against newer academic or industry proposals can check the published timeline rather than relying on marketing claims.

The chronology is reconstructed from CHANGELOG.md and the git history of this repository. Dates are calendar dates of the tagged PyPI release. Version numbers and dates can be verified against https://pypi.org/project/vaara/#history and the vX.Y.Z tags on https://github.com/vaaraio/vaara/tags.
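One way to check a row of the chronology against PyPI is to read upload timestamps from PyPI's JSON API. The sketch below assumes the public https://pypi.org/pypi/<project>/json endpoint and its releases and upload_time_iso_8601 fields. The helper names first_upload_dates and fetch_release_dates are illustrative and are not part of Vaara.

```python
import json
from datetime import date, datetime
from urllib.request import urlopen


def first_upload_dates(payload: dict) -> dict[str, date]:
    """Map each version to the UTC date of its earliest uploaded file."""
    dates: dict[str, date] = {}
    for version, files in payload.get("releases", {}).items():
        stamps = [
            # PyPI timestamps end in "Z"; normalise for datetime.fromisoformat.
            datetime.fromisoformat(f["upload_time_iso_8601"].replace("Z", "+00:00"))
            for f in files
        ]
        if stamps:  # versions with no remaining files are skipped
            dates[version] = min(stamps).date()
    return dates


def fetch_release_dates(project: str = "vaara") -> dict[str, date]:
    """Fetch release metadata from PyPI (requires network access)."""
    with urlopen(f"https://pypi.org/pypi/{project}/json", timeout=10) as resp:
        return first_upload_dates(json.load(resp))
```

Calling fetch_release_dates() and looking up a version such as "0.1.0" gives a date to compare with the table. The timestamps are in UTC, so a result may differ by one day from a date recorded in a local time zone.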

When each Vaara concept first shipped

| Concept | First shipped | Where to read it |
| --- | --- | --- |
| Interception pipeline for agent tool calls | v0.1.0, 2026-04-10 | src/vaara/pipeline.py, CHANGELOG.md v0.1.0 |
| Adaptive risk scoring with conformal interval on every score | v0.1.0, 2026-04-10 | docs/formal_specification.md, docs/conformal-prediction.md |
| Hash-chained audit trail | v0.1.0, 2026-04-10 | src/vaara/audit/, COMPLIANCE.md |
| Framework integrations (LangChain, CrewAI, OpenAI Agents) and MCP server surface | v0.3.0, 2026-04-18 | src/vaara/integrations/ |
| Signed audit-trail export and verification CLI | v0.4.1, 2026-04-20 | src/vaara/audit/, docs/vaara-audit-cli.md |
| Sigstore-signed release workflow with PyPI trusted publishing and PEP 740 attestations | v0.4.3, 2026-04-21 | .github/workflows/release.yml, docs/signing-keys.md |
| Opt-in XGBoost adversarial classifier with by-seed held-out benchmarks | v0.5.0, 2026-04-23 | src/vaara/adversarial_classifier.py, bench/ |
| Callable kernel HTTP surface (Vaara as the schema, not the plug-in) | v0.10.0, 2026-05-16 | docs/openapi.yaml, src/vaara/integrations/http.py |
| Article 12 commit-prove receipt pair | v0.10.0, 2026-05-16 | src/vaara/audit/, CHANGELOG.md v0.10.0 |
| vaara-bench-v1 reproducible benchmark harness | v0.12.0, 2026-05-16 | bench/ |
| Hot policy reload without pipeline restart | v0.13.0, 2026-05-17 | src/vaara/policy/, CHANGELOG.md v0.13.0 |
| Static HTML article-coverage dashboard | v0.13.0, 2026-05-17 | src/vaara/compliance/dashboard.py |
| Pluggable signer with optional ML-DSA-65 (FIPS 204) post-quantum scheme | v0.14.0, 2026-05-17 | src/vaara/audit/signer.py |
| External-scorer composition over the same HTTP interface | v0.14.0, 2026-05-17 | src/vaara/policy/composition.py |
| TypeScript client (@vaaraio/client) for the HTTP surface | v0.15.0, 2026-05-17 | clients/ts/ |
| PDF auditor evidence export (per-article rollup) | v0.16.0, 2026-05-17 | src/vaara/compliance/render.py |
| OVERT 1.0 reference verifier CLI (vaara overt verify) | v0.17.0, 2026-05-17 | src/vaara/overt/, docs/openapi.yaml |
| Streaming-notification interception inside the audit and OVERT perimeter | v0.25.0, 2026-05-21 | src/vaara/integrations/mcp_proxy.py |
| Per-article verdict drill-down: verdict_inputs, verdict_reasons, contributing_events | v0.26.0, 2026-05-21 | src/vaara/compliance/engine.py, VERDICTS.md |
| SLSA build provenance attestation on every release | v0.26.0, 2026-05-21 | .github/workflows/release.yml |
| Continuous fuzzing of the OVERT decoder, audit from_dict, and policy loader via ClusterFuzzLite | v0.27.0, 2026-05-22 | fuzz/, .clusterfuzzlite/, .github/workflows/cflite_*.yml |
| VERDICTS.md per-article evidence sufficiency reference | v0.28.0, 2026-05-22 | VERDICTS.md |
| docs/conformal-prediction.md plain-language explainer | v0.28.0, 2026-05-22 | docs/conformal-prediction.md |
| This document (PRIOR_ART.md) | v0.29.0, 2026-05-24 | PRIOR_ART.md |
| Cross-model held-out methodology with public 4,176-entry eval fold | v0.36.0, 2026-05-25 | bench/vaara-bench-v0.36.md, tests/adversarial/v036_holdout.json |
| Destination-aware features (dst__*) and v7 production classifier | v0.36.0, 2026-05-25 | src/vaara/adversarial_classifier.py, scripts/train_adversarial_classifier.py |

The CHANGELOG.md entry for each version carries the substantive description and, where relevant, the failure mode that motivated the change.

The following peer-reviewed papers and preprints describe approaches in the same lane as Vaara (runtime evidence, hash-chained or signed audit trails, conformal calibration, behavioural-constraint monitoring, safety cases with runtime updates). They are listed here as related reading, not as competitors. Where a publication post-dates the Vaara release that shipped the same idea, that is a chronological fact rather than a judgment of the work.

Runtime evidence and behavioural monitoring

  • Protocol-Driven Development: Governing Generated Software Through Invariants and Continuous Evidence. arXiv:2605.12981v2, published 2026-05-15. Introduces an "Evidence Chain" of compliance for generated implementations and a "Dynamic Evidence Ledger" for deployed systems, with signed runtime observations appended by verifiers. Conceptually adjacent to Vaara's hash-chained audit trail with article-explicit evidence (shipped v0.1.0, 2026-04-10) and to the per-article verdict_inputs and contributing_events drill-down (shipped v0.26.0, 2026-05-21).
  • Formal Methods Meet LLMs: Auditing, Monitoring, and Intervention for Compliance of Advanced AI Systems. arXiv:2605.16198v1, published 2026-05-15. Proposes runtime monitors using Linear Temporal Logic for product-specific behavioural constraints, with intervening monitors that act at runtime to preempt predicted violations. Adjacent to Vaara's policy-driven runtime decisions and external-scorer composition (shipped v0.14.0, 2026-05-17).
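For readers new to the hash-chained audit trail mentioned above, the general idea can be sketched in a few lines: each record stores the hash of its predecessor, so editing any earlier record invalidates every later link. This is a generic illustration only. The record layout, the GENESIS constant, and the function names are assumptions, not Vaara's on-disk format or the ledger described in either paper.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical placeholder hash preceding the first record


def _digest(prev_hash: str, event: dict) -> str:
    # Canonical JSON so the same event always hashes to the same value.
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain: list[dict], event: dict) -> None:
    """Append an event, linking it to the hash of the previous record."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"event": event, "prev": prev_hash, "hash": _digest(prev_hash, event)})


def verify(chain: list[dict]) -> bool:
    """Recompute every link; any edited or reordered record breaks the chain."""
    prev_hash = GENESIS
    for record in chain:
        if record["prev"] != prev_hash:
            return False
        if record["hash"] != _digest(prev_hash, record["event"]):
            return False
        prev_hash = record["hash"]
    return True
```

A hash chain alone only makes tampering detectable relative to a trusted head hash. Signing the head (or each record) is what binds the chain to an identity, which is the role the signed export and pluggable signer play in the chronology above.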

Safety cases with runtime confidence updates

  • A Subjective Logic-based method for runtime confidence updates in safety arguments. arXiv:2605.22530v1, published 2026-05-21. Describes a method for continuously updating static safety cases using runtime Safety Performance Indicators, propagating confidence through a Subjective Logic assurance case. Adjacent to Vaara's evidence-sufficiency framework shipped in VERDICTS.md (v0.28.0, 2026-05-22) and to the conformal interval that ships with every Vaara risk score (v0.1.0, 2026-04-10).

Calibration and external validation

  • Calibration, Uncertainty Communication, and Deployment Readiness in CKD Risk Prediction: A Framework Evaluation Study. arXiv:2605.21566v1, published 2026-05-20. Trains five classifiers on the UCI CKD dataset (400 patients) and evaluates each across calibration quality, conformal prediction coverage, and an eight-criterion deployment readiness framework. Reports internal AUROC of 1.00 collapsing to 0.48-0.58 on the MIMIC-IV external cohort, with split-conformal coverage falling from 0.80-0.98 internally to 0.21-0.25 externally against a 90% target. The domain is not comparable to Vaara's, but the methodological lesson (internal test performance is a ceiling, and the external gap is visible only against a held-out generator) motivates Vaara's v0.36 cross-model held-out corpus (bench/vaara-bench-v0.36.md).
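The coverage collapse reported above follows from how split-conformal prediction works: the threshold is a quantile of calibration scores, and the coverage guarantee holds only when test data is exchangeable with the calibration data. The sketch below is a minimal, generic illustration with made-up nonconformity scores. It is not Vaara's implementation, and the function names are hypothetical.

```python
import math


def conformal_threshold(cal_scores: list[float], alpha: float) -> float:
    """Split-conformal threshold: the ceil((n + 1) * (1 - alpha))-th smallest score."""
    n = len(cal_scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: only a trivial set is valid.
        return math.inf
    return sorted(cal_scores)[rank - 1]


def empirical_coverage(test_scores: list[float], threshold: float) -> float:
    """Fraction of test points whose nonconformity score falls within the threshold."""
    return sum(s <= threshold for s in test_scores) / len(test_scores)


# Hypothetical scores: calibration and internal test come from the same source.
calibration = [float(i) for i in range(1, 10)]
internal = [float(i) for i in range(1, 11)]
# A shifted external cohort: every score is larger than the model expects.
external = [s + 5 for s in internal]

q = conformal_threshold(calibration, alpha=0.1)
internal_cov = empirical_coverage(internal, q)
external_cov = empirical_coverage(external, q)
```

With these toy numbers the threshold is tuned for a 90% target, and coverage on the shifted cohort falls well below it. Measuring that gap requires data the calibration set never saw, which is the argument for a held-out corpus from a different generator.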

Selective inference on conformal prediction sets

  • Selecting Informative Conformal Prediction Sets with an Optimized FCR-Controlled Approach. arXiv:2605.22004v1, published 2026-05-21. Formalises selective inference on conformal prediction sets with finite-sample false coverage rate guarantees. Methodology pointer for Vaara's planned FPR-bounded three-stage combiner (rules-veto in the uncertain band), scheduled for v0.37+. Not yet implemented in Vaara.

Aviation learning-assurance

  • Mechanistic Interpretability for Learning Assurance of a Vision-Based Landing System. arXiv:2605.20607v1, published 2026-05-20. Applies mechanistic interpretability to an EASA learning-assurance scenario, including out-of-model-scope runtime monitoring against the operational design domain. Vaara does not currently target aviation directly, but EASA learning-assurance is in the same harmonisation surface as the AI Act Article 6(1) / Annex I product-safety route.

National security threat-modelling

  • Backchaining Loss of Control Mitigations from Mission-Specific Benchmarks in National Security. arXiv:2605.21095v1, published 2026-05-20. Methodology for national security deployers to back-chain affordance and permission constraints from use-case-specific benchmarks. Adjacent in motivation to Vaara's policy-driven decisions over agent tool calls, in a deployment context Vaara does not currently target.

Classical foundations

  • Conformal prediction. Vovk, Gammerman, Shafer. Algorithmic Learning in a Random World (Springer, 2005). Vaara implements split-conformal prediction with a distribution-free coverage guarantee, as documented in docs/formal_specification.md and explained in plain language in docs/conformal-prediction.md.
  • Linear Temporal Logic and runtime verification. Pnueli (1977), Bauer, Leucker, Schallhart (2011). Background for the runtime-monitor literature cited above.
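As a concrete picture of LTL runtime verification, the sketch below monitors the property "no deploy event occurs before an approve event", which can be written as (not deploy) W approve. It reports a three-valued verdict in the spirit of the runtime-verification literature cited above. The event names and the ApprovalMonitor class are hypothetical and are not taken from Vaara or from the cited papers.

```python
from enum import Enum


class Verdict(Enum):
    INCONCLUSIVE = "?"    # the observed prefix can still go either way
    SATISFIED = "true"    # every continuation satisfies the property
    VIOLATED = "false"    # every continuation violates the property


class ApprovalMonitor:
    """Monitor for (not deploy) W approve over a stream of event names."""

    def __init__(self) -> None:
        self.verdict = Verdict.INCONCLUSIVE

    def step(self, event: str) -> Verdict:
        # SATISFIED and VIOLATED are absorbing: later events cannot change them.
        if self.verdict is Verdict.INCONCLUSIVE:
            if event == "approve":
                self.verdict = Verdict.SATISFIED
            elif event == "deploy":
                self.verdict = Verdict.VIOLATED
        return self.verdict
```

An intervening monitor of the kind described in the related work would go one step further and block the deploy event while the verdict is still inconclusive, instead of only reporting the violation after the fact.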

What this document is not

This document is not a competitive matrix. It deliberately omits vendor comparisons, feature checklists against named peers, and any "first to ship" claims framed as authority rather than chronology. Inclusion of a paper here means the work is in the same lane and worth reading, not that it is positioned as inferior or superior to Vaara.

For vendor positioning relative to commercial peers, see the discussion in COMPLIANCE.md and the framework integrations under src/vaara/integrations/.

How to keep this current

When a tagged release adds a load-bearing concept (a new audit primitive, a new evidence shape, a new public surface, or a new formal property), add a row to the chronology table above with the version, date, and a path into the codebase or docs. When a relevant paper or peer specification appears in the wider literature, add it to the related-work section with a neutral one-paragraph summary and the publication date. The goal is a paper trail that a reader can verify against PyPI, the git tags, and the cited URLs.