Agent WorkPaper Proof Matrix

June 4, 2026

Use this page before an agent drives Excel, LibreOffice, Google Sheets, or a browser grid. Pick the smallest proof that writes an input, recalculates a dependent formula, reads the value back, and preserves enough state for another process to check the result.

If you only run one command, run the agent MCP evaluator:

npm exec --yes --package @bilig/workpaper@latest -- bilig-evaluate --door agent-mcp --json

Expected invariants:

{
  "schemaVersion": "bilig-evaluator.v1",
  "door": "agent-mcp",
  "verified": true,
  "evidence": {
    "editedCell": "Inputs!B3",
    "dependentCell": "Summary!B3",
    "before": 60000,
    "after": 96000,
    "afterRestore": 96000,
    "afterRestart": 96000,
    "persistedDocumentBytes": 1162,
    "checks": {
      "listedFileBackedTools": true,
      "listedResourcesAndPrompts": true,
      "dependentCellChanged": true,
      "persistedToDisk": true,
      "restartReadbackMatchesAfter": true
    }
  }
}
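
These invariants can be checked mechanically rather than by eye. The TypeScript sketch below assumes the evaluator output has already been parsed and has the shape shown above. The `EvaluatorReport` type and the `checkAgentMcpReport` function are illustrative names, not part of the `@bilig/workpaper` package. It skips `persistedDocumentBytes`, on the assumption that the byte count is not a stable invariant.

```typescript
// Hypothetical shape, inferred only from the sample output above.
interface EvaluatorReport {
  schemaVersion: string;
  door: string;
  verified: boolean;
  evidence: {
    before: number;
    after: number;
    afterRestore: number;
    afterRestart: number;
    checks: Record<string, boolean>;
  };
}

// Returns human-readable failures; an empty list means the report passes.
function checkAgentMcpReport(report: EvaluatorReport): string[] {
  const failures: string[] = [];
  if (report.schemaVersion !== "bilig-evaluator.v1") {
    failures.push(`unexpected schemaVersion: ${report.schemaVersion}`);
  }
  if (report.door !== "agent-mcp") {
    failures.push(`unexpected door: ${report.door}`);
  }
  if (report.verified !== true) {
    failures.push("verified is not true");
  }

  const { before, after, afterRestore, afterRestart, checks } = report.evidence;
  // The write must have changed the dependent formula value.
  if (before === after) failures.push("dependent cell did not change");
  // The value must survive both restore and restart.
  if (afterRestore !== after) failures.push("afterRestore differs from after");
  if (afterRestart !== after) failures.push("afterRestart differs from after");

  // Every named check must be true, whatever the evaluator chooses to list.
  for (const [name, passed] of Object.entries(checks)) {
    if (!passed) failures.push(`check failed: ${name}`);
  }
  return failures;
}
```

A caller could pass `JSON.parse(stdout) as EvaluatorReport` and treat any non-empty result as a failed proof. Iterating over `checks` instead of naming each key keeps the sketch valid if the evaluator adds checks later.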

Proof Matrix

| Proof | Command or asset | Expected JSON field | What it proves | What it does not prove |
| --- | --- | --- | --- | --- |
| WorkPaper service | npm exec --yes --package @bilig/workpaper@latest -- bilig-evaluate --door workpaper-service --json | door: "workpaper-service", verified: true | Node can edit a WorkPaper input, recalculate a formula, export JSON, restore it, and verify readback. | MCP discovery, private workbook compatibility, macros, pivots, charts, or Excel UI behavior. |
| Agent MCP evaluator | npm exec --yes --package @bilig/workpaper@latest -- bilig-evaluate --door agent-mcp --json | door: "agent-mcp", listedResourcesAndPrompts, restartReadbackMatchesAfter | A coding agent or MCP client can discover workbook tools, write a cell, read a formula value, persist state, and restart from disk. | Hosted auth, arbitrary client UX, or full workbook compatibility. |
| Provider-backed formula boundary | npm exec --yes --package @bilig/workpaper@latest -- bilig-evaluate --door agent-mcp --scenario provider-backed --json | scenario: "provider-backed", provider-backed-adapter-missing, adapterBackedDiagnosticsCleared | Provider formulas such as IMPORTRANGE fail closed until the host supplies an adapter, then verify readback. | Live Google Sheets authorization or remote provider availability. |
| Workbook Compatibility Report | npm exec --yes --package @bilig/xlsx-formula-recalc@latest -- bilig-evaluate --door workbook-compatibility --json | door: "workbook-compatibility", riskLevel, unsupportedFunctions, noCompatibilityScore | A saved .xlsx can be inspected for unsupported functions, external links, macros, pivots, volatile functions, stale caches, and risk reasons before an agent trusts it. | Excel compatibility certification, macro execution, pivot refresh, or a defensible compatibility percentage. |
| Agent XLSX risk preflight | pnpm --dir examples/headless-workpaper run agent:mcp-xlsx-risk-preflight | schemaVersion: "bilig-agent-xlsx-risk-preflight.v1", analyze_workbook_risk, afterExpectedArr: 96000 | A local MCP client can inspect real XLSX risk, then edit an imported WorkPaper, read a dependent formula back, persist state, and export the WorkPaper JSON. | Excel compatibility certification, desktop Excel UI behavior, or safe continuation when risk findings require Excel, Graph, LibreOffice, or oracle review. |
| XLSX cache detector | npm exec --yes --package @bilig/xlsx-formula-recalc@latest -- bilig-evaluate --door xlsx-cache --json | door: "xlsx-cache", verified: true | A saved .xlsx can be inspected for stale cached formula values without mutating the file. | In-process business state or JSON WorkPaper persistence. |
| XLSX recalculation | npm exec --package @bilig/xlsx-formula-recalc@latest -- xlsx-recalc --demo --json | recalculationCompleted: true | An XLSX file boundary can be edited, recalculated, exported, and reimported for readback. | A full Excel clone, macro execution, charts, pivots, or desktop layout fidelity. |
| ExcelJS recalculation | npx --package @bilig/exceljs-formula-recalc exceljs-recalc --demo --json | commandSucceeded: true, recalculationCompleted: true, expectedValueMatched: true | An existing ExcelJS workbook can get fresh formula readback after Node edits. | ExcelJS styling/export behavior, desktop Excel parity, or every Excel formula. |
| MCP Inspector | npx -y @modelcontextprotocol/inspector@latest --cli npm exec --yes --package @bilig/workpaper@latest -- bilig-workpaper-mcp --method tools/list | tool names such as read_workpaper_summary, set_workpaper_input_cell | A neutral MCP client can inspect the packaged stdio server before a user adds it to an agent host. | Private workbook persistence unless the file-backed config is used. |
| File-backed MCP server | npm exec --package @bilig/workpaper@latest -- bilig-workpaper-mcp --workpaper ./pricing.workpaper.json --init-demo-workpaper --writable | set_cell_contents_and_readback, export_workpaper_document; analyze_workbook_risk when started with --from-xlsx | A local agent can use a persistent WorkPaper JSON file and inspect imported XLSX risk indicators before trusting the WorkPaper. | Hosted multi-user storage, secret management, or Excel compatibility certification. |
| Vercel AI SDK generateText() | pnpm --dir examples/headless-workpaper run agent:ai-sdk-generate-text | apiShape: "AI SDK generateText -> tool -> execute" | AI SDK tools can return before/after/restore WorkPaper proof from a generateText() loop. | Provider model quality or production prompt behavior. |
| Vercel AI SDK streamText() | pnpm --dir examples/headless-workpaper run agent:ai-sdk-stream-text | apiShape: "AI SDK streamText -> tool -> execute", streamChunkTypes | Streaming tool calls can carry the same WorkPaper proof while the model streams final text. | Browser UI streaming, telemetry retention, or non-deterministic provider output. |
| OpenAI Responses function call | pnpm --dir examples/headless-workpaper run agent:openai-responses | function_call_output, verified: true | OpenAI tool calling can wrap WorkPaper readback as a structured function result. | Hosted remote MCP app review or ChatGPT UI behavior. |
| OpenAI Agents SDK hosted MCP | pnpm --dir examples/headless-workpaper run agent:openai-agents-sdk-hosted-mcp | MCPServerStreamableHttp, set_cell_contents_and_readback | An OpenAI Agents SDK agent can call the hosted Streamable HTTP MCP endpoint. | Private writable workbook state. Use local stdio for that. |
| LangGraph ToolNode | pnpm --dir examples/headless-workpaper run agent:langgraph-toolnode | SpreadsheetAgentProof, restoredMatchesAfter | Graph state can carry the proof object instead of only a scalar formula value. | Whether every graph architecture should store workbook state. |
| Microsoft Agent Framework MCP tools | python examples/microsoft-agent-framework-workpaper-mcp/scripts/check-microsoft-agent-framework-recipe.py | MCPStdioTool, MCPStreamableHTTPTool, set_cell_contents_and_readback | Agent Framework can host the same Bilig MCP boundary for local file-backed or hosted no-key workbook readback proof. | LLM quality, private hosted storage, or exact import stability across every Agent Framework package version. |
| Semantic Kernel MCP plugin | python examples/semantic-kernel-workpaper-mcp/semantic_kernel_workpaper_mcp.py --smoke | verified: true, pluginBoundary | A .NET-oriented agent path can keep WorkPaper behind a plugin or MCP boundary. | .NET package publication or C# parity for every TypeScript example. |
| Mastra tool | pnpm --dir examples/mastra-workpaper-tool run smoke | Mastra createTool -> execute -> WorkPaper readback | The repo-local Mastra example uses a real createTool() wrapper around formula readback. | A new Mastra catalog submission, issue, or PR. Do not duplicate that outreach. |
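
The "Expected JSON field" column names fields but does not say where they sit in each command's output. As a hedged sketch, the TypeScript below searches a parsed JSON value recursively for each expected key and compares it to the expected value. `findField` and `hasExpectedFields` are hypothetical helpers, not Bilig APIs, and the recursive lookup is an assumption made only because the nesting is not documented here.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Depth-first search for the first occurrence of `key` anywhere in `value`.
function findField(value: Json, key: string): Json | undefined {
  if (value === null || typeof value !== "object") return undefined;
  if (Array.isArray(value)) {
    for (const item of value) {
      const hit = findField(item, key);
      if (hit !== undefined) return hit;
    }
    return undefined;
  }
  // A key on the current object wins over the same key nested deeper.
  if (Object.prototype.hasOwnProperty.call(value, key)) return value[key];
  for (const child of Object.values(value)) {
    const hit = findField(child, key);
    if (hit !== undefined) return hit;
  }
  return undefined;
}

// True when every expected key is found with exactly the expected scalar value.
function hasExpectedFields(
  report: Json,
  expected: Record<string, boolean | number | string>,
): boolean {
  return Object.entries(expected).every(
    ([key, want]) => findField(report, key) === want,
  );
}
```

For the ExcelJS row, a caller might check `hasExpectedFields(report, { commandSucceeded: true, recalculationCompleted: true, expectedValueMatched: true })`. Because the search returns the first match, a key that appears at several depths should be checked against its exact path once the real output shape is known.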

Selection Rules

Use agent-mcp first when the caller is an agent, MCP client, tool host, or integration reviewer. It proves discovery, write/readback, resources, prompts, and restart state in one command.

Use the Vercel AI SDK, OpenAI, LangGraph, Microsoft Agent Framework, Semantic Kernel, or Mastra examples only after the generic evaluator passes. Those examples prove host fit, not a stronger workbook runtime.

Use the XLSX and ExcelJS paths when a saved file or ExcelJS object is already the contract. Do not force a WorkPaper model when the job is mostly workbook formatting, image embedding, or file metadata.

Use the Agent XLSX risk preflight when an agent has a real .xlsx file and the next action would otherwise be UI automation. It keeps the workbook local, calls analyze_workbook_risk first, then requires set_cell_contents_and_readback and export_workpaper_document before the agent reports success.
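
The ordering requirement in the preflight rule can be expressed as a small check over a session's tool calls. This TypeScript sketch takes the three tool names from the text above; the transcript format (an ordered list of tool names) and the function name `preflightOrderSatisfied` are assumptions for illustration. It reads "first" strictly, as the first tool call in the session.

```typescript
// Tool names come from this page; the flat list-of-names transcript is an assumption.
const RISK_TOOL = "analyze_workbook_risk";
const REQUIRED_BEFORE_SUCCESS: readonly string[] = [
  "set_cell_contents_and_readback",
  "export_workpaper_document",
];

// True only if risk analysis is the first call and both follow-up tools run after it.
function preflightOrderSatisfied(toolCalls: readonly string[]): boolean {
  // An empty session, or one that starts with any other tool, fails.
  if (toolCalls[0] !== RISK_TOOL) return false;
  const later = toolCalls.slice(1);
  return REQUIRED_BEFORE_SUCCESS.every((name) => later.includes(name));
}
```

An agent host could run this over its recorded calls before letting the agent report success. It only checks ordering and presence; it does not inspect the risk findings themselves.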

If a reviewer asks what a successful agent session looks like, send the agent proof transcripts as a follow-up to this matrix. The transcripts show the prompt, tool call, result, workbook state change, formula readback, JSON export, and restart readback verification for Codex, Claude Code, Copilot, Cursor, and Continue.

Limits

Bilig is not a desktop Excel replacement. Keep Excel, LibreOffice, Microsoft Graph, or a spreadsheet-specific oracle in the loop for macros, pivots, charts, external links, unsupported formulas, locale-specific Excel behavior, or exact manual UI workflows.