🔒 Security Investigation Automation System

April 22, 2026

Comprehensive, automated security investigations powered by Microsoft Sentinel, Defender XDR, Graph API, and threat intelligence, with 25 specialized Agent Skills

📺 Video Walkthrough: See this project in action. Watch on YouTube (starts at the Security Investigator demo). Covers the end-to-end workflow: natural language investigations, MCP server integration, KQL query execution, threat intelligence enrichment, and automated report generation.

An investigation automation framework that combines GitHub Copilot, VS Code Agent Skills, and Model Context Protocol (MCP) servers to enable natural language security investigations. Ask questions like "Investigate this user for the last 7 days" or "Is this IP malicious?" and get comprehensive analysis with KQL queries, threat intelligence correlation, and professional reports.

Quick Start (TL;DR)

# 1. Clone and open in VS Code
git clone https://github.com/SCStelz/security-investigator.git
code security-investigator

# 2. Set up Python environment
python -m venv .venv
.venv\Scripts\Activate.ps1          # Windows
# source .venv/bin/activate          # macOS/Linux
pip install --require-hashes -r requirements.lock   # Hash-verified (recommended)
# pip install -r requirements.txt                   # Without hash verification

# 3. Configure environment
copy config.json.template config.json
# Edit config.json: add your Sentinel workspace ID, tenant ID
copy .env.template .env
# Edit .env: add your API tokens (ipinfo, AbuseIPDB, vpnapi, Shodan)

# 4. Configure MCP servers
copy .vscode\mcp.json.template .vscode\mcp.json
# All platform servers are pre-configured; only a GitHub PAT is needed on first use

# 5. Open Copilot Chat (Ctrl+Shift+I) in Agent mode and start with:
#    "Run a threat pulse scan"

🚀 Recommended first run: The Threat Pulse skill is the best starting point. It runs a broad-spectrum scan across 9 security domains (incidents, identity, endpoint, exposure, email, UEBA, auth spray, privileged ops, CVEs) and produces prioritized findings with color-coded verdicts (🔴 Escalate / 🟠 Investigate / 🟡 Monitor / ✅ Clear). Each finding includes a drill-down recommendation pointing to a specialized skill, so after the scan you'll know exactly where to focus and which follow-up command to run.

Other example prompts:

"Investigate user@domain.com for the last 7 days"    → user-investigation
"Analyze incident 12345"                              → incident-investigation
"Is this IP malicious? 203.0.113.42"                  → ioc-investigation
"What skills do you have access to?"                  → lists all 25 skills

For detailed workflows and KQL queries:

  • .github/copilot-instructions.md (universal patterns, skill detection)
  • .github/skills/ (25 specialized investigation workflows)
  • queries/ (verified KQL query library)


Architecture Overview

┌────────────────────────────────────────────────────────────────────┐
│                     GitHub Copilot (VS Code)                       │
├────────────────────────────────────────────────────────────────────┤
│                  .github/copilot-instructions.md                   │
│            (Skill detection, universal patterns, routing)          │
├────────────────────────────────────────────────────────────────────┤
│                     .github/skills/*.md                            │
│       (25 specialized workflows with KQL, risk assessment)         │
├────────────────────────────────────────────────────────────────────┤
│                     MCP Servers (Platform)                         │
│  ┌─────────────┐  ┌──────────────┐  ┌───────────────────────────┐  │
│  │ Sentinel    │  │ Graph API    │  │ Sentinel Triage (XDR)     │  │
│  │ Data Lake   │  │ (Identity)   │  │ (Advanced Hunting)        │  │
│  └─────────────┘  └──────────────┘  └───────────────────────────┘  │
│  ┌─────────────┐  ┌──────────────┐  ┌───────────────────────────┐  │
│  │ KQL Search  │  │ Microsoft    │  │ Azure MCP Server          │  │
│  │ (Schema)    │  │ Learn (Docs) │  │ (ARM + Monitor)           │  │
│  └─────────────┘  └──────────────┘  └───────────────────────────┘  │
│  ┌─────────────┐                                                   │
│  │ Sentinel    │                                                   │
│  │ Graph (Rel) │                                                   │
│  └─────────────┘                                                   │
├────────────────────────────────────────────────────────────────────┤
│               MCP Apps (Local Custom Servers)                      │
│  ┌─────────────┐  ┌──────────────┐  ┌───────────────────────────┐  │
│  │ Geomap      │  │ Heatmap      │  │ Incident Comment          │  │
│  │ (Attack Map)│  │ (Patterns)   │  │ (Sentinel Integration)    │  │
│  └─────────────┘  └──────────────┘  └───────────────────────────┘  │
├────────────────────────────────────────────────────────────────────┤
│                      Python Utilities                              │
│ generate_report_from_json.py  │  enrich_ips.py  │  report_generator│
└────────────────────────────────────────────────────────────────────┘

Key Components:

  • 25 Agent Skills: Modular investigation workflows for incidents, users, devices, IoCs, authentication, scope drift (SPN/User/Device), MCP monitoring, exposure management, AI agent posture, app registration posture, identity posture, data security analysis, email threat posture, MITRE ATT&CK coverage, ingestion analysis, detection authoring, threat pulse scanning, SVG dashboards, and more
  • 7 MCP Server Integrations: Sentinel Data Lake, Graph API, Defender XDR Triage, KQL Search, Microsoft Learn, Azure MCP Server, Sentinel Graph (private preview)
  • 3 Local MCP Apps: Interactive heatmaps, geographic attack maps, incident commenting
  • Python Utilities: HTML report generation with IP enrichment (geolocation, VPN detection, abuse scores, Shodan port/service/CVE intelligence)

🤖 Agent Skills

This system uses VS Code Agent Skills to provide modular, domain-specific investigation workflows. Skills are automatically detected based on keywords in your prompts.

Available Skills (25)

| Category | Skill | Description | Trigger Keywords |
|---|---|---|---|
| ⚡ Quick Scan | threat-pulse | Rapid broad-spectrum security scan across 7 domains: active incidents, identity (human + NHI), endpoint, email threats, admin & cloud ops, exposure. Prioritized Threat Pulse findings with color-coded verdicts and drill-down recommendations | "threat pulse", "quick scan", "security pulse", "morning hunt", "what can you do", "where do I start", "what's going on" |
| 🔍 Core Investigation | computer-investigation | Device security analysis for Entra Joined, Hybrid Joined, and Entra Registered devices: Defender alerts, compliance, logged-on users, vulnerabilities, process/network/file events | "investigate computer", "investigate device", "investigate endpoint", "check machine", hostname |
| 🔍 Core Investigation | honeypot-investigation | Honeypot security analysis: attack patterns, threat intel, vulnerabilities, executive reports | "honeypot", "attack analysis", "threat actor" |
| 🔍 Core Investigation | incident-investigation | Comprehensive incident analysis for Defender XDR and Sentinel incidents: criticality assessment, entity extraction, filtering, recursive entity investigation | "investigate incident", "incident ID", "analyze incident", "triage incident", incident number |
| 🔍 Core Investigation | ioc-investigation | Indicator of Compromise analysis: IP addresses, domains, URLs, file hashes. Includes Defender Threat Intelligence, Sentinel TI tables, CVE correlation, organizational exposure | "investigate IP", "investigate domain", "investigate URL", "investigate hash", "IoC", "is this malicious" |
| 🔍 Core Investigation | user-investigation | Entra ID user security analysis: sign-ins, anomalies, MFA, devices, audit logs, incidents, Identity Protection, HTML reports | "investigate user", "security investigation", "check user activity", UPN/email |
| 🔐 Auth & Access | authentication-tracing | Entra ID authentication chain forensics: SessionId analysis, token reuse vs interactive MFA, geographic anomalies | "trace authentication", "SessionId analysis", "token reuse", "geographic anomaly" |
| 🔐 Auth & Access | ca-policy-investigation | Conditional Access policy forensics: sign-in failure correlation, policy state changes, security bypass detection | "Conditional Access", "CA policy", "device compliance", "policy bypass" |
| 📈 Behavioral Analysis | scope-drift-detection/device | Device process drift: configurable-window baseline, 5-dimension Drift Score (Volume/Processes/Accounts/Chains/Signing), fleet-wide or single-device, Heartbeat uptime corroboration | "device drift", "endpoint drift", "process baseline", "device behavioral change" |
| 📈 Behavioral Analysis | scope-drift-detection/spn | SPN scope drift: 90-day baseline vs 7-day comparison, 5-dimension Drift Score, correlated with AuditLogs, SecurityAlert, DeviceNetworkEvents | "scope drift", "service principal drift", "SPN behavioral change", "SPN drift" |
| 📈 Behavioral Analysis | scope-drift-detection/user | User scope drift: 90-day baseline vs 7-day comparison, dual Drift Scores (7-dim interactive + 6-dim non-interactive), correlated with AuditLogs, SecurityAlert, Identity Protection, CloudAppEvents, EmailEvents | "user drift", "user scope drift", "user behavioral change", "UPN drift" |
| 🛡️ Posture & Exposure | exposure-investigation | Vulnerability & Exposure Management reporting: CVE assessment with exploit/CVSS data, security configuration compliance, end-of-support software, ExposureGraph critical assets, attack paths, Defender health, certificate status | "vulnerability report", "exposure report", "CVE assessment", "security posture", "TVM" |
| 🛡️ Posture & Exposure | ai-agent-posture | AI agent security posture audit for Copilot Studio and M365 Copilot agents: agent inventory, authentication gaps, access control misconfigurations, MCP tool proliferation, knowledge source exposure, XPIA risk, credential detection, Agent Security Score | "AI agent posture", "agent security audit", "Copilot Studio agents", "agent inventory", "unauthenticated agents", "agent sprawl" |
| 🛡️ Posture & Exposure | app-registration-posture | App registration and service principal security posture: Graph API permission inventory (dangerous grants, permission concentration), app ownership risk, credential hygiene (stale secrets, multi-credential apps), cross-tenant SPN exposure, KQL attack chain detection (AuditLogs, AADServicePrincipalSignInLogs, MicrosoftGraphActivityLogs), App Permission Risk Score with 5 dimensions | "app registration posture", "app registration abuse", "service principal permissions", "dangerous app permissions", "app ownership", "overprivileged apps" |
| 🛡️ Posture & Exposure | email-threat-posture | Email threat protection posture report for Microsoft Defender for Office 365: inbound mail flow overview, threat composition (phishing/spam/malware), email authentication (DMARC/DKIM/SPF/CompAuth), ZAP post-delivery remediation, Safe Links click protection, attachment analysis, detection method breakdown, MDO security incidents, Email Protection Score with 5 dimensions. Inline chat, markdown file, and SVG dashboard output | "email threat report", "email security posture", "phishing report", "MDO report", "Defender for Office 365 report", "ZAP effectiveness", "Safe Links report", "DMARC report" |
| 🛡️ Posture & Exposure | identity-posture | Identity security posture report using IdentityAccountInfo (MDI/Advanced Hunting): multi-provider account inventory (Entra ID, AD, Okta, SailPoint, CyberArk, Ping), privileged account audit with role distribution, stale/disabled/deleted account hygiene, password posture, risk distribution, multi-provider identity linking, MDI tag analysis, Identity Posture Score with 5 dimensions. Inline chat and markdown file output | "identity posture", "identity security report", "account hygiene", "stale accounts", "privileged accounts", "password posture", "identity providers", "honeytoken" |
| 🔒 Data Security | data-security-analysis | DataSecurityEvents (Purview/IRM) analysis: SIT access breakdowns, user risk ranking, file inventory, DLP policy correlation, Copilot SIT exposure, SIT GUID-to-name resolution, anomaly detection. Designed for 100k+ user environments | "data security", "sensitive information type", "SIT access", "DLP events", "DataSecurityEvents", "EDM access", "insider risk activity", "Purview data security" |
| 📊 Visualization | geomap-visualization | Interactive world map visualization for Sentinel data: attack origin maps, geographic threat distribution, IP geolocation with enrichment drill-down | "geomap", "world map", "geographic", "attack map", "attack origins" |
| 📊 Visualization | heatmap-visualization | Interactive heatmap visualization for Sentinel data: attack patterns by time, activity grids, IP vs hour matrices, threat intel drill-down | "heatmap", "show heatmap", "visualize patterns", "activity grid" |
| 📊 Visualization | svg-dashboard | SVG data visualization dashboards: dual-mode renderer supporting manifest-driven structured dashboards (from skill reports) and freeform adaptive visualizations from ad-hoc investigation data. 14-widget component library | "generate SVG dashboard", "create a visual dashboard", "visualize this report", "SVG from this data" |
| 🔧 Tooling & Monitoring | detection-authoring | Create, deploy, update, and manage Defender XDR custom detection rules via Graph API. Query adaptation from Sentinel KQL, manifest-driven batch deployment via PowerShell, lifecycle management | "create custom detection", "deploy detection", "detection rule", "custom detection", "deploy rule", "batch deploy" |
| 🔧 Tooling & Monitoring | kql-query-authoring | KQL query creation using schema validation, community examples, Microsoft Learn | "write KQL", "create KQL query", "help with KQL", "query [table]" |
| 🔧 Tooling & Monitoring | mcp-usage-monitoring | MCP server usage monitoring and audit: Graph MCP endpoint analysis, Sentinel MCP auth events, Azure MCP ARM operations, workspace query governance, MCP Usage Score with 5 health/risk dimensions | "MCP usage", "MCP server monitoring", "MCP activity", "MCP audit", "Graph MCP", "Sentinel MCP", "Azure MCP" |
| 🔧 Tooling & Monitoring | sentinel-ingestion-report | Sentinel workspace ingestion & cost analysis: table-level volume breakdown, tier classification (Analytics/Basic/Data Lake), SecurityEvent/Syslog/CommonSecurityLog deep dives, ingestion anomaly detection, analytic rule inventory via REST API, custom detection inventory via Graph API, rule health via SentinelHealth, data lake tier migration candidates, license benefit analysis (DfS P2, M365 E5) | "ingestion report", "usage report", "data volume", "cost analysis", "table breakdown", "data lake tier", "ingestion anomaly", "cost optimization" |
| 🔧 Tooling & Monitoring | mitre-coverage-report | MITRE ATT&CK coverage analysis: YAML-driven PowerShell pipeline gathers analytic rule MITRE tags, custom detection techniques, SOC Optimization recommendations, alert/incident operational data. Tactic-level coverage matrix, technique-level drill-down with rule mapping, coverage gap identification, SOC Optimization threat scenario alignment, untagged rule remediation, MITRE Coverage Score (5 weighted dimensions). Inline chat and markdown file output | "MITRE coverage", "ATT&CK coverage", "MITRE report", "tactic coverage", "technique coverage", "coverage gaps", "MITRE score", "detection coverage report", "MITRE matrix" |

How Skills Work

  1. You ask Copilot a question (e.g., "Investigate user@domain.com for the last 7 days")
  2. Copilot detects keywords and loads the appropriate skill from .github/skills/<skill-name>/SKILL.md
  3. The skill provides specialized workflow, KQL queries, and risk assessment criteria
  4. Universal patterns from .github/copilot-instructions.md are inherited automatically
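
The keyword-to-skill routing in steps 1 and 2 can be pictured with a small sketch. It is illustrative only: the actual detection is done by Copilot from the instructions file, not by project code, and the `SKILL_TRIGGERS` table and the `detect_skill` and `skill_path` helpers are hypothetical names. The trigger phrases are a small subset copied from the skills table.

```python
from pathlib import Path
from typing import Optional

# Hypothetical trigger table: a few phrases taken from the skills table.
SKILL_TRIGGERS = {
    "user-investigation": ["investigate user", "check user activity"],
    "incident-investigation": ["investigate incident", "analyze incident"],
    "ioc-investigation": ["investigate ip", "investigate domain"],
    "threat-pulse": ["threat pulse", "quick scan"],
}


def detect_skill(prompt: str) -> Optional[str]:
    """Return the first skill whose trigger phrase occurs in the prompt, else None."""
    text = prompt.lower()
    for skill, triggers in SKILL_TRIGGERS.items():
        if any(trigger in text for trigger in triggers):
            return skill
    return None


def skill_path(skill: str) -> Path:
    """Build the SKILL.md location described in step 2."""
    return Path(".github/skills") / skill / "SKILL.md"


if __name__ == "__main__":
    name = detect_skill("Investigate user@domain.com for the last 7 days")
    print(name, skill_path(name) if name else "no skill matched")
```

Plain substring matching is used here to keep the idea visible; a language model can also match paraphrases that no fixed phrase list would catch.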

Triggering Skills with Natural Language

You don't need to mention the skill name: keywords are detected automatically.

| What you say | Skill triggered |
|---|---|
| "Investigate user@domain.com for the last 7 days" | user-investigation |
| "Analyze incident 12345" | incident-investigation |
| "Is this IP malicious? 203.0.113.42" | ioc-investigation |
| "Check the device WORKSTATION-01 for threats" | computer-investigation |
| "Show attack patterns on a heatmap" | heatmap-visualization |
| "Generate an SVG dashboard from the report" | svg-dashboard |
| "Map the geographic origins of these attacks" | geomap-visualization |
| "Write a KQL query to find failed sign-ins" | kql-query-authoring |
| "Trace this authentication back to the original MFA" | authentication-tracing |
| "Detect scope drift in service principals" | scope-drift-detection/spn |
| "Check user behavioral drift for user@domain.com" | scope-drift-detection/user |
| "Analyze device process drift across the fleet" | scope-drift-detection/device |
| "Show me MCP server usage for the last 30 days" | mcp-usage-monitoring |
| "Generate a Sentinel ingestion report" | sentinel-ingestion-report |
| "Create custom detections for Event ID 4799" | detection-authoring |
| "Audit AI agent security posture" | ai-agent-posture |
| "Who accessed files with credit card numbers?" | data-security-analysis |
| "Generate an email threat protection report" | email-threat-posture |
| "Run an identity posture report" | identity-posture |
| "Generate a MITRE ATT&CK coverage report" | mitre-coverage-report |
| "Run a threat pulse scan" | threat-pulse |
| "Audit our app registration security posture" | app-registration-posture |

Follow-ups and Chaining

After running an investigation, ask follow-up questions without re-running the entire workflow:

Is that IP a VPN?
Trace authentication for that suspicious location
Was MFA used for those sign-ins?

Skills can be chained for comprehensive analysis:

1. "Investigate incident 12345" → incident-investigation extracts entities
2. "Now investigate the user from that incident" → user-investigation runs on extracted UPN
3. "Check if that IP is malicious" → ioc-investigation analyzes the suspicious IP
4. "Show me a heatmap of the attack patterns" → heatmap-visualization

Copilot uses existing investigation data from temp/investigation_*.json when available.
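
As a rough illustration of reusing earlier results, the sketch below picks the most recently modified file matching temp/investigation_*.json and loads it. The `latest_investigation` helper is hypothetical and is not part of the project's scripts; it also assumes each file holds a single JSON document, which the source does not spell out.

```python
import json
from pathlib import Path
from typing import Any, Optional


def latest_investigation(temp_dir: str = "temp") -> Optional[Any]:
    """Return the parsed newest investigation_*.json in temp_dir, or None if there is none."""
    files = sorted(
        Path(temp_dir).glob("investigation_*.json"),
        key=lambda path: path.stat().st_mtime,  # oldest first, newest last
    )
    if not files:
        return None
    with files[-1].open(encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    data = latest_investigation()
    print("no cached investigation" if data is None else "loaded cached investigation")
```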

Discovering Skills

What investigation skills do you have access to?
Explain the high-level workflow of the user-investigation skill
What data sources does the ioc-investigation skill use?

📖 Reference: GitHub Agent Skills Documentation

Authoring New Skills & Queries from Investigations

Ad-hoc investigations naturally evolve into reusable assets. After completing an investigation, ask Copilot to package the verified queries, schema pitfalls, and analytical logic into a new SKILL.md or query file.

"Based on the investigation we just completed, create a new reusable skill"
"Read this threat intel article: <URL>. Extract TTPs and IOCs, then write, test, and tune a queries file for reusable threat hunts"

πŸ“ Project Structure

security-investigator/
├── enrich_ips.py                # Standalone IP enrichment utility
├── config.json                  # Configuration (workspace IDs, mappings)
├── config.json.template         # Config template (committed to Git)
├── .env                         # API tokens (gitignored, auto-loaded by python-dotenv)
├── .env.template                # Token template (committed to Git)
├── requirements.txt             # Python dependencies
├── requirements.lock            # Hash-verified dependency lockfile
├── .vscode/
│   └── mcp.json.template       # MCP server config template (copy to mcp.json)
├── .github/
│   ├── copilot-instructions.md  # Skill detection, universal patterns, routing
│   ├── manifests/               # Auto-generated discovery indexes
│   │   ├── discovery-manifest.yaml  # Query file + skill index (domains, MITRE, prompts)
│   │   └── build_manifest.py        # Manifest generator script
│   └── skills/                  # 25 Agent Skills (modular investigation workflows)
│       ├── ai-agent-posture/
│       ├── app-registration-posture/
│       ├── authentication-tracing/
│       ├── ca-policy-investigation/
│       ├── computer-investigation/
│       ├── data-security-analysis/
│       ├── detection-authoring/
│       ├── email-threat-posture/
│       ├── exposure-investigation/
│       ├── geomap-visualization/
│       ├── heatmap-visualization/
│       ├── honeypot-investigation/
│       ├── identity-posture/
│       ├── incident-investigation/
│       ├── ioc-investigation/
│       ├── kql-query-authoring/
│       ├── mcp-usage-monitoring/
│       ├── mitre-coverage-report/
│       ├── scope-drift-detection/
│       │   ├── spn/              # Service principal drift (5 dimensions)
│       │   ├── user/             # User account drift (7+6 dimensions)
│       │   └── device/           # Device process drift (5 dimensions)
│       ├── sentinel-ingestion-report/
│       ├── svg-dashboard/
│       ├── threat-pulse/
│       └── user-investigation/
├── queries/                     # Verified KQL query library (grep-searchable, by data domain)
│   ├── cloud/                  # Cloud app & exposure management queries
│   ├── email/                  # Defender for Office 365 email queries
│   ├── endpoint/               # Defender for Endpoint device queries
│   ├── identity/               # Entra ID / Azure AD identity queries
│   ├── incidents/              # SecurityIncident & SecurityAlert queries
│   └── network/                # Network telemetry queries
├── scripts/                     # Python utilities
│   ├── generate_report_from_json.py  # Report generator (main entry point)
│   ├── report_generator.py           # HTML report builder class
│   ├── investigator.py               # Data models and core types
│   ├── cleanup_old_investigations.py  # Automated cleanup (3+ days old)
│   └── generate_tocs.py              # Auto-generate query file TOCs
├── mcp-apps/                    # Local MCP servers (visualization, automation)
│   ├── sentinel-geomap-server/
│   ├── sentinel-heatmap-server/
│   └── sentinel-incident-comment/
├── docs/                        # Setup guides and reference documentation
├── authoring/                   # Blog drafts, writing guides, and marketing content
├── reports/                     # Generated investigation reports (organized by type)
│   ├── ai-agent-posture/       # AI agent security posture reports
│   ├── app-registration-posture/ # App registration posture reports
│   ├── computer-investigations/ # Device security investigation reports
│   ├── data-security/          # Data security SIT analysis reports
│   ├── email-threat-posture/   # Email threat protection posture reports
│   ├── exposure/               # Exposure management reports
│   ├── honeypot/               # Honeypot executive reports
│   ├── identity-posture/       # Identity security posture reports
│   ├── mcp-usage/              # MCP usage monitoring reports
│   ├── scope-drift/            # Scope drift analysis reports
│   ├── sentinel/               # Sentinel ingestion & cost analysis reports
│   ├── threat-pulse/           # Threat Pulse scan reports
│   └── user-investigations/    # HTML user investigation reports
├── temp/                        # Investigation JSON files (auto-cleaned after 3 days)
└── archive/                     # Legacy code and design docs

Query Library (queries/)

The queries/ folder contains verified, battle-tested KQL query collections organized by detection scenario. These are the Priority 2 lookup source in the KQL Pre-Flight Checklist: Copilot searches them before writing any ad-hoc KQL.

Each file uses a standardized metadata header for efficient grep_search discovery:

# <Title>
**Tables:** <exact KQL table names>
**Keywords:** <searchable terms: attack techniques, scenarios, field names>
**MITRE:** <ATT&CK technique IDs, e.g., T1021.001, TA0008>
**Domains:** <domain tags for manifest indexing, e.g., identity, endpoint, email>
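
To make the header format concrete, here is a minimal sketch of reading those fields from a query file's text. It is not the project's `build_manifest.py`; `parse_query_header` is a hypothetical helper, and it assumes each field holds a comma-separated list on a single line.

```python
import re
from typing import Dict

# Matches lines such as "**MITRE:** T1021.001, TA0008".
HEADER_FIELD = re.compile(r"^\*\*(Tables|Keywords|MITRE|Domains):\*\*\s*(.*)$")


def parse_query_header(text: str) -> Dict[str, object]:
    """Extract the title and the comma-separated metadata fields from a query file."""
    meta: Dict[str, object] = {"title": None}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if meta["title"] is None and line.startswith("# "):
            meta["title"] = line[2:].strip()
            continue
        match = HEADER_FIELD.match(line)
        if match:
            key, value = match.groups()
            meta[key.lower()] = [item.strip() for item in value.split(",") if item.strip()]
    return meta


if __name__ == "__main__":
    # Made-up sample header, for illustration only.
    sample = (
        "# RDP Lateral Movement\n"
        "**Tables:** DeviceLogonEvents, SecurityEvent\n"
        "**Keywords:** rdp, lateral movement\n"
        "**MITRE:** T1021.001, TA0008\n"
        "**Domains:** endpoint, identity\n"
    )
    print(parse_query_header(sample))
```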

Discovery Manifest (.github/manifests/)

The discovery manifest provides a machine-readable index of all query files and skills, enabling deterministic cross-referencing by domain and MITRE technique. The Threat Pulse skill loads this manifest to match findings to downstream query files and drill-down skills automatically.

  • discovery-manifest.yaml: Compact index (~500 lines) with title, path, domains, mitre, and prompt fields for each query file and skill
  • build_manifest.py: Generator script that scans queries/ metadata headers and skill YAML frontmatter to produce the manifest

How it works:

  1. Query files declare **Domains:** tags in their metadata header (valid tags: incidents, identity, spn, endpoint, email, admin, cloud, exposure)
  2. Skills declare threat_pulse_domains: and drill_down_prompt: in their YAML frontmatter
  3. python .github/manifests/build_manifest.py scans both and emits the manifest
  4. The Threat Pulse skill reads the manifest to match non-✅ findings to relevant query files and skills by domain tag and MITRE technique overlap

Regenerate after creating or renaming query files/skills, or changing Domains:/threat_pulse_domains: values:

python .github/manifests/build_manifest.py
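
The matching described in step 4 (domain tag and MITRE technique overlap) might look roughly like the sketch below. The entry shape, the `match_entries` name, and the ranking rule (MITRE overlap first, then domain overlap) are assumptions for illustration; the source states only that matching uses both kinds of overlap.

```python
from typing import Dict, List


def match_entries(finding: Dict, manifest: List[Dict]) -> List[Dict]:
    """Return manifest entries sharing a domain tag or MITRE technique with the finding."""
    finding_domains = set(finding.get("domains", []))
    finding_mitre = set(finding.get("mitre", []))
    ranked = []
    for entry in manifest:
        domain_hits = finding_domains & set(entry.get("domains", []))
        mitre_hits = finding_mitre & set(entry.get("mitre", []))
        if domain_hits or mitre_hits:
            ranked.append((len(mitre_hits), len(domain_hits), entry))
    # Assumed ranking: more MITRE overlap first, then more domain overlap.
    ranked.sort(key=lambda item: (item[0], item[1]), reverse=True)
    return [entry for _, _, entry in ranked]


if __name__ == "__main__":
    # Made-up manifest entries and finding, for illustration only.
    manifest = [
        {"path": "queries/identity/a.md", "domains": ["identity"], "mitre": ["T1110"]},
        {"path": "queries/endpoint/b.md", "domains": ["endpoint"], "mitre": ["T1021.001"]},
        {"path": "queries/identity/c.md", "domains": ["identity"], "mitre": []},
    ]
    finding = {"domains": ["identity"], "mitre": ["T1110"]}
    for entry in match_entries(finding, manifest):
        print(entry["path"])
```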

🚀 Setup

Prerequisites

| Requirement | Details |
|---|---|
| VS Code | Version 1.99+ recommended (Agent mode + MCP support). |
| GitHub Copilot | Active subscription: Copilot Pro+, Business, or Enterprise. Agent mode must be enabled. |
| Python 3.8+ | For IP enrichment utility and report generation. |
| Azure CLI | Required for Azure MCP Server (underlying auth) and sentinel-ingestion-report skill (az monitor log-analytics query for all KQL queries, az rest for analytic rule inventory, az monitor log-analytics workspace table list for tier classification). Authenticate: az login --tenant <tenant_id>, then az account set --subscription <subscription_id>. Requires Log Analytics Reader (KQL queries + table list) and Microsoft Sentinel Reader (analytic rule inventory) on the workspace. |
| log-analytics CLI extension | Required by the sentinel-ingestion-report skill for az monitor log-analytics query (all KQL queries in Phases 1-5). Install: az extension add --name log-analytics. Verify: az extension list --query "[?name=='log-analytics']". |
| PowerShell 7.0+ | Required for sentinel-ingestion-report skill (parallel query execution via ForEach-Object -Parallel). Verify: $PSVersionTable.PSVersion. |
| Node.js 18+ | Required for KQL Search MCP (npx) and building local MCP Apps. Install via winget install OpenJS.NodeJS.LTS (Windows) or brew install node (macOS). |
| Microsoft Sentinel | Log Analytics workspace with data. You'll need the workspace GUID and tenant ID. |
| Entra ID Permissions | If you can query Sentinel in the Azure Portal, you likely have sufficient access. The Graph MCP server requires a one-time tenant provisioning by an admin. See MCP Server Setup for detailed per-server requirements. |
| Microsoft.Graph PowerShell | Required for detection-authoring skill (CustomDetection.ReadWrite.All: create/update/delete custom detection rules via Graph API). Also used by sentinel-ingestion-report skill for rule inventory (CustomDetection.Read.All: read-only, degrades gracefully if not installed). Install: Install-Module Microsoft.Graph.Authentication -Scope CurrentUser. |
| GitHub PAT | public_repo scope. Used by KQL Search MCP. |

1. Install Dependencies

Verify prerequisites:

python --version   # Requires 3.8+
node --version     # Requires 18+ (needed for KQL Search MCP)
az --version       # Azure CLI (needed for Azure MCP Server, ingestion report skill)
pwsh --version     # Requires 7.0+ (needed for sentinel-ingestion-report skill)
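
For anyone who prefers to script these checks, the sketch below compares a tool's `--version` output against the minimums listed above. The helper names (`parse_version`, `meets_minimum`, `tool_version`) are hypothetical and not part of the project. It assumes the first dotted number in the output is the version, which holds for the common output formats of these tools but is not guaranteed.

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple


def parse_version(text: str) -> Optional[Tuple[int, ...]]:
    """Extract the first dotted version number (e.g. 3.11.4) from a version string."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(output: str, minimum: Tuple[int, ...]) -> bool:
    """True if the version found in output is at least the given minimum."""
    version = parse_version(output)
    return version is not None and version >= minimum


def tool_version(command: str) -> Optional[str]:
    """Run `<command> --version`; return the first output line, or None if not on PATH."""
    executable = shutil.which(command)
    if executable is None:
        return None
    result = subprocess.run([executable, "--version"], capture_output=True, text=True)
    lines = (result.stdout or result.stderr).strip().splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    for name, minimum in [("python", (3, 8)), ("node", (18,)), ("pwsh", (7, 0))]:
        line = tool_version(name)
        state = "missing" if line is None else ("ok" if meets_minimum(line, minimum) else "too old")
        print(f"{name}: {state}")
```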

  • If Node.js is missing: download the installer, or run winget install OpenJS.NodeJS.LTS (Windows) / brew install node (macOS).
  • If Azure CLI is missing: install it, then run az login --tenant <tenant_id> and az account set --subscription <subscription_id>.
  • If the log-analytics extension is missing: run az extension add --name log-analytics (required for the sentinel-ingestion-report skill).

Set up Python environment:

python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt

2. Configure Environment

Copy config.json.template to config.json and fill in your workspace details:

{
  "sentinel_workspace_id": "YOUR_WORKSPACE_ID_HERE",
  "tenant_id": "YOUR_TENANT_ID_HERE",
  "subscription_id": "YOUR_SUBSCRIPTION_ID_HERE",
  "azure_mcp": {
    "resource_group": "YOUR_LOG_ANALYTICS_RESOURCE_GROUP",
    "workspace_name": "YOUR_LOG_ANALYTICS_WORKSPACE_NAME",
    "tenant": "YOUR_TENANT_ID_HERE",
    "subscription": "YOUR_SUBSCRIPTION_ID_HERE"
  },
  "output_dir": "reports"
}

| Setting | Required | Description |
|---|---|---|
| sentinel_workspace_id | Yes | Microsoft Sentinel (Log Analytics) workspace GUID |
| tenant_id | Yes | Entra ID (Azure AD) tenant ID for your Sentinel workspace |
| subscription_id | Yes | Azure subscription ID containing the Sentinel workspace |
| azure_mcp.* | Yes | Azure MCP Server parameters: resource group, workspace name, tenant, subscription. Required to avoid cross-tenant auth errors. |
| output_dir | No | Directory for HTML reports (default: reports) |
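
A quick way to catch an incomplete config.json is to check the required settings before running anything. The sketch below is hypothetical (`config_problems` and `load_config` are not project functions) and treats any value still starting with the template's `YOUR_` prefix as unset.

```python
import json
from pathlib import Path
from typing import List

REQUIRED_KEYS = ["sentinel_workspace_id", "tenant_id", "subscription_id"]
AZURE_MCP_KEYS = ["resource_group", "workspace_name", "tenant", "subscription"]


def _unset(value: object) -> bool:
    """A value counts as unset when it is empty or still a YOUR_... placeholder."""
    return not value or str(value).startswith("YOUR_")


def config_problems(config: dict) -> List[str]:
    """List the required settings that are missing or still hold template placeholders."""
    problems = [key for key in REQUIRED_KEYS if _unset(config.get(key))]
    azure = config.get("azure_mcp") or {}
    problems += [f"azure_mcp.{key}" for key in AZURE_MCP_KEYS if _unset(azure.get(key))]
    return problems


def load_config(path: str = "config.json") -> dict:
    """Read config.json and apply the documented default for output_dir."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    config.setdefault("output_dir", "reports")
    return config


if __name__ == "__main__":
    issues = config_problems(load_config())
    print("config looks complete" if not issues else f"fill in: {', '.join(issues)}")
```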

API Tokens (.env file)

API tokens for IP enrichment are stored in a .env file (gitignored) rather than config.json for security. Copy the template and add your keys:

copy .env.template .env
# Edit .env with your token values
IPINFO_TOKEN=your_token_here
ABUSEIPDB_TOKEN=your_token_here
VPNAPI_TOKEN=your_token_here
SHODAN_TOKEN=your_token_here

These are auto-loaded by enrich_ips.py via python-dotenv, so no manual sourcing is needed.

| Token | Required | Description |
| --- | --- | --- |
| IPINFO_TOKEN | Recommended | ipinfo.io API token: geolocation, ASN, org. Free: 1K/day; token: 50K/month; paid plans include VPN detection |
| ABUSEIPDB_TOKEN | Recommended | AbuseIPDB API token: IP reputation scoring (0-100 confidence). Free: 1K/day |
| VPNAPI_TOKEN | Optional | vpnapi.io API token: VPN/proxy/Tor detection. Not needed if ipinfo.io is on a paid plan |
| SHODAN_TOKEN | Optional | Shodan API key: open ports, services, CVEs, OS detection, tags. Free InternetDB fallback if no key or credits exhausted |
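
To see at a glance which tokens are actually set, a small status check can help. This is a sketch only, not code from enrich_ips.py: token_status is a hypothetical helper that inspects the process environment, and it assumes unfilled entries still read your_token_here as in the template.

```python
import os

# Token names and tiers mirror the table above.
TOKEN_TIERS = {
    "IPINFO_TOKEN": "Recommended",
    "ABUSEIPDB_TOKEN": "Recommended",
    "VPNAPI_TOKEN": "Optional",
    "SHODAN_TOKEN": "Optional",
}
PLACEHOLDER = "your_token_here"  # value used in the .env template


def token_status(env=None) -> dict:
    """Map each token name to 'set', 'placeholder', or 'missing'."""
    env = os.environ if env is None else env
    status = {}
    for name in TOKEN_TIERS:
        value = (env.get(name) or "").strip()
        if not value:
            status[name] = "missing"
        elif value == PLACEHOLDER:
            status[name] = "placeholder"
        else:
            status[name] = "set"
    return status


if __name__ == "__main__":
    for name, state in token_status().items():
        print(f"{name:<16} {TOKEN_TIERS[name]:<12} {state}")
```

Because the helper only reads environment variables, call python-dotenv's load_dotenv() first if you want it to reflect the contents of .env.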

3. Configure MCP Servers

Copy the MCP server template (all platform servers + 3 optional MCP Apps are pre-configured):

copy .vscode/mcp.json.template .vscode/mcp.json

The template includes inline documentation for each server. On first use, VS Code will prompt for:

  • Entra ID login: browser-based auth for Sentinel Data Lake, Graph, Triage, and Sentinel Graph servers
  • GitHub PAT: for KQL Search MCP (schema intelligence and query discovery). Needs public_repo scope.

See MCP Server Setup below for per-server permissions and installation guides.

4. Build MCP Apps (Optional: Visualization Skills)

PowerShell (Windows):

cd mcp-apps/sentinel-geomap-server; npm install; npm run build; cd ../..
cd mcp-apps/sentinel-heatmap-server; npm install; npm run build; cd ../..
cd mcp-apps/sentinel-incident-comment; npm install; npm run build; cd ../..

Bash (macOS/Linux):

cd mcp-apps/sentinel-geomap-server && npm install && npm run build && cd ../..
cd mcp-apps/sentinel-heatmap-server && npm install && npm run build && cd ../..
cd mcp-apps/sentinel-incident-comment && npm install && npm run build && cd ../..
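
If you prefer one script for every platform, the same install and build steps can be driven from Python. This is a sketch under assumptions: build_commands and build_all are hypothetical helpers, not scripts shipped with the repository, and they expect to run from the repository root.

```python
import shutil
import subprocess
from pathlib import Path

# Directory names are taken from the commands above.
MCP_APPS = (
    "sentinel-geomap-server",
    "sentinel-heatmap-server",
    "sentinel-incident-comment",
)


def build_commands(repo_root: Path, npm: str = "npm") -> list:
    """Return (working_directory, argv) pairs: install, then build, for each app."""
    commands = []
    for app in MCP_APPS:
        app_dir = repo_root / "mcp-apps" / app
        commands.append((app_dir, [npm, "install"]))
        commands.append((app_dir, [npm, "run", "build"]))
    return commands


def build_all(repo_root: Path) -> None:
    """Run every command in order, stopping at the first failure."""
    # shutil.which resolves npm.cmd on Windows and npm elsewhere.
    npm = shutil.which("npm")
    if npm is None:
        raise RuntimeError("npm not found on PATH; install Node.js first")
    for cwd, argv in build_commands(repo_root, npm):
        print(f"[{cwd.name}] npm {' '.join(argv[1:])}")
        subprocess.run(argv, cwd=cwd, check=True)
```

Running build_all(Path.cwd()) stops at the first failing step, which matches the Bash && chain; the PowerShell ; chain keeps going after an error.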

The sentinel-incident-comment MCP App requires an Azure Logic App backend. See mcp-apps/sentinel-incident-comment/README.md for setup. Based on stefanpems/mcp-add-comment-to-sentinel-incident.


🔌 MCP Server Setup

The system uses several Model Context Protocol (MCP) servers. All are pre-configured in .vscode/mcp.json.template; copy it to .vscode/mcp.json to get started (see Step 3 above). The sections below document permissions, tools, and installation guides for each server.

At a Glance

| # | Server | MCP URL / Transport | Setup Guide | Key Permissions |
| --- | --- | --- | --- | --- |
| 1 | Sentinel Data Lake | https://sentinel.microsoft.com/mcp/data-exploration | Setup | Log Analytics Reader |
| 2 | Microsoft Graph | https://mcp.svc.cloud.microsoft/enterprise | Setup | User.Read.All, Device.Read.All |
| 3 | Sentinel Triage | https://sentinel.microsoft.com/mcp/triage | Setup | SecurityReader |
| 4 | KQL Search | npx -y kql-search-mcp (stdio) | Setup | GitHub PAT (public_repo) |
| 5 | Microsoft Learn | https://learn.microsoft.com/api/mcp | Setup | None (free) |
| 6 | Azure MCP Server | VS Code extension (stdio) | Setup | Contributor or Reader on subscription |
| 7 | Sentinel Graph ⚠️ | https://sentinel.microsoft.com/mcp/graph | Blog | Sentinel Reader (Private Preview) |

1. Microsoft Sentinel MCP Server

📖 Installation Guide

Tools: query_lake, search_tables, list_sentinel_workspaces

Permissions:

  • Log Analytics Reader (minimum): query workspace data
  • Sentinel Reader (recommended): full investigation capabilities
  • Sentinel Contributor: watchlist management (optional)

2. MCP Server for Microsoft Graph

📖 Installation Guide

Tools: microsoft_graph_suggest_queries, microsoft_graph_get, microsoft_graph_list_properties

⚡ One-time tenant provisioning (requires Application Administrator or Cloud Application Administrator role):

# 1. Install the Entra Beta PowerShell module (v1.0.13+)
Install-Module Microsoft.Entra.Beta -Force -AllowClobber

# 2. Authenticate to your tenant
Connect-Entra -Scopes 'Application.ReadWrite.All', 'Directory.Read.All', 'DelegatedPermissionGrant.ReadWrite.All'

# 3. Register the MCP Server and grant permissions to VS Code
Grant-EntraBetaMCPServerPermission -ApplicationName VisualStudioCode

This only needs to be done once per tenant. After provisioning, all users in the tenant can use the Graph MCP server by signing in with their own account.

Permissions (delegated, per-user):

  • User.Read.All: user profiles and authentication methods
  • UserAuthenticationMethod.Read.All: MFA methods
  • Device.Read.All: device compliance and enrollment
  • IdentityRiskEvent.Read.All: Identity Protection risk detections

3. Microsoft Sentinel Triage MCP Server

📖 Installation Guide

Tools (30+): RunAdvancedHuntingQuery, ListIncidents, GetAlertById, GetDefenderMachine, GetDefenderFileInfo, GetDefenderIpAlerts, ListUserRelatedMachines, GetDefenderMachineVulnerabilities, and more.

Permissions:

  • Microsoft Defender for Endpoint API: SecurityReader role minimum
  • Advanced Hunting: read access to Defender XDR data

4. KQL Search MCP Server

📖 Installation Guide

Option A: VS Code Extension (Recommended)

  1. Extensions panel → Search "KQL Search MCP" → Install
  2. Command Palette → KQL Search MCP: Set GitHub Token

Option B: NPX. Already configured in .vscode/mcp.json.template; it just needs a GitHub PAT with public_repo scope (prompted on first use).

Tools (34): Schema intelligence, query validation, GitHub search, ASIM support for 331+ tables.

5. Microsoft Learn MCP Server

📖 Installation Guide

One-click: Install in VS Code, or use the entry already configured in .vscode/mcp.json.template.

Tools: microsoft_docs_search, microsoft_docs_fetch, microsoft_code_sample_search

No API key required: free, cloud-hosted by Microsoft.

6. Azure MCP Server

📖 Installation Guide

Install via VS Code extension: search "Azure MCP Server" in Extensions, or install from the Marketplace. The extension registers as a stdio MCP server automatically.

Tools: monitor_workspace_log_query, monitor_activitylog_list, group_list, subscription_list, and 40+ namespaces covering AI, identity, security, databases, storage, compute, and networking.

Permissions:

  • Reader (minimum): read-only access to Azure resources
  • Log Analytics Reader: for monitor_workspace_log_query (KQL against Log Analytics)
  • Contributor: for write/modify operations (optional)

Configuration: Requires azure_mcp parameters in config.json (tenant, subscription, resource group, workspace name) to avoid cross-tenant auth errors. See Configure Environment.

7. Sentinel Graph MCP Server ⚠️ Private Preview

Note: Sentinel Graph is currently in private preview and not available to all customers. If your tenant does not have access, this server will fail to connect; you can safely remove it from .vscode/mcp.json. See the announcement blog post for details and enrollment.

Tools: Entity graph exploration and relationship queries.

Permissions:

  • Sentinel Reader (minimum)

Pre-configured in .vscode/mcp.json.template. Browser-based Entra ID login on first use.

Verify Setup

Open Copilot Chat (Ctrl+Shift+I) in Agent mode and try these prompts:

| Test | Prompt to type in Copilot Chat |
| --- | --- |
| Sentinel Data Lake | List my Sentinel workspaces |
| Microsoft Graph | Look up my user profile in Graph |
| Sentinel Triage | List recent security incidents |
| KQL Search | What columns does the SigninLogs table have? |
| Microsoft Learn | Search Microsoft docs for KQL query language |
| All skills | What investigation skills do you have access to? |

If any server fails, check the MCP Servers panel in VS Code (click the {} icon in the bottom status bar) to verify each server shows a green connected status.


⚙️ Configuration Details

API Rate Limits (IP Enrichment)

| Provider | Free Tier | With Token |
| --- | --- | --- |
| ipinfo.io | 1,000/day (geo, org, ASN) | 50,000/month; paid plans include VPN detection |
| AbuseIPDB | 1,000/day | 10,000/day ($20/month) |
| vpnapi.io | 1,000/month | 10,000/month ($9.99/month) |
| Shodan | InternetDB (unlimited, ports/vulns/tags) | $49 one-time membership: 100 queries/month (adds services, banners, SSL, OS) |

Token priority: If IPINFO_TOKEN belongs to a paid ipinfo.io plan, VPN detection is included and VPNAPI_TOKEN is optional. Shodan uses the full API when a paid key is available; on HTTP 403 or 429 it automatically falls back to the free InternetDB.
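
The Shodan fallback can be pictured roughly as follows. This is a minimal sketch, not the code in enrich_ips.py: shodan_lookup and http_get_json are hypothetical names, and the sketch assumes the public Shodan host endpoint and the InternetDB endpoint are the two sources involved.

```python
import json
import urllib.error
import urllib.request

SHODAN_HOST_URL = "https://api.shodan.io/shodan/host/{ip}?key={key}"
INTERNETDB_URL = "https://internetdb.shodan.io/{ip}"
FALLBACK_STATUSES = {403, 429}  # forbidden / rate limited or credits exhausted


def http_get_json(url: str) -> dict:
    """Fetch a URL and decode the JSON body; raises HTTPError on 4xx/5xx."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def shodan_lookup(ip: str, api_key: str = "", fetch=http_get_json) -> dict:
    """Try the full API when a key exists; fall back to InternetDB on 403/429."""
    if api_key:
        try:
            data = fetch(SHODAN_HOST_URL.format(ip=ip, key=api_key))
            return {"source": "shodan", "data": data}
        except urllib.error.HTTPError as error:
            if error.code not in FALLBACK_STATUSES:
                raise  # other errors are not masked by the fallback
    return {"source": "internetdb", "data": fetch(INTERNETDB_URL.format(ip=ip))}
```

The fetch parameter exists only so the network call can be swapped out in tests; without a key the function goes straight to InternetDB.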

IP enrichment happens during report generation (not data collection), so you can re-generate reports without re-querying Sentinel/Graph.

Dependencies

pip install -r requirements.txt

Core packages: requests (HTTP client for enrichment APIs), python-dateutil (date parsing for KQL time ranges), python-dotenv (loads .env tokens for enrich_ips.py).


🔒 Security Considerations

  1. Confidential Data: Reports contain PII and sensitive security data. Mark as CONFIDENTIAL and follow organizational data classification policies.
  2. Access Control: Restrict access to authorized SOC personnel. Use Azure RBAC for Sentinel, PIM for Graph API permissions.
  3. Audit Trail: All investigations are timestamped. JSON files in temp/ preserve snapshots; HTML reports include generation metadata.
  4. Data Retention: Investigations older than 3 days are auto-deleted (configurable). Archive important investigations before cleanup.
  5. API Token Security: Never commit .env or config.json (both are gitignored). Keep API tokens in .env rather than config.json, and use environment variables or Azure Key Vault for production.
  6. Investigation JSON Files: Stored in temp/ (not committed to Git). Contain complete data including IP enrichment. Can be re-analyzed without re-querying.
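
To archive investigations before the retention window removes them, something like the following could be run ahead of cleanup. It is a sketch under stated assumptions: find_expired and archive_expired are hypothetical helpers, file age is judged by modification time, and the project's own cleanup mechanism may use a different rule.

```python
import shutil
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def find_expired(directory: Path, max_age_days: float = 3, now: float = None) -> list:
    """Return JSON files in `directory` last modified before the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    return sorted(
        path for path in directory.glob("*.json")
        if path.is_file() and path.stat().st_mtime < cutoff
    )


def archive_expired(directory: Path, archive: Path, max_age_days: float = 3) -> list:
    """Move expired files into `archive` so they survive cleanup."""
    archive.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in find_expired(directory, max_age_days):
        target = archive / path.name
        shutil.move(str(path), str(target))
        moved.append(target)
    return moved
```

For example, archive_expired(Path("temp"), Path("archive")) moves anything older than three days; the archive folder name is a placeholder, and it should sit somewhere with the same access restrictions as the reports.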

🛠️ Troubleshooting

| Issue | Solution |
| --- | --- |
| "No anomalies found" | Signinlogs_Anomalies_KQL_CL table doesn't exist or has no data. See user-investigation skill docs. Wait 24h for initial population. |
| "IP enrichment failed" | ipinfo.io rate limits (1K/day free). Add IPINFO_TOKEN to .env for 50K/month. |
| "MCP server not available" | Check VS Code MCP server config. Verify authentication tokens are valid. |
| "User ID not found" (Graph) | Verify UPN is correct. Check Graph permissions: User.Read.All. |
| "Sentinel query timeout" | Reduce date range. Add \| take 10 to limit results. |
| Report generation fails | Validate JSON: python -m json.tool temp/investigation_*.json. Check required fields. |
| SecurityIncident returns 0 results | Use BOTH targetUPN and targetUserId (Object ID). Some incidents use Object ID. |
| Risky sign-ins 404 | Must use /beta endpoint, not /v1.0. |
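
When several investigation files exist, note that python -m json.tool takes one input file followed by an optional output file, so a wildcard that expands to multiple paths will not validate them all. A short loop checks each file instead. The helper name validate_json_files is hypothetical, and the sketch only verifies that each file parses; it does not check the required fields.

```python
import json
from pathlib import Path


def validate_json_files(directory: Path, pattern: str = "investigation_*.json") -> dict:
    """Map each matching file name to None (valid) or a parse error message."""
    results = {}
    for path in sorted(directory.glob(pattern)):
        try:
            json.loads(path.read_text(encoding="utf-8"))
            results[path.name] = None
        except (json.JSONDecodeError, UnicodeDecodeError) as error:
            results[path.name] = str(error)
    return results


if __name__ == "__main__":
    for name, error in validate_json_files(Path("temp")).items():
        print(f"{name}: {'OK' if error is None else error}")
```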

Verify Connectivity

In Copilot Chat (Agent mode):

  • "List my Sentinel workspaces": verifies Sentinel Data Lake MCP
  • "Look up user@domain.com in Graph": verifies Graph MCP
  • "List recent incidents": verifies Sentinel Triage MCP

In terminal:

python enrich_ips.py 8.8.8.8    # Verifies IP enrichment API tokens

🧠 (Optional) Persistent Tenant Context

GitHub Copilot Chat in VS Code provides agents with a memory tool: a built-in filesystem (/memories/) for persisting notes across conversations. Copilot already uses this internally; you can extend it with tenant-specific context (known infrastructure IPs, validated personnel, false-positive patterns, lab automation signatures) so investigations don't repeatedly misclassify documented activity as 🔴 critical.

Two memory tiers are relevant:

| Tier | Path | Auto-loaded? | Use for |
| --- | --- | --- | --- |
| User memory | /memories/*.md | ✅ Yes (~200 lines) | Short trigger rules ("when you see tenant X, read repo file Y") |
| Repo memory | /memories/repo/*.md | ❌ Filenames only | Rich tenant context (IPs, personnel, FP patterns), pulled in by trigger rules |

The memory tool is an internal agent capability; VS Code does not publish a dedicated docs page for it. Closest related concepts are custom instructions and Agent Skills, which serve different purposes (always-applied conventions and specialized workflows, respectively).

This workspace ships with:

  • Templates in notes/memory/examples/: copy and adapt for your tenant (one user-tier example, two repo-tier examples)
  • Sync script scripts/sync-repo-memory.ps1: backs up workspace-scoped (repo) memory from VS Code AppData into the workspace folder, surviving VS Code reinstall and workspace rename. Any cloud sync attached to your workspace (OneDrive, Dropbox, iCloud, etc.) then mirrors the backup across machines. Defaults to one-way export (ToBackup); restore mode (FromBackup) requires -Force because it writes into Copilot's trusted memory store.
  • Setup guide notes/memory/README.md: full walkthrough, sync usage, security model, and the trigger-rule pattern that makes Copilot actually consult repo memory

Quickstart: Open a template from notes/memory/examples/, then ask Copilot in chat to "create this as a memory file at /memories/..., replacing placeholders with my tenant values." Copilot uses its memory tool to write it directly, so no AppData path navigation is needed.

⚠️ Memory = trusted input. Anything in notes/memory/repo/ becomes authoritative instructions for Copilot in every future chat (with MCP tool access to Sentinel, Graph, Azure). Review diffs from forks/PRs before restoring, never paste secrets, and if your workspace is cloud-synced, confirm the destination is acceptable for security context. See notes/memory/README.md for the full threat model.


📄 License

This project is licensed under the MIT License. Use it, fork it, adapt it for your SOC; just keep the copyright notice.


🙏 Acknowledgments

Microsoft Security Platform

MCP Servers

Threat Intelligence APIs

  • ipinfo.io: IP geolocation, ISP/ASN identification, hosting provider detection
  • vpnapi.io: VPN, proxy, Tor exit node, and relay detection
  • AbuseIPDB: Community-sourced IP abuse scoring and recent attack reports
  • Shodan: Open port enumeration, service/banner detection, CVE identification, infrastructure tagging

Development Tools

Special thanks to the Microsoft Security community for sharing KQL queries and detection logic, and to stefanpems for the Sentinel incident commenting MCP pattern.