GitHub Copilot - Security Investigation Integration
May 22, 2026 Β· View on GitHub
This workspace contains a security investigation automation system. GitHub Copilot can help you run investigations using natural language.
π TABLE OF CONTENTS
- Critical Workflow Rules - Start here!
- Environment Configuration - Read
config.jsonfor workspace/tenant details - KQL Pre-Flight Checklist - Mandatory before EVERY query
- Evidence-Based Analysis - Anti-hallucination guardrails
- Remediation Output Policy - Portal links only, no executable commands
- Available Skills - Specialized investigation workflows
- Ad-Hoc Queries - Quick reference patterns
- Troubleshooting - Common issues and solutions
β οΈ CRITICAL WORKFLOW RULES - READ FIRST β οΈ
π€ SKILL DETECTION: Before starting any investigation, check the Available Skills section below and load the appropriate SKILL.md file.
π§ ENVIRONMENT CONFIGURATION
Environment-specific values (workspace IDs, tenant IDs, resource group names) are stored in config.json at the workspace root. This file is gitignored and never committed.
When you need environment values (especially for Azure MCP Server calls), read config.json instead of asking the user or hardcoding values.
config.json Schema (see config.json.template for field names):
| Field | Used By | Purpose |
|---|---|---|
sentinel_workspace_id | Sentinel Data Lake MCP (query_lake) | Log Analytics workspace GUID |
tenant_id | All Azure/Sentinel tools | Entra ID tenant |
subscription_id | Azure MCP Server, Azure CLI | Azure subscription |
azure_mcp.resource_group | Azure MCP workspace_log_query | RG containing Log Analytics workspace |
azure_mcp.workspace_name | Azure MCP workspace_log_query | Log Analytics workspace display name |
azure_mcp.tenant | Azure MCP Server (all calls) | Required to avoid cross-tenant auth errors |
azure_mcp.subscription | Azure MCP Server (all calls) | Target subscription |
API Tokens (.env file): Enrichment API tokens are stored in .env (gitignored), loaded via python-dotenv. Copy .env.template to .env and fill in your keys.
| Environment Variable | Used By | Purpose |
|---|---|---|
IPINFO_TOKEN | enrich_ips.py | ipinfo.io API key |
ABUSEIPDB_TOKEN | enrich_ips.py | AbuseIPDB API key |
VPNAPI_TOKEN | enrich_ips.py | vpnapi.io API key |
SHODAN_TOKEN | enrich_ips.py | Shodan API key |
Fallback:
enrich_ips.pyalso readsipinfo_token/abuseipdb_token/vpnapi_token/shodan_tokenfromconfig.jsonif the environment variables are not set..envtakes precedence.
Prerequisites
| Dependency | Required By | Setup |
|---|---|---|
Azure CLI (az) | Azure MCP Server (underlying auth), sentinel-ingestion-report skill (az rest for rule inventory, az monitor for tier classification) | Install: aka.ms/installazurecli. Authenticate: az login --tenant <tenant_id>. Set subscription: az account set --subscription <subscription_id> |
Note: Individual skills may have additional CLI dependencies documented in their own SKILL.md files. Check the skill file for skill-specific requirements before running a workflow.
When making Azure MCP Server calls, always pass tenant and subscription from config.json to avoid the multi-tenant auth issue (DefaultAzureCredential may pick up the wrong tenant).
π΄ SENTINEL WORKSPACE SELECTION - GLOBAL RULE
This rule applies to ALL skills and ALL Sentinel queries. Follow STRICTLY.
When executing ANY Sentinel query (via the Sentinel Data Lake query_lake MCP tool):
Workspace Selection Flow
- BEFORE first query: Call
list_sentinel_workspaces()to enumerate available workspaces - If exactly 1 workspace: Auto-select, display to user, proceed
- If multiple workspaces AND no prior selection in session:
- Display ALL workspaces with Name and ID
- ASK user: "Which Sentinel workspace should I use for this investigation? Select one or more, or 'all'."
- β STOP AND WAIT for explicit user response
- β DO NOT proceed until user selects
- If query fails on selected workspace:
- β STOP IMMEDIATELY
- Report: "β οΈ Query failed on [WORKSPACE_NAME]. Error: [ERROR_MESSAGE]"
- Display available workspaces
- ASK user to select a different workspace
- β DO NOT automatically retry with another workspace
π΄ PROHIBITED ACTIONS
| Action | Status |
|---|---|
| Auto-selecting workspace when multiple exist | β PROHIBITED |
| Switching workspaces after query failure without asking | β PROHIBITED |
| Proceeding with ambiguous workspace context | β PROHIBITED |
| Assuming workspace from previous conversation turns | β PROHIBITED |
| Making any workspace decision on behalf of user | β PROHIBITED |
β REQUIRED ACTIONS
| Scenario | Required Action |
|---|---|
| Multiple workspaces, none selected | STOP, list all, ASK user, WAIT |
| Query fails with table/workspace error | STOP, report error, ASK user, WAIT |
| Single workspace available | Auto-select, DISPLAY to user, proceed |
| Workspace already selected in session | Reuse selection, DISPLAY which workspace is being used |
π΄ KQL QUERY & HUNT EXECUTION - PRE-FLIGHT CHECKLIST
This checklist applies to EVERY KQL query, hunt, search, or data lookup β whether the user said "query", "hunt", "search", "look for", "find", "do we have X", "is there any Y", or just pasted an IoC/keyword/tool name.
π΄ MANDATORY FIRST ACTION β NO EXCEPTIONS: Before the first mcp_sentinel-data_query_lake or RunAdvancedHuntingQuery tool call of a conversation turn, you MUST complete Step 1 (discovery manifest + grep of queries/** and .github/skills/**). If you are about to write a KQL query and have not yet done a Priority 1 or Priority 2 discovery check for the user's keyword/topic, STOP and do the discovery first. A "hunt for X" request is NEVER an exception β it is the exact scenario the manifest exists to serve.
Self-check before every KQL tool call: "Did I grep_search queries/** for the user's keyword (tool name, IoC, threat name, table, operation) in this turn?" If no β STOP, do the discovery, then resume.
Exception β Skill & query library queries: When following a SKILL.md investigation workflow or using a query directly from the queries/ library, the queries are already verified and battle-tested. Skip Steps 1β4 and use those queries directly (substituting entity values as instructed). Step 0 (tool selection) and Step 5 (sanity-check zero results) still apply. Note: "I already know the keyword" does NOT qualify as this exception β you must have actually located the query file.
Before writing or executing any ad-hoc KQL query or hunt (i.e., not already from a SKILL.md file or queries/ file), complete these steps in order:
Step 0: Pick the Right Tool for the Lookback Window
Check the user's requested lookback against tool retention before writing KQL:
| Lookback | Tool | Why |
|---|---|---|
| β€ 30 days | RunAdvancedHuntingQuery (AH) | Default; free for Analytics-tier tables |
| > 30 days (31d, 60d, 90d, "last quarter", date ranges >30d) | mcp_sentinel-data_query_lake (Data Lake) | AH Graph API silently truncates results to 30d β no error, no warning. Using AH for 90d under-reports days 31β90. |
Self-check before every KQL tool call: "If lookback > 30 days, am I on Data Lake?" If not, switch.
Timestamp adaptation when switching AH β Data Lake:
- XDR-native tables (
Device*,Email*,Cloud*,Alert*,Identity*,Entra*): changeTimestampβTimeGenerated - Sentinel/LA tables (
SigninLogs,AuditLogs,SecurityAlert, etc.): already useTimeGeneratedin both tools - Column name differences (e.g.,
EntraIdSignInEvents.AccountUpnβSigninLogs.UserPrincipalName): see the EntraIdSignInEvents row in Step 3
Step 1: Check for Existing Verified Queries (MANDATORY FIRST STEP)
| Priority | Source | Action |
|---|---|---|
| 1st | Discovery manifest (.github/manifests/discovery-manifest.yaml) | Read the manifest and match by domain tag (e.g., identity, endpoint, email) or MITRE technique ID (e.g., T1078, T1566). The manifest indexes all query files and skills with title, path, domains, mitre, and prompt fields. Best when you know the security domain or ATT&CK technique β skips scanning individual files. |
| 2nd | Targeted grep_search (skills + queries) | grep_search for the specific table name (e.g., CloudAppEvents, OfficeActivity) or operation keyword (e.g., New-InboxRule, SecretGet) scoped to queries/** and .github/skills/**. The manifest lacks table-name and keyword fields β grep fills this gap for table-specific lookups. |
| 3rd | This file's Appendix | Check Ad-Hoc Query Examples for canonical patterns (SecurityAlertβSecurityIncident join, AuditLogs best practices, etc.) |
| 4th | KQL Search MCP | Use search_github_examples_fallback or validate_kql_query for community-published examples |
| 5th | Microsoft Learn MCP | Use microsoft_code_sample_search with language: "kusto" for official examples |
When to use which: If you know the domain ("identity threat") or MITRE technique (T1078) β start with Priority 1 (manifest). If you know the table name (AuditLogs) or specific operation (Set-Mailbox) β start with Priority 2 (grep). Both can be used together β manifest for breadth, grep for precision.
Short-circuit rule: If a suitable query is found in Priority 1 (manifest), Priority 2 (grep), or Priority 3 (Appendix), skip Steps 2β4 and use it directly (substituting entity values). These sources are already schema-verified and pitfall-aware. Step 5 (sanity-check zero results) still applies.
Step 2: Verify Table Schema
Before querying any table for the first time in a session, verify the schema:
- Use
search_tablesorget_table_schemafrom KQL Search MCP - Confirm column names, types, and which columns contain GUIDs vs human-readable values
- Check if the table exists in Data Lake vs Advanced Hunting (see Tool Selection Rule)
- β οΈ Column name hallucination: LLMs frequently use column names from one table on a different table. Common confusions:
SeverityvsAlertSeverity(SecurityIncident vs SecurityAlert),OSvsOSPlatform(Device* tables),IPAddressvsRemoteIP(varies by table),Entities(SecurityAlert only β not on SecurityIncident). Always verify the column exists on the specific table being queried.
Step 3: Check Known Table Pitfalls
Review this quick-reference before querying these tables:
| Table | Pitfall | Required Action |
|---|---|---|
| ALL Sentinel/LA tables (SigninLogs, AuditLogs, SecurityAlert, SecurityIncident, OfficeActivity, etc.) | Column is TimeGenerated, NOT Timestamp. Using Timestamp on these tables returns SemanticError: Failed to resolve column. This is the #1 most frequent Data Lake MCP error. LLMs default to Timestamp from AH query patterns | Data Lake: Always TimeGenerated. Advanced Hunting: Timestamp for XDR-native tables (Device*, Email*, Cloud*, Alert*, Identity*), TimeGenerated for Sentinel/LA tables. When adapting AH queries for Data Lake: replace ALL Timestamp β TimeGenerated |
| AADRiskySignIns | Table does NOT exist in Sentinel Data Lake. Querying it returns SemanticError: Failed to resolve table | Use AADUserRiskEvents instead (contains Identity Protection risk detections). For sign-in-level risk data, use SigninLogs with RiskLevelDuringSignIn and RiskState columns |
| AADUserRiskEvents | May have different retention than SigninLogs. IP column is IpAddress (lowercase 'p'), NOT IPAddress. Using IPAddress returns Failed to resolve scalar expression. LLMs default to IPAddress (matching SigninLogs convention) and consistently get this wrong. Timestamp column is ActivityDateTime, NOT TimeGenerated β using TimeGenerated silently returns 0 results (column exists but is ingestion time, not event time). Location is a JSON string β use parse_json(Location).countryOrRegion | Cross-reference with SigninLogs RiskLevelDuringSignIn for complete picture. Always use IpAddress (lowercase 'p') and ActivityDateTime for time filtering |
| AADUserRiskEvents | suspiciousAuthAppApproval naming trap: Despite the name, this detection is about MFA Authenticator push approval patterns (MITRE T1621 β MFA Request Generation / MFA Fatigue), NOT OAuth app consent grants. LLMs consistently misinterpret this as app registration/consent abuse and incorrectly recommend app-registration-posture audits. The AdditionalInfo field contains "mitreTechniques": "T1621" confirming MFA focus. No corresponding entries appear in AuditLogs consent operations | When suspiciousAuthAppApproval appears: investigate MFA patterns and sign-in anomalies (user-investigation, authentication-tracing). NEVER recommend app-registration-posture or search for OAuth consent grants based solely on this risk event |
| AIAgentsInfo | Advanced Hunting only β does NOT exist in Sentinel Data Lake. Multiple records per agent (state snapshots); KnowledgeDetails is a string containing a JSON array of JSON strings; IsGenerativeOrchestrationEnabled may be null | Always use RunAdvancedHuntingQuery. Deduplicate with summarize arg_max(Timestamp, *) by AIAgentId. Double-parse KnowledgeDetails: mv-expand KnowledgeRaw = parse_json(KnowledgeDetails) | extend KnowledgeJson = parse_json(tostring(KnowledgeRaw)). Treat null GenAI flag as unknown. Table is in Preview β schema may change |
| AuditLogs | InitiatedBy, TargetResources are dynamic fields | Always wrap in tostring() before using has operator |
| AuditLogs | OperationName values vary across providers β e.g., "Reset user password", "Change user password", "Self-service password reset" are all different values. Consent lifecycle trap: "Consent to application" is only 1 of 4+ operations. has_any() requires exact word matches and is unpredictable | Use broad has "keyword" for discovery (e.g., has "password", has "role"), then refine with summarize count() by OperationName. For consent investigations use queries/identity/app_credential_management.md Query 5 which has the complete operation list |
| AzureDiagnostics | Legacy table β Microsoft explicitly documents that "Querying legacy tables such as AzureDiagnostics is not supported" in Data Lake. mcp_sentinel-data_query_lake returns SemanticError: Failed to resolve table even though the table exists in the workspace. Lake-only ingestion is also not supported (No in connector reference). The portal may show the workspace as "Data Lake integrated" but individual tables have eligibility flags β this table is stuck on Analytics tier. This is NOT the same table as AzureActivity. AzureDiagnostics = resource-specific diagnostic logs (Key Vault data plane: SecretGet, Authentication, VaultGet; SQL auditing; Firewall logs; App Service logs, etc.). AzureActivity = ARM control plane operations (resource creation/deletion, policy actions, role assignments, deployments). Confusing the two leads to querying the wrong table and missing critical data plane evidence | If Data Lake returns "Failed to resolve table", immediately try RunAdvancedHuntingQuery (AH can query Analytics-tier tables). Do NOT fall back to AzureActivity β it contains completely different data. Key columns: ResourceType (e.g., VAULTS), OperationName (e.g., SecretGet), CallerIPAddress, ResultType, Resource (resource name), Category (e.g., AuditEvent). Filter pattern: AzureDiagnostics | where ResourceType == "VAULTS" | where Resource =~ "<vault-name>". For Key Vault investigations, look for OperationName values like SecretGet, SecretList, Authentication, VaultGet |
| BehaviorEntities / BehaviorInfo | Advanced Hunting only β does NOT exist in Sentinel Data Lake. Table is in Preview. Two companion tables: BehaviorInfo (1 row per behavior β description, MITRE techniques, time window) and BehaviorEntities (N rows per behavior β entity decomposition). Populated by MCAS and Sentinel UEBA only β if these services aren't deployed, queries return 0 rows. Categories and AttackTechniques are JSON strings, not arrays β must parse_json() before mv-expand. K8s entity AdditionalFields contains deeply nested JSON with $id/$ref circular references. Low volume table (behavioral detections, not raw events). Significant overlap with SecurityAlert (same MCAS/MDC sources) but provides below-alert-threshold signals and pre-decomposed entity rows without parsing the SecurityAlert Entities JSON blob | Always use RunAdvancedHuntingQuery. Join tables on BehaviorId. Key ActionTypes: ImpossibleTravelActivity, MultipleFailedLoginAttempts, MassDownload, UnusualAdditionOfCredentialsToAnOauthApp, K8S.NODE_DriftBlocked, K8S.NODE_MalwareBlocked. Entity rows have EntityRole = Impacted or Related. Use for enriching user/IP investigations with MCAS/UEBA context. See queries/cloud/behavior_entities.md for verified query patterns |
| CloudAppEvents | Extremely high-volume table β ingests ALL M365 unified audit events (mail reads, file access, Teams, admin ops, etc.). Queries without selective early filters will timeout or get cancelled. RawEventData is a large JSON blob (often 5-100+ KB per row). Performance killer #1: tostring(RawEventData) has "value" β forces full JSON serialization on every row before substring search. Performance killer #2: Repeated parse_json(RawEventData) calls in separate extend statements β re-parses the entire blob per call. Performance killer #3: AccountDisplayName has "partial" β substring match without index; use AccountObjectId (GUID, indexed) or AccountDisplayName =~ (exact, case-insensitive). AccountId is a GUID (Entra ObjectId), NOT a UPN β filtering AccountId in~ ("user@domain.com") returns 0 results silently. Use AccountObjectId (identical GUID) for indexed lookups, or AccountDisplayName for display-name-based filtering. To filter by UPN, resolve to ObjectId first via Graph API: GET /v1.0/users/<UPN>?$select=id. ApplicationId is int, NOT string β this is a Defender-internal integer, NOT the Entra AppId GUID. Using string GUID arrays with in operator returns SEM0025: type mismatch. To resolve app names from Entra GUID AppIds, use SigninLogs/AADNonInteractiveUserSignInLogs (which have AppId as string + AppDisplayName), or OAuthAppInfo (which uses OAuthAppId as string). Inbox rule queries: For New-InboxRule/Set-InboxRule/Set-Mailbox, ALWAYS also query OfficeActivity (Exchange workload) β these tables are complementary, not alternatives. CloudAppEvents provides ActionType-based summaries and AccountDisplayName, but OfficeActivity provides the full Parameters JSON (forwarding targets: ForwardTo, RedirectTo, ForwardingSmtpAddress), per-operation ClientIP, and additional Exchange audit operations (MoveToDeletedItems, MailItemsAccessed, Send) critical for post-compromise forensics. When investigating mailbox manipulation, query BOTH tables. ActionType is CamelCase β use contains not has for partial matching (e.g., ActionType contains "Sentinel" not has) | Filter order: Timestamp/TimeGenerated first β ActionType (most selective, eliminates 99%+ rows) β identity filter (AccountObjectId preferred). RawEventData: Parse ONCE with extend ParsedData = parse_json(RawEventData) (or parse_json(tostring(RawEventData)) in AH), then extract all fields from ParsedData. NEVER use tostring(RawEventData) has "x" for filtering β extract the specific field instead. For inbox rule investigations, query BOTH: (1) CloudAppEvents for ActionType summary + identity context, (2) OfficeActivity | where OfficeWorkload == "Exchange" for full Parameters JSON, ClientIP, and additional Exchange operations (MoveToDeletedItems, MailItemsAccessed, Send). Never rely on CloudAppEvents alone for mailbox forensics |
| DataSecurityEvents | Advanced Hunting only β requires Insider Risk Management opt-in. SensitiveInfoTypeInfo is Collection(String) NOT native dynamic β requires double parse_json(). Contains SIT GUIDs not names. Copilot events ("Risky prompt entered in Copilot", "Sensitive response received in Copilot") can dominate 90%+ of volume. ObjectId is the file identifier β ObjectName/ObjectType do NOT exist despite documentation. Label columns: SensitivityLabelId (string, can be comma-separated), PreviousSensitivityLabelId (string, label change events), SharepointSiteSensitivityLabelId (string), RiskyAIUsageSensitivityLabelsInfo (Collection(String), mostly [null]). Label data is sparse in SIT-dominant environments but significant in Purview-mature orgs | Always use RunAdvancedHuntingQuery. Double-parse: mv-expand SIT = parse_json(tostring(SensitiveInfoTypeInfo)) | extend SITJson = parse_json(tostring(SIT)). Pre-filter with where SensitiveInfoTypeInfo has "<GUID>" before mv-expand. Use split(SensitivityLabelId, ",") for multi-GUID label values. Use data-security-analysis skill for SIT and label GUID-to-name resolution. If table returns 0 rows, check IRM opt-in status |
| DeviceCustom* (CDC Tables) | Requires MDE Custom Data Collection (CDC) rules. These tables (DeviceCustomFileEvents, DeviceCustomScriptEvents, DeviceCustomImageLoadEvents, DeviceCustomNetworkEvents) do NOT exist in workspaces without CDC policies. They extend standard MDE telemetry beyond default thresholds. Key per-table pitfalls: DeviceCustomScriptEvents β script body is ScriptContent, NOT AdditionalFields (SemanticError); AMSI-only (Node.js/Go/Rust invisible). DeviceCustomNetworkEvents β coverage varies by CDC policy; some environments only collect Kerberos events, run discovery query first. DeviceCustomFileEvents β fills gaps when standard DeviceFileEvents returns 0 for known active directories. DeviceCustomImageLoadEvents β reveals native addons (.node modules, Python C extensions) | CDC tables are optional β if "Failed to resolve table", skip gracefully and note the telemetry gap. Query order: standard table first β if 0 results and activity is expected β try CDC equivalent β if CDC table doesn't exist β note as telemetry limitation |
| DeviceInfo | Internet-facing detection pitfall: ExposureGraphNodes.rawData.IsInternetFacing, rawData.exposedToInternet, and rawData.isCustomerFacing are all unreliable for determining actual internet exposure. isCustomerFacing is a business-function flag (NOT internet exposure). IsInternetFacing/exposedToInternet are not populated in many environments. LLMs default to querying these ExposureGraph properties and get null results. MachineTags column renamed: The old MachineTags column no longer exists β using it returns Failed to resolve scalar expression. It was split into three columns: DeviceManualTags (admin-set), DeviceDynamicTags (auto-assigned by rules), RegistryDeviceTag (set via registry). MS Learn may still reference MachineTags but the AH schema has only the new names. The Defender API GetDefenderMachine still returns machineTags (maps to DeviceManualTags in AH) | Authoritative source: Use DeviceInfo.IsInternetFacing == true (bool column). MDE maintains this via external scans + observed inbound connections; auto-expires after 48h. Extract details from AdditionalFields: extractjson("$.InternetFacingReason", AdditionalFields) (values: PublicScan, InboundConnection), InternetFacingLocalPort, InternetFacingPublicScannedIp. See queries/network/internet_exposure_analysis.md Query 1 and MS Docs. For inbound scan detail: DeviceNetworkEvents with ActionType == "InboundInternetScanInspected". Tags: Use DeviceManualTags, DeviceDynamicTags, RegistryDeviceTag β NEVER MachineTags |
| DeviceTvmSoftwareVulnerabilities / DeviceTvmSoftwareInventory / DeviceTvmSecureConfigurationAssessment / SecurityRecommendation | Advanced Hunting only β Defender TVM tables do NOT exist in Sentinel Data Lake. DeviceName is stored as FQDN (e.g., myserver.contoso.com), NOT short hostname. Using DeviceName =~ 'hostname' returns 0 results. Timestamp column pitfall: DeviceTvmSoftwareVulnerabilities and DeviceTvmSoftwareInventory are point-in-time snapshot tables with NO Timestamp column β using summarize arg_max(Timestamp, *) or any Timestamp filter returns Failed to resolve scalar expression. DeviceTvmSecureConfigurationAssessment DOES have Timestamp. LLMs assume all TVM tables share the same schema and consistently add Timestamp where it doesn't exist | Always use RunAdvancedHuntingQuery. Per-device filter: Use DeviceName startswith '<hostname>' (matches both short and FQDN). NEVER use =~ with short names. No deduplication needed on DeviceTvmSoftwareVulnerabilities / DeviceTvmSoftwareInventory β each row is already the latest state. For "last seen" or time context, join with DeviceInfo (which has Timestamp). For vulnerability investigations, use the exposure-investigation skill |
| EntraIdSignInEvents | Case-sensitivity pitfall: Capital I in SignIn β EntraIdSigninEvents (lowercase i) fails. FetchAdvancedHuntingTablesDetailedSchema does NOT index this table β use inline getschema. Covers both interactive AND non-interactive sign-ins β default choice over SigninLogs / AADNonInteractiveUserSignInLogs for AH queries (β€30d). SPN sign-ins use EntraIdSpnSignInEvents. Column mapping vs Sentinel tables: ErrorCode (int) vs ResultType (string), AccountUpn vs UserPrincipalName, Application/ApplicationId vs AppDisplayName/AppId, Country/City as direct strings (no parse_json(LocationDetails)), RequestId vs OriginalRequestId. LogonType pitfall: JSON array string (["nonInteractiveUser"]) β use has not ==. RiskLevelDuringSignIn/RiskState are int (use 0/1/10/50/100). ConditionalAccessStatus is int (0=applied, 1=failed, 2=not applied) | AH queries (β€30d): Default to EntraIdSignInEvents. Data Lake / >30d: Fall back to SigninLogs + AADNonInteractiveUserSignInLogs (union, 90+ day retention). Map column names when adapting between the two. MS Learn reference |
| ExposureGraphNodes / ExposureGraphEdges | Advanced Hunting only β Exposure Management graph tables do NOT exist in Sentinel Data Lake | Always use RunAdvancedHuntingQuery. Uses Timestamp. See exposure-investigation skill for verified query patterns |
| GraphAPIAuditEvents | Advanced Hunting only β does NOT exist in Sentinel Data Lake. ApplicationId is string (Entra AppId GUID), but ResponseStatusCode is string β use toint(ResponseStatusCode) for numeric comparisons or == "403" for string matching. Column name mismatches vs MicrosoftGraphActivityLogs (Data Lake): AH uses ApplicationId / AccountObjectId / ServicePrincipalId; Data Lake uses AppId / UserId / ServicePrincipalId. Scopes, Roles, SessionId, UniqueTokenId, DurationMs are Data Lake only. TargetWorkload, EntityType are AH only. OAuthAppInfo join: Use OAuthAppInfo.OAuthAppId (NOT ApplicationId β column doesn't exist on OAuthAppInfo) | Always use RunAdvancedHuntingQuery. For >30d investigations or token/session correlation, use MicrosoftGraphActivityLogs in Data Lake. Map column names when switching platforms. See queries/cloud/graph_api_security_monitoring.md for verified query patterns |
| IdentityAccountInfo | Advanced Hunting only β does NOT exist in Sentinel Data Lake. Table is in Preview β schema may change and many fields are not yet populated (EnrolledMfas, TenantMembershipType, AuthenticationMethod, CriticalityLevel, DefenderRiskLevel). Multiple snapshot records per account; AssignedRoles and GroupMembership are dynamic arrays. SourceProviderRiskLevel values vary by provider (AAD=High/Medium/Low, Okta=HIGH/MEDIUM, SailPoint=HIGH). AccountStatus vocabularies differ across providers (AAD: Enabled/Disabled/Deleted; SailPoint: ACTIVE/NONE/INACTIVE; Okta: STAGED/ACTIVE/DEPROVISIONED; CyberArk: ACTIVE/INVITED/SUSPENDED). IdentityInfo UAC join pitfall: array_index_of(null_dynamic, "value") returns null (not -1). Since null != -1 is true in KQL, querying array_index_of(UserAccountControl, "PasswordNeverExpires") != -1 without first filtering isnotnull(UserAccountControl) incorrectly returns true for ALL null-UAC accounts (~99% of identities), massively inflating PwdNeverExpires counts | Always use RunAdvancedHuntingQuery. Deduplicate with summarize arg_max(Timestamp, *) by AccountId (per-account) or by IdentityId (cross-provider). Parse roles/groups: mv-expand Role = parse_json(AssignedRoles). IdentityId links accounts across providers β one identity can have accounts from multiple sources. For enrichment, join with IdentityInfo on IdentityId (not AccountUpn β avoids 1:many inflation). When using UserAccountControl from IdentityInfo: MUST add where isnotnull(UserAccountControl) BEFORE computing boolean flags with array_index_of. Use identity-posture skill for comprehensive identity posture reports |
| OfficeActivity | Mailbox forwarding/redirect rules live here, NOT in AuditLogs | Filter by OfficeWorkload == "Exchange" and Operation in~ ("New-InboxRule", "Set-InboxRule", "Set-Mailbox", "UpdateInboxRules"). Check Parameters for ForwardTo, RedirectTo, ForwardingSmtpAddress. This table is the primary source for detecting email exfiltration via forwarding rules (MITRE T1114.003 / T1020). |
| OfficeActivity | Parameters and OperationProperties are string fields containing JSON | Use contains or has for keyword matching, then parse_json(Parameters) to extract specific values. Do NOT query AuditLogs for mailbox rule changes β they only appear in OfficeActivity (Exchange workload). |
| OAuthAppInfo | Advanced Hunting only. Key column is OAuthAppId (string, Entra AppId GUID), NOT ApplicationId β column doesn't exist on this table. Multiple snapshot rows per app; Permissions is dynamic. Other key columns: AppName, PrivilegeLevel, AppOrigin (Internal/External), AppStatus, IsAdminConsented, VerifiedPublisher. When cross-referencing with GraphAPIAuditEvents, join on OAuthAppInfo.OAuthAppId == GraphAPIAuditEvents.ApplicationId | Always use RunAdvancedHuntingQuery. Deduplicate with summarize arg_max(Timestamp, *) by OAuthAppId. For app permission audits, use app-registration-posture skill |
| SecurityAlert | Status field is immutable β always "New" regardless of actual state | MUST join with SecurityIncident to get real Status/Classification (see Appendix pattern) |
| SecurityAlert | ProviderName is an internal identifier (e.g., MDATP, ASI Scheduled Alerts, MCAS) and rolls up to generic names like Microsoft XDR at the incident level | Use ProductName for product grouping (e.g., Azure Sentinel, Microsoft Defender Advanced Threat Protection, Microsoft Data Loss Prevention). Also available: ProductComponentName (e.g., Scheduled Alerts, NRT Alerts). Translate raw values to current branding in reports. |
| SecurityIncident | AlertIds contains SystemAlertId GUIDs, NOT usernames, IPs, or entity names | NEVER filter AlertIds by entity name. Instead: query SecurityAlert first filtering by Entities has '<entity>', then join to SecurityIncident on AlertId |
| SecurityIncident | Phantom incidents with empty AlertIds: Many Defender XDR-synced incidents have AlertIds = [] β these never appear in the portal or Graph API and inflate closed incident counts. TimeGenerated > ago(7d) also captures old incidents with recent status updates, further inflating counts | For accurate closed counts: (1) Use CreatedTime (not TimeGenerated) for time-windowed queries, (2) Add | where array_length(AlertIds) > 0 to exclude phantom incidents |
| SecurityIncident / SecurityAlert | IncidentNumber and SystemAlertId are Sentinel-local IDs β Triage MCP uses Defender XDR IDs | Use ProviderIncidentId for Triage MCP lookups. See Sentinel β Defender XDR ID Mapping for full mapping |
| SentinelHealth | SentinelResourceType values use title-case with a space: "Analytics Rule", NOT "Analytic rule". LLMs consistently generate the wrong casing/spelling, returning 0 results despite 30k+ rows in the table | Always use SentinelResourceType == "Analytics Rule" (capital A, capital R, "Analytics" with an 's'). Other valid values: "Data connector", "Automation rule". If query returns 0 rows, check this filter first |
| SigninLogs / AADNonInteractiveUserSignInLogs | DeviceDetail, LocationDetails, ConditionalAccessPolicies, Status may be dynamic OR string depending on workspace (Data Lake workspaces store them as strings). AADNonInteractiveUserSignInLogs stores these as string always | Always use tostring(parse_json(DeviceDetail).operatingSystem) β works for both types. Direct dot-notation DeviceDetail.operatingSystem fails with SemanticError when column is string type. Same applies to Status (use parse_json(Status).errorCode), ConditionalAccessPolicies β use parse_json() before dot-access or mv-expand |
| SigninLogs | Location is a string column, NOT dynamic. Dot-notation like Location.countryOrRegion will fail with SemanticError | Use parse_json(LocationDetails).countryOrRegion for geographic sub-properties. Location works with dcount(), has, isnotempty() but NOT dot-property access |
| Signinlogs_Anomalies_KQL_CL | Custom _CL table names are case-sensitive. Table uses lowercase 'l' in "logs" β Signinlogs NOT SigninLogs. LLMs auto-correct this to match SigninLogs | Always copy exact table name Signinlogs_Anomalies_KQL_CL. If SemanticError: Failed to resolve table, verify casing first. If still fails, table may not exist in the workspace β skip gracefully |
| UnifiedAgentObservability | Sentinel Data Lake system table (Agent 365 / A365 Observability connector). Lake-only β NOT in Advanced Hunting or Triage MCP. Fails with SemanticError: Failed to resolve table under a workspace GUID; search_tables doesn't index it. Uses TimeGenerated. Quick gotchas: ToolName is a top-level column (NOT inside AdditionalFields β AF.ToolName silently returns null); InvokeAgent rows have SrcAgentId = "00000000-..." (zero-GUID); ActorUsername = "N/A" on ExecuteToolBySDK rows. See queries/cloud/agent365_observability.md Query 9 for the full schema reference, cross-source join patterns with CloudAppEvents CopilotInteraction (incl. Messages[].JailbreakDetected PascalCase pitfall), and validated jailbreak β tool-call chain queries. | Pass workspaceId: "default" for single-table queries. For cross-scope joins with workspace tables: pass workspaceId: <workspace-guid> and reference as workspace("default").UnifiedAgentObservability. parse_json() dynamic columns before dot-access or mv-expand. |
| Anomalies | Sentinel UEBA built-in anomaly rule results (distinct from BehaviorInfo/BehaviorAnalytics). Tactics and Techniques are JSON strings, not arrays β must parse_json() before make_set(). AnomalyReasons is a dynamic array of objects with IsAnomalous (bool) and Name fields β filter tobool(reason.IsAnomalous) == true to extract only the anomalous flags. DeviceInsights.ThreatIntelIndicatorType frequently shows BruteForce on corporate/Azure egress IPs (TITAN reputation false positive). UserPrincipalName is populated β use =~ for user-scoped queries (the entity-matching mv-apply on Entities is NOT required). Score 0.0β1.0: β₯0.7 High, 0.3β0.7 Medium, <0.3 Low. Available in both Data Lake and Advanced Hunting | Use UserPrincipalName =~ for user filtering. Always parse_json(Tactics) and parse_json(Techniques) before aggregation. Filter AnomalyReasons with tobool(reason.IsAnomalous) == true. Do NOT confuse with BehaviorInfo (MCAS, AH-only) or BehaviorAnalytics (raw UEBA events, Data Lake-only) β three separate tables |
π‘ CDC Telemetry Escalation Pattern: When standard MDE tables return 0 results for activity you have evidence exists, check whether
DeviceCustom*tables are available. Not all environments have CDC enabled β if the tables don't resolve, document the telemetry gap rather than assuming absence of activity.
Step 3b: Common KQL Anti-Patterns (All Tables)
These universal KQL mistakes are frequent LLM errors regardless of which table is queried:
| Anti-Pattern | Error | Fix |
|---|---|---|
mv-expand on string column containing JSON | expanded expression expected to have dynamic type | mv-expand parsed = parse_json(StringColumn) β parse_json() BEFORE mv-expand |
dcount() on dynamic column | argument #1 cannot be dynamic | dcount(tostring(DynamicColumn)) β cast to scalar |
bin() missing argument | bin(): function expects 2 argument(s) | Always provide both: bin(TimeGenerated, 1h) |
iff() with mismatched branch types | @then data type (real) must match @else (long) | Cast both branches: iff(cond, todouble(x), todouble(y)) |
| Joining on dynamic column | join key 'X' is of a 'dynamic' type | Cast before join: | extend AlertId = tostring(AlertId) | join ... |
Duplicate column in union | column named 'X' already exists | Use project-away or project-rename before union |
prev()/next() on unserialized rowset | Function 'prev' cannot be invoked in current context | Add | serialize before prev(), next(), row_cumsum(), row_number() |
Step 4: Validate Before Execution
- For complex queries: use
validate_kql_queryto check syntax - Ensure datetime filter is the FIRST filter in the query
- Use
takeorsummarizeto limit results
Step 5: Sanity-Check Zero Results
If a query returns 0 results for a commonly-populated table, STOP and verify:
| Check | Action |
|---|---|
| Is the query logic correct? | Review join conditions, filter values, and field types |
| Am I filtering on GUIDs where I used a name (or vice versa)? | Check schema for field content type |
| Is the date range appropriate? | Ensure the time filter covers the expected data window |
| Does the table exist in this data source? | Try the other KQL execution tool if applicable |
β DO NOT report "no results found" until you have verified the query itself is correct. A zero-result query may indicate a bad query, not absence of data.
Step 6: Execute Before Sharing
Any KQL query block presented to the user β inline or in a π¬ Take Action portal handoff β MUST be valid, tested, and confirmed to return results before sharing. The only exception is when 0 results is the intended outcome AND the reasoning is communicated to the user. If a query returns 0 unexpectedly, apply Step 5 sanity-check, fix it, and re-run. Do not paste untested KQL into chat.
π΄ PROHIBITED Actions
| Action | Status |
|---|---|
Calling mcp_sentinel-data_query_lake or RunAdvancedHuntingQuery before doing a Priority 1 (manifest) or Priority 2 (grep) discovery check for the keyword/topic in this turn | β PROHIBITED |
| Treating a "hunt for X" / "search for X" / "look for X" / "find Y" / "do we have Z" request as exempt from Step 1 | β PROHIBITED |
| Writing KQL from scratch without completing Steps 1-2 | β PROHIBITED |
Filtering SecurityIncident.AlertIds by entity names | β PROHIBITED |
Reading SecurityAlert.Status as current investigation status | β PROHIBITED |
| Reporting 0 results without sanity-checking the query logic | β PROHIBITED |
| Sharing an investigative KQL query with the user without executing it first | β PROHIBITED |
Using Timestamp on Sentinel/LA tables in Data Lake queries | β PROHIBITED β use TimeGenerated |
Executing RunAdvancedHuntingQuery when user-requested lookback > 30 days | β PROHIBITED β AH silently truncates to 30d; use mcp_sentinel-data_query_lake instead |
π΄ EVIDENCE-BASED ANALYSIS - GLOBAL RULE
This rule applies to ALL skills, ALL queries, and ALL investigation outputs.
Core Principle
Base ALL findings strictly on data returned by MCP tools. Never invent, assume, or extrapolate data that was not explicitly retrieved.
Required Behaviors
| Scenario | Required Action |
|---|---|
| Query returns 0 results | State explicitly: "β No [anomaly/alert/event type] found in [time range]" |
| Field is null/missing in response | Report as "Unknown" or "Not available" - never fabricate values |
| Partial data available | State what WAS found and what COULD NOT be verified |
| User asks about data not queried | Query first, then answer - never guess based on "typical patterns" |
π΄ PROHIBITED Actions
| Action | Status |
|---|---|
| Inventing IP addresses, usernames, or entity names | β PROHIBITED |
| Assuming counts or statistics not in query results | β PROHIBITED |
| Describing "typical behavior" when no baseline was queried | β PROHIBITED |
| Omitting sections silently when no data exists | β PROHIBITED |
| Using phrases like "likely", "probably", "typically" without evidence | β PROHIBITED |
β REQUIRED Output Patterns
When data IS found:
π Found 47 failed sign-ins from IP 203.0.113.42 between 2026-01-15 and 2026-01-22.
Evidence: SigninLogs query returned 47 records with ResultType=50126.
When NO data is found:
β
No failed sign-ins detected for user@domain.com in the last 7 days.
Query: SigninLogs | where UserPrincipalName =~ 'user@domain.com' | where ResultType != 0
Result: 0 records
When data is PARTIAL:
β οΈ Sign-in data available, but DeviceEvents table not accessible in this workspace.
Verified: 12 successful authentications from 3 IPs
Unable to verify: Endpoint process activity (table not found)
Risk Assessment Grounding
When assigning risk levels, cite the specific evidence:
| Risk Level | Evidence Required |
|---|---|
| High | Must cite β₯2 concrete findings (e.g., "AbuseIPDB score 95 + 47 failed logins in 1 hour") |
| Medium | Must cite β₯1 concrete finding with context (e.g., "New IP not in 90-day baseline") |
| Low | Must explain why low despite investigation (e.g., "IP is known corporate VPN egress") |
| Informational | Must still cite what was checked: "No alerts, no anomalies, no risky sign-ins found" |
Emoji Formatting for Investigation Output
Use color-coded emojis consistently throughout investigation reports to make risks, mitigating factors, and status immediately scannable:
| Category | Emoji | When to Use |
|---|---|---|
| High risk / critical finding | π΄ | High-severity alerts, confirmed compromise, high abuse scores, active threats |
| Medium risk / warning | π | Medium-severity detections, unresolved risk states, suspicious but unconfirmed activity |
| Low risk / minor concern | π‘ | Low-severity detections, informational anomalies, items needing review but not urgent |
| Mitigating factor / positive | π’ | MFA enforced, phishing-resistant auth, clean threat intel, risk remediated/dismissed |
| Informational / neutral | π΅ | Contextual notes, baseline data, configuration details, reference information |
| Absence confirmed / clean | β | No alerts found, no anomalies, clean query results, verified safe |
| Needs attention / action item | β οΈ | Unresolved risks, report-only policies, recommendations requiring human decision |
| Data not available | β | Table not accessible, partial data, unable to verify |
Example usage in summary tables:
| Factor | Finding |
|--------|---------|
| π’ **Auth Method** | Phishing-resistant passkey (device-bound) β strong credential |
| π **IP Reputation** | VPN exit node with 14 abuse reports (low confidence 5%) |
| π΄ **Unresolved Risk** | `unfamiliarFeatures` detection still atRisk β needs admin action |
| β οΈ **CA Policy Gap** | "Require MFA for risky sign-ins" is report-only, not enforcing |
| β
**MFA Enforcement** | MFA required and passed on 16/18 sign-ins |
Apply these emojis in:
- Summary assessment tables (prefix the factor name)
- Section headers when results indicate clear risk or clean status
- Inline findings where risk/mitigation context helps readability
- Recommendation items (prefix with β οΈ for action items, π’ for confirmations)
Explicit Absence Confirmation
After every investigation section, confirm what was checked even if nothing was found:
## Security Alerts
β
No security alerts involving user@domain.com in the last 30 days.
- Checked: SecurityAlert table (0 matches)
- Checked: SecurityIncident for associated entities (0 matches)
Technical Context Enrichment
When explaining technical concepts, use Microsoft Learn MCP to ground responses in official documentation:
| When to Use | Example |
|---|---|
| Explaining error codes | Search for "SigninLogs ResultType 50126" to get official meaning |
| Describing attack techniques | Search for "AiTM phishing" or "token theft" for official remediation guidance |
| Clarifying Azure/M365 features | Search for "Conditional Access device compliance" for accurate configuration details |
| Interpreting log fields | Search for table schema documentation when field meaning is unclear |
Workflow:
microsoft_docs_searchβ Find relevant articlesmicrosoft_docs_fetchβ Get complete details when needed- Cite the source in your response (include URL when providing technical guidance)
π΄ REMEDIATION OUTPUT POLICY - GLOBAL RULE
Applies to ALL skills and investigation outputs.
Never generate executable commands that change tenant, mailbox, user, device, or resource state. Route the admin through audited UI paths instead.
β Allowed
- Portal deep links with navigation steps (Defender XDR, Entra, EAC, Purview, Azure Portal)
- Natural-language instructions describing what the admin should do
- Read-only verification KQL (labeled as such) and read-only Graph
GETcalls
β Prohibited
- State-changing PowerShell (
Remove-*,Set-*,New-*,Disable-*,Revoke-*) azCLI write operations (create,set,update,delete)- Graph API write calls (
Invoke-MgGraphRequest -Method PATCH/POST/PUT/DELETE,curl -X POST, etc.) - Any snippet the admin could paste to mutate state β even labeled "for reference" or "optional"
Exceptions
- Skill-defined actions β if a skill's SKILL.md explicitly specifies state-changing commands as part of its workflow (e.g.,
detection-authoring), those are allowed within that skill's scope. - User explicitly requests a command β confirm the ask, then generate with
-WhatIf/ dry-run by default and flag the destructive operation.
Available Skills
BEFORE starting any investigation, detect if user request matches a specialized skill:
| Category | Skill | Description | Trigger Keywords |
|---|---|---|---|
| π Core | computer-investigation | Device security analysis (alerts, compliance, vulnerabilities, process/network/file events) | "investigate computer", "investigate device", "investigate endpoint", "check machine", hostname |
| π Core | honeypot-investigation | Honeypot attack analysis with threat intel and executive reports | "honeypot", "attack analysis", "threat actor" |
| π Core | incident-investigation | Defender XDR / Sentinel incident triage with recursive entity investigation | "investigate incident", "incident ID", "analyze incident", "triage incident", incident number |
| π Core | ioc-investigation | IoC analysis for IPs, domains, URLs, file hashes with TI enrichment | "investigate IP", "investigate domain", "investigate URL", "investigate hash", "IoC", "is this malicious", "threat intel", IP/domain/URL/hash |
| π Core | user-investigation | Entra ID user security analysis (sign-ins, MFA, anomalies, incidents, Identity Protection) | "investigate user", "security investigation", "check user activity", UPN/email |
| π Auth | authentication-tracing | Authentication chain forensics (SessionId, token reuse, geographic anomalies) | "trace authentication", "SessionId analysis", "token reuse", "geographic anomaly", "impossible travel" |
| π Auth | ca-policy-investigation | Conditional Access policy forensics and bypass detection | "Conditional Access", "CA policy", "device compliance", "policy bypass", "53000", "50074", "530032" |
| π Behavioral | scope-drift-detection/device | Device process baseline drift analysis with weighted Drift Score | "device drift", "device process drift", "endpoint drift", "process baseline", "device behavioral change", "device scope drift" |
| π Behavioral | scope-drift-detection/spn | SPN behavioral drift (90d baseline vs 7d recent) with weighted Drift Score | "scope drift", "service principal drift", "SPN behavioral change", "SPN drift", "baseline deviation", "access expansion", "automation account drift" |
| π Behavioral | scope-drift-detection/user | User sign-in drift (Interactive + Non-Interactive Drift Scores) | "user drift", "user behavioral change", "user scope drift", "UPN drift", "sign-in drift", "user baseline deviation" |
| π‘οΈ Posture | exposure-investigation | Vulnerability & Exposure Management (CVEs, configs, attack paths, critical assets) | "vulnerability report", "exposure report", "CVE assessment", "security posture", "TVM", "attack paths", "critical assets" |
| π‘οΈ Posture | ai-agent-posture | AI agent security audit (Copilot Studio, auth gaps, MCP tools, XPIA risk) | "AI agent posture", "agent security audit", "Copilot Studio agents", "agent inventory", "unauthenticated agents", "XPIA risk", "agent sprawl" |
| π‘οΈ Posture | app-registration-posture | App registration posture (permissions, ownership, credentials, KQL attack chains) | "app registration posture", "service principal permissions", "dangerous app permissions", "app credential abuse", "SPN lateral movement", "app consent grant" |
| π‘οΈ Posture | email-threat-posture | MDO email threat posture (phishing, DMARC/DKIM/SPF, ZAP, Safe Links) | "email threat report", "email security posture", "phishing report", "MDO report", "ZAP effectiveness", "DMARC report" |
| π Data | data-security-analysis | DataSecurityEvents analysis (SIT access, sensitivity labels, DLP, Copilot exposure) | "data security", "sensitive information type", "SIT access", "DLP events", "DataSecurityEvents", "sensitivity label", "label downgrade", "Copilot label exposure" |
| π‘οΈ Posture | identity-posture | Identity posture via IdentityAccountInfo (multi-provider, privileged accounts, hygiene) | "identity posture", "identity report", "account inventory", "privileged accounts", "stale accounts", "identity hygiene", "IdentityAccountInfo" |
| π Viz | geomap-visualization | Interactive world map for attack origins and IP geolocation | "geomap", "world map", "attack map", "show on map", "attack origins" |
| π Viz | heatmap-visualization | Interactive heatmap for time-based activity patterns | "heatmap", "show heatmap", "visualize patterns", "activity grid" |
| π Viz | svg-dashboard | SVG dashboards (KPI cards, charts, tables) from reports or ad-hoc data | "generate SVG dashboard", "create a visual dashboard", "visualize this report", "SVG from the report", "create SVG chart" |
| π Scan | threat-pulse | 15-min broad security scan across 7 domains with prioritized drill-down recommendations | "threat pulse", "quick scan", "security pulse", "morning hunt", "what should I focus on", "what can you do", "where do I start", "what's going on" |
| π§ Tooling | detection-authoring | Create/deploy/manage Defender XDR custom detection rules via Graph API | "create custom detection", "deploy detection", "detection rule", "custom detection", "deploy rule", "batch deploy" |
| π§ Tooling | kql-query-authoring | KQL query creation with schema validation and community examples | "write KQL", "create KQL query", "help with KQL", "query [table]" |
| π§ Tooling | mcp-usage-monitoring | MCP server usage audit (Graph/Sentinel/Azure MCP telemetry analysis) | "MCP usage", "MCP server monitoring", "MCP activity", "MCP audit", "who is using MCP" |
| π§ Tooling | mitre-coverage-report | MITRE ATT&CK coverage analysis (rule mapping, gaps, SOC Optimization alignment) | "MITRE coverage", "MITRE report", "ATT&CK coverage", "technique coverage", "coverage gaps", "MITRE score" |
| π§ Tooling | sentinel-ingestion-report | Sentinel ingestion analysis (volume, tiers, anomalies, rule health, cost optimization) | "ingestion report", "usage report", "data volume", "cost analysis", "table breakdown", "ingestion anomaly" |
Skill Detection Workflow
- Parse user request for trigger keywords from table above
- Getting started / exploratory requests: If the user asks "what can you do?", "where do I start?", "help me investigate", "how do I use this", "show me what you can do", "what's going on?", or any open-ended orientation question β recommend and offer to run the
threat-pulseskill as the starting point. Briefly explain it runs a 15-minute broad scan across 7 security domains and produces a prioritized dashboard with drill-down recommendations to specialized skills. Ask if they'd like to run it. - If match found: Read the skill file:
- Standard skills:
.github/skills/<skill-name>/SKILL.md - Subfolder skills (e.g., scope-drift-detection):
.github/skills/<parent-skill>/<sub-skill>/SKILL.md
- Standard skills:
- Follow skill-specific workflow (inherits global rules from this file)
- Future skills: Check
.github/skills/folder withlist_dirto discover new workflows
Skill files location: .github/skills/<skill-name>/SKILL.md or .github/skills/<parent-skill>/<sub-skill>/SKILL.md
Integration with MCP Servers
The investigation system integrates with these MCP servers (which Copilot has access to):
Microsoft Sentinel Data Lake MCP
Execute KQL queries and explore table schemas directly against your Sentinel workspace:
- mcp_sentinel-data_query_lake: Execute read-only KQL queries on Sentinel data lake tables. Best practices: filter on datetime first, use
takeorsummarizeoperators to limit results, prefer narrowly scoped queries with explicit filters - mcp_sentinel-data_search_tables: Discover table schemas using natural language queries. Returns table definitions to support query authoring
- mcp_sentinel-data_list_sentinel_workspaces: List all available Sentinel workspace name/ID pairs
- π΄
workspaceIdscope selection:- Single-table queries on lake-only system tables (
UnifiedAgentObservabilityfrom the Agent 365 / A365 Observability connector, etc.) β passworkspaceId: "default". Under a workspace GUID these returnSemanticError: Failed to resolve tableandsearch_tablesdoes not index them. Also not available via Advanced Hunting or Triage MCP. If a documented lake table fails to resolve, retry once with"default"before reporting it missing. - Cross-scope joins (system table β workspace table like
CloudAppEvents,IdentityInfo,_KQL_CL) β passworkspaceId: <workspace-guid>and reach into the lake viaworkspace("default").<Table>. The reverse (workspaceId: "default"+workspace("<guid>").<Table>) returnsWorkspaceNotAvailable. The MCP param accepts one ID only β comma/semicolon/space-separated lists returnKusto database name not found.
- Single-table queries on lake-only system tables (
- Documentation: https://learn.microsoft.com/en-us/azure/sentinel/datalake/
Microsoft Sentinel Triage MCP
Incident investigation and threat hunting tools for Defender XDR and Sentinel:
- Incident Management: List/get incidents (
ListIncidents,GetIncidentById), list/get alerts (ListAlerts,GetAlertByID)- β οΈ
ListAlertslimitation: This tool has NOincidentIdparameter. It only supportscreatedAfter,createdBefore,severity,status,skip,top. Calling it returns all tenant alerts up to the page size β any unsupported parameter is silently ignored. To get alerts for a specific incident, useGetIncidentByIdwithincludeAlertsData=true, or queryAlertInfo/AlertEvidenceviaRunAdvancedHuntingQuerywith entity-based filtering.
- β οΈ
- Advanced Hunting: Run KQL queries across Defender XDR tables and connected Log Analytics workspace tables (
RunAdvancedHuntingQuery), fetch table schemas (FetchAdvancedHuntingTablesOverview,FetchAdvancedHuntingTablesDetailedSchema)- β οΈ Parameter name: Use
kqlQuery, NOTquery(see Troubleshooting Guide).
- β οΈ Parameter name: Use
- Entity Investigation: File info/stats/alerts (
GetDefenderFileInfo,GetDefenderFileStatistics,GetDefenderFileAlerts), device details (GetDefenderMachine,GetDefenderMachineAlerts,GetDefenderMachineLoggedOnUsers), IP analysis (GetDefenderIpAlerts,GetDefenderIpStatistics), user activity (ListUserRelatedAlerts,ListUserRelatedMachines) - Vulnerability Management: List affected devices (
ListDefenderMachinesByVulnerability), software vulnerabilities (ListDefenderVulnerabilitiesBySoftware) - Remediation: List/get remediation tasks (
ListDefenderRemediationActivities,GetDefenderRemediationActivity) - When to Use: Incident triage, threat hunting over your own Defender/Sentinel data, correlating alerts/entities during investigations
- Documentation: https://learn.microsoft.com/en-us/azure/sentinel/datalake/sentinel-mcp-triage-tool
π Sentinel β Defender XDR ID Mapping β GLOBAL RULE
The Sentinel Triage MCP (GetIncidentById, GetAlertById, ListAlerts) uses Defender XDR IDs, NOT Sentinel table IDs. Passing Sentinel IDs to these tools returns "not found" errors.
| Sentinel Table Field | What It Is | Triage MCP Equivalent | How to Map |
|---|---|---|---|
SecurityIncident.IncidentNumber | Sentinel-assigned sequential number | β Not accepted by GetIncidentById | Use SecurityIncident.ProviderIncidentId instead β this is the Defender XDR incident ID |
SecurityIncident.ProviderIncidentId | Defender XDR incident ID | β
Pass this to GetIncidentById | Direct β no mapping needed |
SecurityAlert.SystemAlertId | Sentinel-assigned alert GUID | β Not accepted by GetAlertById | Extract IncidentId from SecurityAlert.ExtendedProperties for the Defender XDR ID |
When you discover incidents/alerts via Sentinel KQL (SecurityIncident, SecurityAlert tables) and need to drill down via Triage MCP:
- For incidents: Always
project ProviderIncidentIdin your Sentinel query and pass that value toGetIncidentById - For alerts: Extract the Defender ID from
ExtendedProperties:tostring(parse_json(ExtendedProperties).IncidentId)β or query the incident viaProviderIncidentIdfirst - Never pass
IncidentNumberorSystemAlertIdto Triage MCP tools
| Action | Status |
|---|---|
Passing SecurityIncident.IncidentNumber to GetIncidentById | β PROHIBITED |
Passing SecurityAlert.SystemAlertId to GetAlertById | β PROHIBITED |
Using ProviderIncidentId from SecurityIncident for Triage MCP calls | β REQUIRED |
Extracting Defender ID from ExtendedProperties.IncidentId for alert drill-down | β REQUIRED |
π SecurityIncident Query & Output Standards β GLOBAL RULE
These rules apply to ALL SecurityIncident queries, not just Triage MCP interactions.
Every SecurityIncident query MUST include ProviderIncidentId in the output and every incident presented to the user MUST include a clickable Defender XDR portal URL: https://security.microsoft.com/incidents/{ProviderIncidentId}?tid=<tenant_id> (read tenant_id from config.json; omit ?tid= if not configured).
| Action | Status |
|---|---|
Querying SecurityIncident without projecting ProviderIncidentId | β PROHIBITED |
| Presenting incidents to user without Defender XDR portal URL | β PROHIBITED |
Using IncidentNumber as the primary identifier in output | β PROHIBITED |
Including clickable https://security.microsoft.com/incidents/{ProviderIncidentId}?tid=<tenant_id> link | β REQUIRED |
π Tenant ID in Portal URLs β GLOBAL RULE
ALL security.microsoft.com URLs generated in output MUST include the tid query parameter for reliable cross-tenant deep linking. Read tenant_id from config.json.
| URL has existing query params? | Append |
|---|---|
| No query string | ?tid=<tenant_id> |
Has ? already | &tid=<tenant_id> |
If tenant_id is not configured (missing, empty, or placeholder YOUR_*): omit tid entirely.
This applies to: incident links, entity links (user, domain, IP, device, file hash), and AH portal links (https://security.microsoft.com/v2/advanced-hunting?tid=<tenant_id> β plain link, no encoded query). KQL strcat() patterns must substitute the tenant_id value at query time.
π΄ URL Hallucination β GLOBAL RULE
Only output a portal URL if it is documented in the active skill, a queries/ file, or this file β or built from such a template by substituting query-result IDs. Otherwise use a plain-text breadcrumb (e.g., Defender XDR β Settings β Indicators). Never construct portal URLs from memory.
π§ Tool Selection Rule: Data Lake vs Advanced Hunting
See Step 0 of the KQL pre-flight checklist for the lookback-based decision and timestamp adaptation. This section covers the remaining differences.
Key facts:
- The LA workspace is connected to the unified Defender portal. Advanced Hunting can query all tables in the workspace β XDR-native tables (Device*, Email*, etc.), Sentinel-native tables (SigninLogs, AuditLogs, LAQueryLogs, etc.), and custom tables (
*_CL). It is NOT limited to Defender XDR data only. - Custom Detection eligibility:
_CLtables are fully supported for Custom Detection rules, including NRT frequency. Examples:ABAPAuditLog_CL,Okta_CL,ProofPointTAPClicksPermitted_CL. See the detection-authoring skill for the complete NRT-supported table list. - ASIM parser functions (
_Im_NetworkSession,_Im_WebSession,_Im_Dns,_Im_ProcessEvent, etc.) and other workspace-level functions are fully supported in Advanced Hunting β they resolve against the connected LA workspace.mcp_sentinel-data_query_lakecannot resolve workspace-level functions and returnsUnknown functionerrors for_Im_*calls. UseRunAdvancedHuntingQueryfor ASIM parser queries.
| Factor | RunAdvancedHuntingQuery (Advanced Hunting) | mcp_sentinel-data_query_lake (Sentinel Data Lake) |
|---|---|---|
| Cost | Free for Analytics-tier tables (Defender license). Auxiliary/Basic-tier tables still incur query costs even when queried via AH. | Billed per query (Log Analytics costs) |
| Retention | 30 days (Graph API cap β silently truncates). | 90+ days (workspace-configured) |
| Safety filter | MCP-level filter may block queries with offensive-security keywords | No additional filter beyond KQL validation |
| Negation syntax | !has_any / !in~ may fail in let blocks β use not() wrappers | Standard KQL negation operators work reliably |
| Workspace functions | Supports ASIM parsers and workspace-level functions | Cannot resolve workspace-level functions |
Fallback triggers (switch AH β Data Lake):
- Lookback > 30 days (see Step 0)
- Query blocked by AH safety filter
- AH returns "table not found" (legacy tables, some custom tables)
Skill File Override Rule
When executing a skill workflow (from .github/skills/), the skill's tool specifications take precedence over the ad-hoc rule. Skills may choose a specific tool deliberately for retention requirements, safety-filter avoidance, or tested compatibility.
KQL Search MCP
GitHub-powered KQL query discovery and schema intelligence (331+ tables from Defender XDR, Sentinel, Azure Monitor):
- GitHub Query Discovery: Search GitHub for KQL examples (
search_github_examples_fallbackβ ), find repos (search_kql_repositoriesβ ), extract from files (get_kql_from_fileβ ). Note:search_favorite_reposhas a known bug (v1.0.5) - usesearch_github_examples_fallbackinstead. - Schema Intelligence: Get table schemas (
get_table_schema), search tables by description (search_tables), find columns (find_column), list categories (list_table_categories) - Query Generation & Validation: Generate validated KQL queries from natural language (
generate_kql_query), validate existing queries (validate_kql_query), get Microsoft Learn docs (get_query_documentation) - ASIM Schema Support: Search/validate/generate queries for 11 ASIM schemas (
search_asim_schemas,get_asim_schema_info,validate_asim_parser,generate_asim_query_template) - When to Use: Writing new KQL queries, finding query examples from community repos (Azure-Sentinel, Microsoft-365-Defender-Hunting-Queries), validating query syntax before execution, understanding table schemas
- Documentation: https://www.npmjs.com/package/kql-search-mcp
Microsoft Learn MCP
Official Microsoft/Azure documentation search and code samples:
- microsoft_docs_search: Semantic search across Microsoft Learn documentation (returns up to 10 high-quality content chunks with title, URL, excerpt)
- microsoft_docs_fetch: Fetch complete Microsoft Learn pages in markdown format (use after search when you need full tutorials, troubleshooting guides, or complete documentation)
- microsoft_code_sample_search: Search official Microsoft/Azure code samples (up to 20 relevant code snippets with optional
languagefilter: csharp, javascript, typescript, python, powershell, azurecli, sql, java, kusto, etc.) - When to Use: Grounding answers in official Microsoft knowledge, finding latest Azure/Microsoft 365/Security documentation, getting official code examples for Microsoft technologies, verifying API usage patterns
- Workflow: Use
microsoft_docs_searchfirst for breadth βmicrosoft_code_sample_searchfor practical examples βmicrosoft_docs_fetchfor depth when needed - Documentation: https://learn.microsoft.com/en-us/training/support/mcp-get-started
Microsoft Graph MCP
Azure AD and Microsoft 365 API integration:
- mcp_microsoft_mcp_microsoft_graph_suggest_queries: Find Graph API endpoints using natural language intent descriptions
- mcp_microsoft_mcp_microsoft_graph_get: Execute Graph API calls (MUST call suggest_queries first to get correct endpoints)
- mcp_microsoft_mcp_microsoft_graph_list_properties: Explore entity schemas when RAG examples are insufficient
- Critical Workflow: ALWAYS call
suggest_queriesbeforeget- never construct URLs from memory. Resolve template variables before making final API calls - Documentation: Built-in Graph MCP integration
Sentinel Heatmap MCP (Custom Visualization)
Interactive heatmap visualization for Sentinel security data, rendered inline in VS Code chat:
- mcp_sentinel-heat_show-signin-heatmap: Display aggregated data as an interactive heatmap with optional threat intel drill-down
- Location:
mcp-apps/sentinel-heatmap-server/(local TypeScript/React MCP App) - Data format: Array of
{row, column, value}objects. Optional:title,rowLabel,colLabel,valueLabel,colorScale(green-red/blue-red/blue-yellow),enrichment(IP threat intel for drill-down panels) - When to Use: Visualizing attack patterns by IP and time, sign-in activity by app/hour, failed auth by location/day
- See:
heatmap-visualizationskill for full usage guidance, KQL query patterns, and enrichment schema
Azure MCP Server
Direct Azure Resource Manager and Azure Monitor integration for quick ad-hoc queries:
mcp_azure-mcp-ser_monitorβmonitor_workspace_log_query: Execute KQL queries directly against Log Analytics workspace via Azure Monitor API. Same data as Sentinel Data Lake, but through the ARM path β useful for fast ad-hoc queries within the 90-day retention window.mcp_azure-mcp-ser_monitorβmonitor_activitylog_list: Get Azure Activity Logs for specific resources (deployments, modifications, access patterns)mcp_azure-mcp-ser_group_list: List resource groups in a subscriptionmcp_azure-mcp-ser_subscription_list: List subscriptions
Required parameters: Read tenant, subscription, resource-group, and workspace from config.json (azure_mcp section). See Environment Configuration for field mapping.
When to use Azure MCP Server workspace_log_query vs Sentinel Data Lake query_lake:
| Factor | Azure MCP workspace_log_query | Sentinel Data Lake query_lake |
|---|---|---|
| Speed | Faster for ad-hoc (direct ARM call) | 5-15 min ingestion lag for very recent data |
| Auth | DefaultAzureCredential (VS Code cached) | Sentinel Platform Services OAuth |
| Params | Needs resource-group + workspace name + table | Needs workspaceId (GUID) |
| Retention | 90 days | 90 days (same workspace) |
| Best for | Quick lookups, AzureActivity, ad-hoc exploration | Skill-based investigation workflows |
Azure MCP telemetry detection: See
mcp-usage-monitoringskill (Queries 25-27) for Azure MCP Server fingerprinting and usage analysis.
Sentinel Exposure Graph MCP
Attack surface analysis tools for the Microsoft Security Exposure Management graph. More effective than raw KQL for per-asset attack path scenarios β use these first, fall back to KQL for fleet-wide sweeps.
β οΈ Preview: The Sentinel Exposure Graph MCP server is in preview and may not be available in all environments. If the tools are not present (tool calls fail or are not listed), fall back to KQL queries against
ExposureGraphNodes/ExposureGraphEdgesin Advanced Hunting. Seequeries/cloud/exposure_graph_attack_paths.mdfor equivalent KQL patterns.
mcp_sentinel-grap_graph_find_blastradius: All downstream targets reachable from a source asset. Params:sourceNamemcp_sentinel-grap_graph_exposure_perimeter: Inbound perimeter β externally-reachable nodes with walkable paths TO a target. Params:targetName- Known limitation: May return empty for assets that ARE network-reachable but lack formal ExposureGraph perimeter classification. Fall back to KQL edge analysis with
EdgeLabel == "routes traffic to"when empty.
- Known limitation: May return empty for assets that ARE network-reachable but lack formal ExposureGraph perimeter classification. Fall back to KQL edge analysis with
mcp_sentinel-grap_graph_find_walkable_paths: Full path between source and target with RBAC role detail,isOverProvisionedandisIdentityInactiveflags. Params:sourceName,targetNamemcp_sentinel-grap_graph_find_connected_nodes: All nodes of a specific type within N hops. Params:sourceName,sourceNodeLabel,targetNodeLabel,maxHops(1β3 recommended; higher = very large results)mcp_sentinel-grap_graph_get_context: Full graph schema (node/edge types). Params:GraphName(alwaysSystemScenarioEKGGraph)
Workflow: blast radius β exposure perimeter β walkable paths for specific targets β connected nodes by type β KQL for fleet-wide analysis
- When to Use: Investigating compromised assets, mapping blast radius after incidents, validating attack paths, assessing critical asset exposure, identifying over-provisioned identities along permission chains
- When to Use KQL Instead: Fleet-wide sweeps, cookie chain analysis across all devices, choke point detection, permission role distribution across all paths, custom multi-join aggregations
- Full documentation: See
queries/cloud/exposure_graph_attack_paths.mdfor detailed tool docs, parameters, examples, and 32 KQL queries
π Resource Discovery β Cross-Subscription Lookup Pattern
config.json only contains the primary Sentinel workspace subscription. Resources investigated via Defender XDR (DeviceInfo, ExposureGraphNodes, alerts) often reside in different subscriptions. When you need to look up an Azure resource (VM, NSG, NIC, etc.) discovered through investigation queries:
- Try
config.jsonsubscription first βaz vm list --query "[?contains(name, '<name>')]" --subscription "<config.json sub>" - If not found β
az account listto enumerate all subscriptions - Search each subscription until found
- Use the discovered subscription + resource group for all subsequent ARM calls (NSG, NIC, subnet, etc.)
Why this matters: The Defender XDR unified portal aggregates devices across ALL connected subscriptions. A device flagged in ExposureGraphNodes or DeviceInfo may live in any subscription β the config.json subscription is only guaranteed to contain the Log Analytics workspace. Always specify --subscription explicitly in Azure CLI calls to avoid defaulting to the wrong subscription context.
Custom Sentinel Tables
Signinlogs_Anomalies_KQL_CL
Purpose: Pre-computed sign-in anomaly detection table populated by hourly KQL job. Tracks new IPs and device combinations against 90-day baseline.
- Anomaly Types:
NewInteractiveIP,NewInteractiveDeviceCombo,NewNonInteractiveIP,NewNonInteractiveDeviceCombo - Detection Model: Compares last 1 hour activity against 90-day baseline; severity scored by artifact hit frequency + geographic novelty (
CountryNovelty,CityNovelty,StateNovelty) - Key Columns:
DetectedDateTime,UserPrincipalName,AnomalyType,Value,Severity,ArtifactHits,BaselineSize, geographic novelty flags,Baseline*Listarrays - When to Use: Rapid anomaly triage during user investigations, impossible travel detection, token theft indicators (non-interactive anomalies with geo changes)
Full Documentation: See docs/Signinlogs_Anomalies_KQL_CL.md for complete schema, example queries, and severity thresholds.
APPENDIX: Ad-Hoc Query Examples
SecurityAlert.Status Is Immutable - Always Join SecurityIncident
β οΈ CRITICAL: The Status field on the SecurityAlert table is set to "New" at creation time and never changes. It does NOT reflect whether the alert has been investigated, closed, or classified.
To get the actual investigation status, you MUST join with SecurityIncident:
let relevantAlerts = SecurityAlert
| where TimeGenerated between (start .. end)
| where Entities has '<ENTITY>'
| summarize arg_max(TimeGenerated, *) by SystemAlertId
| project SystemAlertId, AlertName, AlertSeverity, ProviderName, Tactics;
SecurityIncident
| where CreatedTime between (start .. end)
| summarize arg_max(TimeGenerated, *) by IncidentNumber
| mv-expand AlertId = AlertIds
| extend AlertId = tostring(AlertId)
| join kind=inner relevantAlerts on $left.AlertId == $right.SystemAlertId
| summarize Title = any(Title), Severity = any(Severity), Status = any(Status),
Classification = any(Classification), CreatedTime = any(CreatedTime)
by ProviderIncidentId
| extend PortalUrl = strcat("https://security.microsoft.com/incidents/", ProviderIncidentId, "?tid=<TENANT_ID>")
| order by CreatedTime desc
Output rule: When presenting these results to the user, always render
PortalUrlas a clickable markdown link:[#{ProviderIncidentId}]({PortalUrl}). See SecurityIncident Query & Output Standards.
| Field | Source | Meaning |
|---|---|---|
SecurityAlert.Status | Alert table | Immutable creation status - always "New" |
SecurityIncident.Status | Incident table | Real status - New/Active/Closed |
SecurityIncident.Classification | Incident table | Closure reason - TruePositive/FalsePositive/BenignPositive |
Reference: See queries/incidents/security_incident_analysis.md for the canonical SecurityAlertβSecurityIncident join pattern.
Queries Library β Standardized Format (queries/)
All query files in queries/ MUST use this standardized metadata header for efficient grep_search discovery:
Folder structure: Query files are organized into subfolders by data domain:
| Subfolder | Domain | Examples |
|---|---|---|
queries/identity/ | Entra ID / Azure AD | app_credential_management.md, service_principal_scope_drift.md |
queries/endpoint/ | Defender for Endpoint | rare_process_chains.md, infostealer_hunting_campaign.md |
queries/email/ | Defender for Office 365 | email_threat_detection.md |
queries/network/ | Network telemetry | network_anomaly_detection.md |
queries/cloud/ | Cloud apps & exposure | cloudappevents_exploration.md, exposure_graph_attack_paths.md |
File naming convention: {topic}.md β lowercase, underscores, no redundant suffixes like _queries or _sentinel. Keep names short and descriptive of the detection scenario or data domain. Place new files in the subfolder matching their primary data source table.
# <Title>
**Created:** YYYY-MM-DD
**Platform:** Microsoft Sentinel | Microsoft Defender XDR | Both
**Tables:** <comma-separated list of exact KQL table names>
**Keywords:** <comma-separated searchable terms β attack techniques, scenarios, field names>
**MITRE:** <comma-separated technique IDs, e.g., T1021.001, TA0008>
**Domains:** <comma-separated threat-pulse domain tags β see Discovery Manifest below>
**Timeframe:** Last N days (configurable)
Required fields for search efficiency:
| Field | Purpose | Example |
|---|---|---|
Tables: | Exact KQL table names for grep_search by table | AuditLogs, SecurityAlert, SecurityIncident |
Keywords: | Searchable terms covering attack scenarios, operations, field names | credential, secret, certificate, persistence, app registration |
MITRE: | ATT&CK technique and tactic IDs | T1098.001, T1136.003, TA0003 |
Domains: | Threat-pulse domain tags for manifest-based cross-referencing | identity, email |
Valid domain tags: incidents, identity, spn, endpoint, email, admin, cloud, exposure
Search pattern: grep_search scoped to queries/** with the table name or keyword will hit the metadata header and locate the right file instantly.
When creating new query files: Follow this format. When updating existing files that lack these fields, add them.
Discovery Manifest (.github/manifests/)
The discovery manifest indexes all query files and skills with their domain tags, enabling deterministic cross-referencing by threat-pulse and other skills.
Two variants are generated:
discovery-manifest.yaml(default) β title, path, domains, mitre, prompt only. ~500 lines. Threat-pulse loads this one to minimize context consumption.discovery-manifest-full.yaml(verbose,--fullflag) β all fields (tables, keywords, mitre, domains, platform, timeframe). ~1300 lines.
How it works:
- Query files declare
**Domains:**in their metadata header - Skills declare
threat_pulse_domains:anddrill_down_prompt:in their YAML frontmatter python .github/manifests/build_manifest.pyscans everything and emits both manifests to.github/manifests/- The validator flags missing fields β missing
Domains:on a query file is an error
When to regenerate: Run python .github/manifests/build_manifest.py after:
- Creating or renaming a query file or skill
- Changing
Domains:,threat_pulse_domains:, ordrill_down_prompt:values - Adding new domain tags (update
VALID_DOMAINSinbuild_manifest.pyfirst)
When to regenerate TOCs: Run python scripts/generate_tocs.py after creating or updating a query file. The script auto-generates a ## Quick Reference β Query Index table with clickable anchor links for every query heading that has a KQL code block. It is idempotent β strips and regenerates existing TOCs on re-run.
| Action | Status |
|---|---|
Creating a query file without **Domains:** | β PROHIBITED |
Creating an investigation skill without threat_pulse_domains: | β PROHIBITED |
Forgetting to run build_manifest.py after adding files | β PROHIBITED |
Forgetting to run generate_tocs.py after adding/updating query files | β PROHIBITED |
π΄ REQUIRED: cd-metadata blocks for ALL queries in queries/
Every query in a queries/ file MUST include a <!-- cd-metadata --> HTML comment block immediately before the KQL code block β either cd_ready: true with full fields, or cd_ready: false with adaptation_notes explaining why. Read the CD Metadata Contract in .github/skills/detection-authoring/SKILL.md for the full schema, valid field values, and examples.
| Action | Status |
|---|---|
Creating a query file in queries/ without cd-metadata blocks | β PROHIBITED |
PII-Free Standard: All committed documents β query files (queries/), skill files (.github/skills/), and any other versioned documentation β must NEVER contain tenant-specific PII such as real workspace names, UPNs, server hostnames, subscription/tenant GUIDs, or application names from live environments. Use generic placeholders (e.g., <YourAppName>, user@contoso.com, <WorkspaceName>, la-yourworkspace). Before creating or updating any skill or query file, perform a PII sanity check: scan the content for real identifiers that may have been copied from live investigation output or config files, and replace them with placeholders.
IP Enrichment Utility (enrich_ips.py)
Use enrich_ips.py to enrich IP addresses with 3rd-party threat intelligence from ipinfo.io, vpnapi.io, AbuseIPDB, and Shodan. This provides VPN/proxy/Tor detection, ISP/ASN details, hosting provider identification, abuse confidence scores, recent community-reported attack activity, open port enumeration, service/banner detection, known CVEs, and Shodan tags (e.g., honeypot, C2, self-signed).
When to use:
- Whenever the user asks to enrich, investigate, or check IPs
- When risky sign-ins, anomalies, or suspicious activity involve unfamiliar IP addresses
- During ad-hoc investigations, follow-up analysis, or spot-checking suspicious IPs
- Any time IP context would improve the investigation (e.g., confirming VPN usage, checking abuse history)
When NOT to use:
- When already executing a prescriptive skill workflow (from
.github/skills/) that has its own built-in IP enrichment step β follow the skill's guidance instead to avoid duplication
# Enrich specific IPs
python enrich_ips.py 203.0.113.42 198.51.100.10 192.0.2.1
# Enrich all unenriched IPs from an investigation file
python enrich_ips.py --file temp/investigation_user_20251130.json
Output: Detailed per-IP results (city, country, ISP/ASN, VPN/proxy/Tor flags, AbuseIPDB score + recent report comments) and a JSON export saved to temp/.
β οΈ Output Consumption: NEVER read_file the .txt report during long conversations β it triggers VS Code chat freezes from cumulative response size. Parse the .json via PowerShell (ConvertFrom-Json | Format-Table) instead.
AH Portal Links β "Run in Advanced Hunting"
Every AH query in a π¬ Take Action block MUST include both:
- The KQL in a copyable fenced code block (
```kql ... ```) β the analyst copies this to paste into the AH portal - A plain portal link immediately after the code block:
[Run in Advanced Hunting](https://security.microsoft.com/v2/advanced-hunting?tid=<tenant_id>)β opens the AH page scoped to the correct tenant; the analyst pastes the KQL there
Tenant ID: Read tenant_id from config.json and append ?tid=<tenant_id> to the URL. Omit tid entirely if tenant_id is missing or a placeholder.
π΄ DO NOT encode KQL into the URL. The scripts/kql_to_ah_url.py script still exists but is deprecated for use in output β encoded URLs are fragile (encoding bugs, VS Code chat rendering quirks, link-length limits). Always provide the plain portal URL + copyable code block instead.
| Action | Status |
|---|---|
| AH query in Take Action without a copyable KQL code block | β PROHIBITED |
AH query in Take Action without a plain Run in Advanced Hunting portal link | β PROHIBITED |
Generating gzip/base64-encoded AH deep links via kql_to_ah_url.py for output | β PROHIBITED |
Every AH query in Take Action includes BOTH a code block AND a plain ?tid=<tenant_id> portal link | β REQUIRED |
Enumerating User Permissions and Roles
When asked to check permissions or roles for a user account, ALWAYS query BOTH:
- Permanent Role Assignments (active roles)
- PIM-Eligible Roles (roles that can be activated on-demand)
Step 1: Get User Object ID
/v1.0/users/<UPN>?$select=id
Step 2: Get Permanent Role Assignments
/v1.0/roleManagement/directory/roleAssignments?$select=principalId&$filter=principalId eq '<USER_ID>'&$expand=roleDefinition($select=templateId,displayName,description)
Step 3: Get PIM-Eligible Roles
/v1.0/roleManagement/directory/roleEligibilityScheduleInstances?$select=memberType,startDateTime,endDateTime&$filter=principalId eq '<USER_ID>'&$expand=principal($select=id),roleDefinition($select=id,displayName,description)
Step 4: Get Active PIM Role Assignments (time-bounded)
/v1.0/roleManagement/directory/roleAssignmentScheduleInstances?$select=assignmentType,memberType,startDateTime,endDateTime&$filter=principalId eq '<USER_ID>' and startDateTime le <CURRENT_DATETIME> and endDateTime ge <CURRENT_DATETIME>&$expand=principal($select=id),roleDefinition($select=id,displayName,description)
Example Output Format:
Total Role Inventory for <USER>:
Permanent Active Roles (X):
1. Global Administrator
2. Security Administrator
...
PIM-Eligible Roles (Y):
1. Exchange Administrator (Eligible since: <date>, Expiration: <date or β>)
2. Intune Administrator (Eligible since: <date>, Expiration: <date or β>)
...
Active PIM Role Assignments (Z):
1. [Role Name] (Activated: <start>, Expires: <end>, Assignment Type: <type>)
...
Security Analysis Guidance:
- Flag if high-privilege roles (Global Admin, Security Admin, Application Admin) are permanently assigned instead of PIM-eligible
- Recommend converting permanent privileged roles to PIM-eligible with approval workflows
- Note if PIM eligibilities have no expiration (should be reviewed periodically)
Troubleshooting Guide
Common Issues and Solutions
| Issue | Solution |
|---|---|
| Graph API returns 404 for entity | Verify UPN/ID is correct; check if entity exists with different identifier |
| Sentinel query timeout | Reduce date range or add ` |
RunAdvancedHuntingQuery returns "An error occurred invoking" | Wrong parameter name β use kqlQuery, NOT query. This is the #1 silent failure mode for AH calls. |
| KQL syntax error | Validate query with validate_kql_query tool before execution |
| SemanticError: Failed to resolve column | Field doesn't exist in table schema - use get_table_schema to check valid columns |
| SemanticError: Failed to resolve table | Table not in Data Lake - try RunAdvancedHuntingQuery instead |
| Dynamic field errors (DeviceDetail, LocationDetails) | Use tostring() wrapper or parse_json() to extract values |
| Risky sign-ins query fails | Must use /beta endpoint, not /v1.0 |
| Multiple workspaces available | Follow SENTINEL WORKSPACE SELECTION rule - ask user to choose |