Telemetry Events Reference
May 6, 2026 · View on GitHub
This document provides a comprehensive reference of all telemetry events emitted by the Lemon agent runtime.
Overview
Lemon uses Erlang's :telemetry library for observability. Events are emitted at key points in the system to enable monitoring, debugging, and performance analysis.
Introspection Storage Contract (M1)
Telemetry events can be normalized into a canonical introspection envelope via LemonCore.Introspection and persisted through LemonCore.Store.
Canonical fields:
event_id- unique event identifierevent_type- taxonomy event namets_ms- timestamp in millisecondsrun_id,session_key,agent_id,parent_run_id,engineprovenance-:direct | :inferred | :unavailablepayload- redacted event payload
APIs:
LemonCore.Introspection.record/3- build + persist canonical eventsLemonCore.Introspection.list/1- list events with filters (run_id,session_key,agent_id,event_type,since_ms,until_ms,limit)LemonCore.Store.append_introspection_event/1- low-level appendLemonCore.Store.list_introspection_events/1- low-level filtered query
Retention:
LemonCore.Storeapplies periodic retention sweep for:introspection_log(default: 7 days).
Introspection Event Taxonomy
All event atoms passed as the first argument to LemonCore.Introspection.record/3. Grouped by the component that emits them.
Lemon-Native Events (M2) — engine: "lemon", provenance: :direct
RunProcess (LemonRouter.RunProcess)
| Event Type | When |
|---|---|
:run_started | Run process initialised |
:run_completed | Run finished (ok or error) |
:run_failed | Run process terminated abnormally |
RunOrchestrator (LemonRouter.RunOrchestrator)
| Event Type | When |
|---|---|
:orchestration_started | Submit received, run_id generated |
:orchestration_resolved | Engine and model resolved, run process started |
:orchestration_failed | Engine/model resolution failed |
ThreadWorker (LemonGateway.ThreadWorker)
| Event Type | When |
|---|---|
:thread_started | Thread worker initialised |
:thread_message_dispatched | Job enqueued to thread |
:thread_terminated | Thread worker stopped |
Scheduler (LemonGateway.Scheduler)
| Event Type | When |
|---|---|
:scheduled_job_triggered | Job submitted to in-flight pool |
:scheduled_job_completed | Slot released after job completion |
Session (CodingAgent.Session)
| Event Type | When |
|---|---|
:session_started | Session GenServer initialised |
:session_ended | Session terminated |
:compaction_triggered | Context compaction applied |
EventHandler (CodingAgent.Session.EventHandler)
| Event Type | When |
|---|---|
:tool_call_dispatched | Tool call start event observed |
Agent Loop Events (M3) — AgentCore.Agent
| Event Type | Provenance | When |
|---|---|---|
:agent_loop_started | :direct | Agent loop begins |
:agent_turn_observed | :inferred | Each agent turn completes (streaming end) |
:agent_loop_ended | :direct | Agent loop finishes (idle transition) |
JSONL Runner Events (M3) — AgentCore.CliRunners.JsonlRunner
| Event Type | Provenance | When |
|---|---|---|
:jsonl_stream_started | :direct | CLI subprocess stream begins |
:tool_use_observed | :inferred | Tool call detected in engine output |
:assistant_turn_observed | :inferred | Assistant text turn detected |
:jsonl_stream_ended | :direct | CLI subprocess stream ends |
CLI Runner Engine Events (M3) — provenance: :inferred
Emitted by each engine-specific runner (Codex, Claude, Droid, Kimi, OpenCode, Pi).
| Event Type | Engines | When |
|---|---|---|
:engine_subprocess_started | codex, claude, kimi, opencode, pi | Engine session/subprocess initialised |
:engine_output_observed | codex, kimi, opencode, pi | Engine produces a final answer or output |
:engine_subprocess_exited | codex, claude, kimi, opencode, pi | Engine subprocess exits with error |
Attaching Handlers
# Attach a handler for all agent_core events
:telemetry.attach_many(
"my-handler",
[
[:agent_core, :loop, :start],
[:agent_core, :loop, :end],
[:agent_core, :tool_task, :start],
[:agent_core, :tool_task, :end],
[:agent_core, :tool_result, :emit]
],
&MyModule.handle_event/4,
%{my_config: true}
)
# Handler function
def handle_event(event, measurements, metadata, config) do
IO.inspect({event, measurements, metadata}, label: "Telemetry")
end
AgentCore Events
Agent Loop Events
[:agent_core, :loop, :start]
Emitted when an agent loop begins execution.
| Field | Type | Description |
|---|---|---|
| Measurements | ||
system_time | integer | System time in native units |
| Metadata | ||
loop_type | :main | :continue | Type of loop being started |
[:agent_core, :loop, :end]
Emitted when an agent loop completes.
| Field | Type | Description |
|---|---|---|
| Measurements | ||
duration | integer | Duration in native time units |
| Metadata | ||
loop_type | :main | :continue | Type of loop that completed |
reason | atom | Completion reason |
[:agent_core, :loop, :exception]
Emitted when an agent loop encounters an exception.
| Field | Type | Description |
|---|---|---|
| Measurements | ||
system_time | integer | When the exception occurred |
| Metadata | ||
kind | atom | Exception kind (:error, :throw, :exit) |
reason | term | Exception reason/value |
[:agent_core, :loop, :task_start_failed]
Emitted when the loop task fails to start.
| Field | Type | Description |
|---|---|---|
| Measurements | ||
system_time | integer | When the failure occurred |
| Metadata | ||
reason | term | Failure reason |
Tool Task Events
[:agent_core, :tool_task, :start]
Emitted when a tool task begins execution.
| Field | Type | Description |
|---|---|---|
| Measurements | ||
system_time | integer | Start time |
| Metadata | ||
tool_name | string | Name of the tool |
tool_call_id | string | Unique tool call identifier |
[:agent_core, :tool_task, :end]
Emitted when a tool task completes successfully.
| Field | Type | Description |
|---|---|---|
| Measurements | ||
duration | integer | Execution duration |
| Metadata | ||
tool_name | string | Name of the tool |
tool_call_id | string | Unique tool call identifier |
is_error | boolean | Whether result was an error |
[:agent_core, :tool_task, :error]
Emitted when a tool task fails or is aborted.
| Field | Type | Description |
|---|---|---|
| Measurements | ||
system_time | integer | Error time |
duration | integer | Time until error |
| Metadata | ||
tool_name | string | Name of the tool |
tool_call_id | string | Unique tool call identifier |
reason | term | Error reason |
Tool Result Events
[:agent_core, :tool_result, :emit]
Emitted when a tool result message is appended to context.
| Field | Type | Description |
|---|---|---|
| Measurements | ||
system_time | integer | Emit time (native units) |
| Metadata | ||
tool_name | string | Name of the tool |
tool_call_id | string | Unique tool call identifier |
is_error | boolean | Whether the result is an error |
trust | :trusted | :untrusted | Normalized trust level for the emitted tool result |
trust comes from tool result trust normalization in the tool-call loop. Only :untrusted is emitted as untrusted; all other values are emitted as :trusted.
LemonSkills Events
These events record skill lifecycle decisions with redacted metadata. Tool events include tool_call_id so operators can correlate them with [:agent_core, :tool_task, ...] and [:agent_core, :tool_result, :emit] events for the same tool invocation. Prompt-render events record only surfaced skill keys/counts and never include skill bodies. When the event is emitted through CodingAgent prompt/tool paths, metadata includes session_key, session_id, and agent_id.
LemonSkills also attaches an introspection bridge on application start. The bridge persists [:lemon_skills, :skill, :load] as :skill_load_observed, [:lemon_skills, :skill, :write] as :skill_write_observed, and [:lemon_skills, :skill, :prompt_render] as :skill_prompt_render_observed, lifting run_id, session_key, and agent_id into the canonical introspection envelope when those fields are present. The same bridge updates LemonSkills.Usage sidecars for load/write counters and curation metadata.
Skill Load Events
[:lemon_skills, :skill, :load]
Emitted when read_skill returns a skill or a not-found result.
| Field | Type | Description |
|---|---|---|
| Measurements | ||
count | integer | Always 1 |
system_time | integer | Emit time in native units |
| Metadata | ||
result | string | ok or not_found |
key | string | Requested skill key |
name | string | Skill display name when found |
source | string | Skill source when found |
path | string | Skill directory when found |
view | string | Requested read_skill view |
section | string | Requested section, when present |
file_path | string | Requested file path, when present |
include_status | boolean | Whether status output was requested |
include_manifest | boolean | Whether manifest output was requested |
tool_call_id | string | Tool invocation identifier |
run_id | string | Run identifier, when provided |
session_key | string | CodingAgent session key, when provided |
session_id | string | CodingAgent session id, when provided |
agent_id | string | Agent id, when provided |
cwd | string | Project working directory, when present |
Skill Prompt Render Events
[:lemon_skills, :skill, :prompt_render]
Emitted when prompt composition renders an available-skills or relevant-skills block. The event intentionally excludes skill bodies and file contents.
| Field | Type | Description |
|---|---|---|
| Measurements | ||
count | integer | Always 1 |
system_time | integer | Emit time in native units |
| Metadata | ||
surface | string | available or relevant |
skill_count | integer | Number of skills rendered |
skill_keys | list | Rendered skill keys |
active_count | integer | Rendered active skills |
not_ready_count | integer | Rendered not-ready skills |
missing_count | integer | Rendered skills with missing requirements |
run_id | string | Run identifier, when provided |
session_key | string | CodingAgent session key, when provided |
session_id | string | CodingAgent session id, when provided |
agent_id | string | Agent id, when provided |
cwd | string | Project working directory, when present |
Skill Write Events
[:lemon_skills, :skill, :write]
Emitted when skill_manage accepts or rejects a skill write operation. The event intentionally excludes skill body content, replacement strings, and supporting-file content.
| Field | Type | Description |
|---|---|---|
| Measurements | ||
count | integer | Always 1 |
system_time | integer | Emit time in native units |
| Metadata | ||
result | string | ok or error |
action | string | create, edit, patch, delete, write_file, or remove_file |
name | string | Skill key requested by the agent |
scope | string | project or global |
path | string | Skill directory after a successful write |
file_path | string | Relative supporting file path, when applicable |
audit_status | string | Audit verdict for successful audited writes |
replacements | integer | Number of patch replacements, when applicable |
reason | string | Truncated rejection reason for failed writes |
tool_call_id | string | Tool invocation identifier |
run_id | string | Run identifier, when provided |
session_key | string | CodingAgent session key, when provided |
session_id | string | CodingAgent session id, when provided |
agent_id | string | Agent id, when provided |
cwd | string | Project working directory, when present |
Introspection Projection
| Event Type | Source Telemetry | Payload |
|---|---|---|
:skill_load_observed | [:lemon_skills, :skill, :load] | Same redacted metadata except envelope fields (run_id, session_key, session_id, agent_id) |
:skill_write_observed | [:lemon_skills, :skill, :write] | Same redacted metadata except envelope fields (run_id, session_key, session_id, agent_id) |
:skill_prompt_render_observed | [:lemon_skills, :skill, :prompt_render] | Same redacted metadata except envelope fields (run_id, session_key, session_id, agent_id) |
:missed_skill_observed | Session end audit | missed_skill_keys and loaded_skill_keys when the prompt surfaced <relevant-skills> but the agent did not load them |
Context Events
[:agent_core, :context, :size]
Emitted when context size is measured.
| Field | Type | Description |
|---|---|---|
| Measurements | ||
char_count | integer | Total characters in context |
message_count | integer | Number of messages |
| Metadata | ||
has_system_prompt | boolean | Whether system prompt was included |
[:agent_core, :context, :warning]
Emitted when context exceeds warning threshold.
| Field | Type | Description |
|---|---|---|
| Measurements | ||
char_count | integer | Current character count |
threshold | integer | Threshold that was exceeded |
| Metadata | ||
level | :warning | :critical | Severity level |
[:agent_core, :context, :truncated]
Emitted when context is truncated.
| Field | Type | Description |
|---|---|---|
| Measurements | ||
dropped_count | integer | Messages dropped |
remaining_count | integer | Messages remaining |
| Metadata | ||
strategy | atom | Truncation strategy used |
EventStream Events
[:agent_core, :event_stream, :queue_depth]
Emitted on each push to track queue depth.
| Field | Type | Description |
|---|---|---|
| Measurements | ||
depth | integer | Current queue depth |
| Metadata | ||
stream_id | reference | Stream identifier |
[:agent_core, :event_stream, :dropped]
Emitted when events are dropped due to backpressure.
| Field | Type | Description |
|---|---|---|
| Measurements | ||
count | integer | Number of events dropped |
| Metadata | ||
stream_id | reference | Stream identifier |
strategy | atom | Drop strategy (:drop_oldest, :drop_newest) |
Subagent Events
[:agent_core, :subagent, :spawn]
Emitted when a subagent is spawned.
| Field | Type | Description |
|---|---|---|
| Measurements | ||
system_time | integer | Spawn time |
| Metadata | ||
subagent_id | term | Subagent identifier |
parent_session | string | Parent session ID |
[:agent_core, :subagent, :end]
Emitted when a subagent completes.
| Field | Type | Description |
|---|---|---|
| Measurements | ||
duration | integer | Total execution time |
| Metadata | ||
subagent_id | term | Subagent identifier |
reason | atom | Completion reason |
Agent Events
[:agent_core, :agent, :loop_error]
Emitted when the agent loop encounters an error.
| Field | Type | Description |
|---|---|---|
| Measurements | ||
system_time | integer | Error time |
| Metadata | ||
error | term | Error details |
CodingAgent Events
Session Events
[:coding_agent, :session, :event_stream, :broadcast]
Emitted when events are broadcast to subscribers.
| Field | Type | Description |
|---|---|---|
| Measurements | ||
subscriber_count | integer | Total subscribers |
direct_count | integer | Direct (legacy) subscribers |
stream_count | integer | Stream subscribers |
| Metadata | ||
session_id | string | Session identifier |
[:coding_agent, :session, :error]
Emitted when a session encounters an error.
| Field | Type | Description |
|---|---|---|
| Measurements | ||
system_time | integer | Error time |
| Metadata | ||
session_id | string | Session identifier |
error | term | Error details |
AI Provider Events
Dispatcher Events
[:ai, :dispatcher, :queue_depth]
Emitted to track dispatcher queue depth.
| Field | Type | Description |
|---|---|---|
| Measurements | ||
depth | integer | Current queue depth |
| Metadata | ||
provider | atom | Provider name |
[:ai, :dispatcher, :rejected]
Emitted when a request is rejected (rate limit or circuit open).
| Field | Type | Description |
|---|---|---|
| Measurements | ||
system_time | integer | Rejection time |
| Metadata | ||
provider | atom | Provider name |
reason | atom | :rate_limited | :circuit_open |
[:ai, :dispatcher, :retry]
Emitted when a request is being retried.
| Field | Type | Description |
|---|---|---|
| Measurements | ||
attempt | integer | Current attempt number |
delay | integer | Delay before retry (ms) |
| Metadata | ||
provider | atom | Provider name |
error_type | atom | Type of error that caused retry |
Stream Events
[:ai, :stream, :error]
Emitted when a streaming error occurs.
| Field | Type | Description |
|---|---|---|
| Measurements | ||
system_time | integer | Error time |
| Metadata | ||
provider | atom | Provider that failed |
error | term | Error details |
Example: Monitoring Dashboard
defmodule MyApp.TelemetryHandler do
require Logger
def setup do
events = [
[:agent_core, :loop, :start],
[:agent_core, :loop, :end],
[:agent_core, :tool_task, :start],
[:agent_core, :tool_task, :end],
[:agent_core, :tool_result, :emit],
[:agent_core, :tool_task, :error],
[:agent_core, :context, :warning],
[:coding_agent, :session, :event_stream, :broadcast],
[:ai, :dispatcher, :rejected]
]
:telemetry.attach_many("my-app-handler", events, &handle_event/4, nil)
end
def handle_event([:agent_core, :loop, :start], _measurements, metadata, _config) do
Logger.info("Loop started: #{inspect(metadata.loop_type)}")
end
def handle_event([:agent_core, :loop, :end], measurements, metadata, _config) do
duration_ms = System.convert_time_unit(measurements.duration, :native, :millisecond)
Logger.info("Loop completed in #{duration_ms}ms: #{inspect(metadata.reason)}")
end
def handle_event([:agent_core, :tool_task, :end], measurements, metadata, _config) do
duration_ms = System.convert_time_unit(measurements.duration, :native, :millisecond)
Logger.info("Tool #{metadata.tool_name} completed in #{duration_ms}ms")
end
def handle_event([:agent_core, :tool_task, :error], _measurements, metadata, _config) do
Logger.error("Tool #{metadata.tool_name} failed: #{inspect(metadata.reason)}")
end
def handle_event([:agent_core, :tool_result, :emit], _measurements, metadata, _config) do
Logger.info(
"Tool result #{metadata.tool_name} emitted " <>
"(trust=#{metadata.trust}, error=#{metadata.is_error})"
)
end
def handle_event([:agent_core, :context, :warning], measurements, metadata, _config) do
Logger.warning(
"Context #{metadata.level}: #{measurements.char_count} chars " <>
"(threshold: #{measurements.threshold})"
)
end
def handle_event([:ai, :dispatcher, :rejected], _measurements, metadata, _config) do
Logger.warning("Request rejected: #{metadata.provider} - #{metadata.reason}")
end
def handle_event(event, measurements, metadata, _config) do
Logger.debug("Telemetry: #{inspect(event)} #{inspect(measurements)} #{inspect(metadata)}")
end
end
Using with :telemetry_poller
The AgentCore.TelemetryPoller emits periodic VM stats:
# These are emitted every 10 seconds by default:
# - [:vm, :memory]
# - [:vm, :total_run_queue_lengths]
# - [:vm, :system_counts]
To customize:
# In your application config
config :telemetry_poller, :default,
period: :timer.seconds(30),
measurements: [
{MyApp.Metrics, :dispatch_metrics, []}
]
Performance Considerations
- Telemetry handlers run synchronously - keep them fast
- Use
:telemetry.span/3for automatic start/end events - Consider sampling high-frequency events in production
- Use
:telemetry_metricsfor aggregation
Agent Introspection
Agent introspection is a higher-level persistence layer built on top of :telemetry. It captures a canonical event envelope for every meaningful agent lifecycle transition and persists it to LemonCore.Store for later query by operators.
What Introspection Events Are
Introspection events are structured records that capture agent execution context at key moments: when a run starts or ends, when a session is created, when a tool executes, when a subprocess (subagent) is spawned, and similar lifecycle transitions. Unlike raw telemetry (which is fire-and-forget), introspection events are persisted and queryable.
Events are retained for 7 days by default and can be queried with filters for run ID, session key, agent ID, event type, and time range.
Event Schema
Every introspection event has the following fields:
| Field | Type | Description |
|---|---|---|
event_id | string | Stable unique identifier (prefixed evt_) |
event_type | atom or string | Event taxonomy name (see taxonomy below) |
ts_ms | integer | Wall-clock timestamp in milliseconds since Unix epoch |
run_id | string or nil | Run identifier |
session_key | string or nil | Session identifier |
agent_id | string or nil | Agent identifier |
parent_run_id | string or nil | Lineage link to parent run when available |
engine | string or nil | Engine name (e.g. "claude", "codex", "lemon") |
provenance | :direct or :inferred or :unavailable | How the context fields were resolved |
payload | map | Event-specific metadata, redacted of secrets |
Provenance Values
:direct— context fields (run_id, session_key, etc.) were provided directly by the emitting code.:inferred— context fields were derived from surrounding state (e.g. process dictionary or ETS).:unavailable— context could not be determined at emit time.
Redaction Defaults
The following payload fields are always removed before persistence:
api_key,apikey,authorization,password,private_keyprompt,response,secret,secrets,stderr,stdout,token
Tool argument fields (arguments, input, tool_arguments) are redacted by default. Pass capture_tool_args: true to LemonCore.Introspection.record/3 to retain them.
Result preview fields (preview, result_preview) are kept by default and truncated to 256 bytes. Pass capture_result_preview: false to suppress them.
All other string values are truncated to 4096 bytes.
Event Taxonomy
Events are organized into three categories:
Run Lifecycle
| Event Type | When Emitted |
|---|---|
run_started | A run is accepted and begins processing |
run_completed | A run finishes successfully |
run_aborted | A run is aborted (user request or error) |
run_queued | A run enters the queue awaiting a slot |
run_followup | A follow-up run is submitted to an active session |
Session Lifecycle
| Event Type | When Emitted |
|---|---|
session_created | A new session is established |
session_expired | A session is cleaned up after TTL expiry |
session_policy_applied | A policy is applied to a session |
Engine / Subprocess Events
| Event Type | When Emitted |
|---|---|
tool_started | A tool begins execution within an agent loop |
tool_completed | A tool finishes (success or error result) |
subagent_spawned | A subprocess agent is spawned |
subagent_completed | A subprocess agent finishes |
engine_loop_started | An engine's main agent loop iteration begins |
engine_loop_completed | An engine's main agent loop iteration ends |
Querying via IEx
Start an IEx session against a running node (or directly with iex -S mix) and use LemonCore.Introspection.list/1:
# All recent events (default limit: 100)
LemonCore.Introspection.list([])
# Filter by run ID
LemonCore.Introspection.list(run_id: "run_abc123", limit: 50)
# Filter by session key
LemonCore.Introspection.list(session_key: "agent:default:main", limit: 20)
# Filter by event type
LemonCore.Introspection.list(event_type: :tool_completed, limit: 30)
# Filter by agent
LemonCore.Introspection.list(agent_id: "my_agent", limit: 50)
# Time range (ms since Unix epoch)
now_ms = System.system_time(:millisecond)
one_hour_ago = now_ms - 60 * 60 * 1000
LemonCore.Introspection.list(since_ms: one_hour_ago, limit: 200)
# Combine filters
LemonCore.Introspection.list(
run_id: "run_abc123",
event_type: :tool_completed,
limit: 10
)
Results are returned newest-first as a list of maps. Each map has the fields described in the schema table above.
Querying via the Mix Task
The mix lemon.introspection task provides a human-readable table view for operators:
# Show the 20 most recent events (default)
mix lemon.introspection
# Increase limit
mix lemon.introspection --limit 100
# Filter by run ID
mix lemon.introspection --run-id run_abc123
# Filter by session key
mix lemon.introspection --session-key "agent:default:main"
# Filter by event type
mix lemon.introspection --event-type tool_completed
# Filter by agent
mix lemon.introspection --agent-id my_agent
# Relative time window (events in the last hour)
mix lemon.introspection --since 1h
# Relative time window (last 30 minutes)
mix lemon.introspection --since 30m
# Absolute time window (ISO 8601)
mix lemon.introspection --since 2026-02-23T00:00:00Z
# Combine filters
mix lemon.introspection --run-id run_abc123 --event-type tool_completed --limit 50
The task outputs a table with columns: Timestamp, Event Type, Run ID, Session Key, Agent ID, Engine, Provenance. Long identifiers are truncated with a ~ suffix.
Emitting Introspection Events
Use LemonCore.Introspection.record/3 to persist an event from any application in the umbrella:
# Basic usage
LemonCore.Introspection.record(:run_started, %{origin: "telegram"}, run_id: run_id, session_key: session_key)
# With engine and agent context
LemonCore.Introspection.record(
:tool_completed,
%{tool_name: "exec", result_preview: "ok"},
run_id: run_id,
session_key: session_key,
agent_id: "default",
engine: "codex"
)
# With custom provenance
LemonCore.Introspection.record(
:run_aborted,
%{reason: "user_requested"},
run_id: run_id,
session_key: session_key,
provenance: :inferred
)
Disabling Introspection
To disable persistence of introspection events (events are silently dropped), add to your config:
# config/config.exs or config/prod.exs
config :lemon_core, :introspection, enabled: false
When disabled, LemonCore.Introspection.record/3 returns :ok immediately without touching the store. The LemonCore.Introspection.enabled?/0 function reflects the current setting.
Retention
The store applies a periodic retention sweep to the :introspection_log table. By default, events older than 7 days are pruned. This sweep runs at the same interval as the chat-state sweep (every 5 minutes).