Database Output for Memory Profiler

April 30, 2026 · View on GitHub

Note

.rmem (-f rmem) is the primary storage format — it's the fastest and is what rmem:explore / rmem:query / rmem:serve / rmem:mcp read natively. inspector:memory:report / inspector:memory:compare also prefer it. Reach for the SQL-database outputs documented on this page when you specifically need ad-hoc SQL querying, JOIN-based analysis against your own tables, or ingestion into an existing data-warehouse pipeline. (inspector:memory:report and inspector:memory:compare accept SQLite too, so you can use the database outputs alongside the normal analysers.)

The memory profiler supports outputting analysis results to a relational database. This is useful for analyzing large memory snapshots without jq, and enables ad-hoc querying with SQL.

Supported databases:

  • SQLite (-f sqlite3) — local file, no server needed
  • MySQL (-f mysql) — connect to an existing MySQL/MariaDB server
  • PostgreSQL (-f postgresql) — connect to an existing PostgreSQL server

Quick Start

SQLite (local file)

sudo ./reli inspector:memory -p <pid> -f sqlite3 -o memory.db
sqlite3 memory.db "SELECT * FROM location_types_summary ORDER BY memory_usage DESC"

MySQL

# Create the database first
mysql -u root -e "CREATE DATABASE reli_memory"

sudo ./reli inspector:memory -p <pid> -f mysql --db-name reli_memory --db-user root
mysql -u root reli_memory -e "SELECT * FROM location_types_summary ORDER BY memory_usage DESC"

PostgreSQL

# Create the database first
createdb reli_memory

sudo ./reli inspector:memory -p <pid> -f postgresql --db-name reli_memory --db-user postgres
psql reli_memory -c "SELECT * FROM location_types_summary ORDER BY memory_usage DESC"

Command Options

Output format

OptionShortDescription
--output-format-fOutput format: json (default), rmem (.rmem), sqlite3, mysql, postgresql, report, report-json
--output-oOutput file path (required for rmem / sqlite3 / report-json)

This page focuses on sqlite3 / mysql / postgresql. rmem (.rmem) is the primary storage format — see the note at the top of the page. report / report-json emit a findings report directly; see memory-report.md.

Database connection (for mysql / postgresql)

OptionDefaultDescription
--db-host127.0.0.1Database server host
--db-port3306 / 5432Database server port (auto-detected from format)
--db-name(required)Database name
--db-user(required)Database user
--db-password(empty)Database password

PHP memory limit

OptionDefaultDescription
--memory-limitSet PHP memory_limit for analysis (e.g. 2G, 512M)

All other options of inspector:memory work the same regardless of output format.

Database Schema

Runs Table

runs

Each invocation of inspector:memory with a database output format creates a new run. This allows multiple snapshots to coexist in the same database for comparison.

ColumnTypeDescription
run_idINTEGER (PK)Auto-incremented run identifier
created_atTEXTISO 8601 timestamp (UTC) of when the run was created
SELECT * FROM runs;

Summary Tables

summary

Key-value pairs containing metadata about the analysis snapshot.

ColumnTypeDescription
run_idINTEGER (PK)References runs.run_id
keyTEXT (PK)Metric name (e.g. zend_mm_heap_total, php_version)
valueTEXTMetric value
SELECT * FROM summary WHERE run_id = 1;

Well-known keys produced for every run:

KeyMeaning
php_versionPHP version of the target process (e.g. v84)
memory_get_usageUserland memory_get_usage()zend_mm_heap->size
memory_get_real_usageUserland memory_get_usage(true)zend_mm_heap->real_size
memory_get_peak_usagezend_mm_heap->peak
memory_limitzend_mm_heap->limit
zend_mm_heap_total / zend_mm_heap_usageAnalyzed heap totals per region
vm_stack_total / compiler_arena_totalRegion totals from RegionAnalyzer
heap_memory_analyzed_percentageFraction of memory_get_usage covered by walked locations
rssProcess RSS at capture time (when available)
cached_chunks_sizecached_chunks_count * 2 MiB — bytes held in the chunk cache
chunks_countzend_mm_heap->chunks_count — in-use chunks
peak_chunks_countzend_mm_heap->peak_chunks_count
cached_chunks_countzend_mm_heap->cached_chunks_count
last_chunks_delete_boundaryChunk count at the last delete event — heuristic input
last_chunks_delete_countConsecutive deletes at that boundary — bumps cached_chunks_max on reaching 4
chunks_total_free_bytesSum of free_pages * 4 KiB across walked chunks (scattered free space)
chunks_mostly_empty_countWalked chunks that are ≥90% free — fragmentation-pinned candidates

location_types_summary

Aggregated memory usage by location type, computed via GROUP BY from context_node_locations.

ColumnTypeDescription
run_idINTEGER (PK)References runs.run_id
typeTEXT (PK)Location type (e.g. ZendStringMemoryLocation, ZendObjectMemoryLocation)
countINTEGERNumber of locations of this type
memory_usageINTEGERTotal bytes consumed
SELECT * FROM location_types_summary WHERE run_id = 1 ORDER BY memory_usage DESC;

class_objects_summary

Aggregated memory usage by class, computed via GROUP BY from context_node_locations.

ColumnTypeDescription
run_idINTEGER (PK)References runs.run_id
class_nameTEXT (PK)Fully qualified class name
countINTEGERNumber of instances
memory_usageINTEGERTotal bytes consumed
SELECT * FROM class_objects_summary WHERE run_id = 1 ORDER BY memory_usage DESC;

Context Graph Tables

The context graph (reference graph) is stored as separate nodes and edges, enabling natural representation of shared references. This is equivalent to the "context" field in the JSON output.

context_nodes

Each node represents a unique context (object, array, string, etc.) in the memory graph.

ColumnTypeDescription
run_idINTEGER (PK)References runs.run_id
node_idINTEGER (PK)Unique node identifier within this run (same as #node_id in JSON)
typeTEXTContext type (e.g. ObjectContext, ArrayHeaderContext)
canonical_node_idINTEGERRepresentative node_id for nodes sharing the same memory address. NULL if this node's address is unique. When non-NULL, groups nodes that represent the same PHP value observed from different collection phases (e.g. objects_store vs call_frames in streaming mode). Used for SCC (cycle) detection across collection phases.

context_edges

Edges represent parent-child relationships in the reference graph. A node may have multiple incoming edges (shared references).

ColumnTypeDescription
run_idINTEGERReferences runs.run_id
parent_node_idINTEGERParent node (NULL for root nodes)
child_node_idINTEGERChild node (references context_nodes.node_id)
link_nameTEXTName of the link (e.g. property name, variable name, array key)
is_treeINTEGER1 = DFS first-visit (spanning tree edge), 0 = back-reference (shared/circular)
strengthTEXTReference strength: strong (default, increments refcount), weak (objects_store bucket, no refcount), structural (VM internals like object_handlers, class_entry)

The is_tree flag distinguishes the DFS spanning tree from back-references:

  • is_tree = 1 edges form a tree where each node has exactly one parent — use these for canonical path queries
  • is_tree = 0 edges represent additional references to already-visited nodes — these capture shared and circular references

The strength column classifies the reference by its refcount behavior:

  • strong — normal references that increment the target's refcount (object properties, local variables, array elements)
  • weak — references that do not increment refcount (objects_store handle table entries)
  • structural — PHP VM internal structures that are not ownership references (object_handlers, class_entry)
-- Find all places that reference node 42 in run 1
SELECT parent_node_id, link_name, is_tree
FROM context_edges
WHERE run_id = 1 AND child_node_id = 42;

-- Find strong non-tree edges (true shared references, excluding VM internals)
SELECT parent_node_id, child_node_id, link_name
FROM context_edges
WHERE run_id = 1 AND is_tree = 0 AND strength = 'strong';

-- Find all graph nodes that represent the same PHP object
SELECT node_id, type FROM context_nodes
WHERE run_id = 1 AND canonical_node_id = 100;

context_node_locations

Physical memory locations associated with nodes. A node may have multiple locations (e.g. an array has both a header and a table location).

ColumnTypeDescription
idINTEGER (PK)Auto-incremented row ID
run_idINTEGERReferences runs.run_id
node_idINTEGER (FK)References context_nodes.node_id
addressINTEGERMemory address in the target process
sizeINTEGERSize in bytes
location_typeTEXTType of memory location (e.g. ZendStringMemoryLocation)
class_nameTEXTClass name (for object locations)
string_valueTEXTString value (for string locations)
refcountINTEGERReference count
type_infoINTEGERZval type info flags
regionTEXTMemory region (zend_mm_heap, zend_mm_huge, vm_stack, compiler_arena, or NULL)

context_node_attributes

Key-value metadata attached to nodes (e.g. #count for array element counts).

ColumnTypeDescription
idINTEGER (PK)Auto-incremented row ID
run_idINTEGERReferences runs.run_id
node_idINTEGER (FK)References context_nodes.node_id
keyTEXTAttribute name
valueTEXTAttribute value

Indexes

The following indexes are created after bulk insertion for query performance:

  • idx_context_edges_run_parent_tree on context_edges(run_id, parent_node_id, is_tree) -- essential for recursive CTE path queries
  • idx_context_edges_run_child on context_edges(run_id, child_node_id) -- essential for "find all references to node X"
  • idx_context_node_locations_run_node on context_node_locations(run_id, node_id) -- essential for joining locations to nodes
  • idx_context_node_locations_run_class on context_node_locations(run_id, class_name) -- speeds up class-based queries
  • idx_context_node_attributes_run_node on context_node_attributes(run_id, node_id) -- for attribute lookups
  • idx_context_node_locations_run_type on context_node_locations(run_id, location_type) -- type-based filtering
  • idx_context_edges_run_link_parent_tree on context_edges(run_id, link_name, parent_node_id, is_tree) -- link-based queries
  • idx_context_edges_run_strength on context_edges(run_id, strength) -- strength-based filtering
  • idx_context_nodes_canonical on context_nodes(run_id, canonical_node_id) -- canonical node lookups for SCC

Views

v_node_paths

Computes the canonical path from root to each node using a recursive CTE that follows only is_tree = 1 edges. Each node has exactly one path per run. This view is convenient but can be slow on large databases.

ColumnTypeDescription
run_idINTEGERRun identifier
node_idINTEGERNode identifier
pathTEXTFull path (e.g. call_frames -> 0 -> local_variables -> myVar -> object_properties -> items)
depthINTEGERDepth in the tree (0 = root)
-- Find where a specific class is referenced in run 1
SELECT np.path, loc.size, loc.refcount
FROM v_node_paths np
JOIN context_node_locations loc ON loc.run_id = np.run_id AND loc.node_id = np.node_id
WHERE np.run_id = 1 AND loc.class_name = 'App\\MyClass'
ORDER BY loc.size DESC;

Performance note: For large databases (100K+ nodes), queries against v_node_paths can be slow because the recursive CTE is recomputed on every query. Use inspector:optimize-memory-db to materialize paths into a real table for faster queries.

v_arrays

Provides a convenient view of PHP array memory usage by joining array headers with their table allocations.

ColumnTypeDescription
run_idINTEGERRun identifier
node_idINTEGERNode identifier of the array header
addressINTEGERMemory address
header_sizeINTEGERSize of the zend_array header
table_sizeINTEGERSize of the hash table allocation
total_sizeINTEGERheader_size + table_size
element_countINTEGERNumber of elements in the array
refcountINTEGERReference count

Optimizing the Database: inspector:optimize-memory-db

For large databases, the v_node_paths view (recursive CTE) can be slow. The inspector:optimize-memory-db command materializes all node paths into a physical node_paths table for much faster queries.

# Materialize all runs
./reli inspector:optimize-memory-db memory.db

# Materialize a specific run only
./reli inspector:optimize-memory-db memory.db --run-id=1

This creates a node_paths table with the same schema as the v_node_paths view, plus an index on (run_id, depth). After materialization, you can query node_paths directly instead of v_node_paths.

Note: This command currently supports SQLite databases only. For MySQL/PostgreSQL, you can run the equivalent INSERT INTO ... WITH RECURSIVE SQL manually.

Note: Materialization can significantly increase the database size. For databases with deep reference trees, path strings can average several KB each, so the node_paths table may be much larger than the rest of the data combined.

The node_paths table (after optimization)

ColumnTypeDescription
run_idINTEGER (PK)Run identifier
node_idINTEGER (PK)Node identifier
pathTEXTFull path from root
depthINTEGERDepth in the tree

Index: idx_node_paths_run_depth on node_paths(run_id, depth)

Example Queries

All example queries below use WHERE run_id = 1 to target a specific run. Replace 1 with the desired run ID, or omit the condition to query across all runs.

Top memory consumers by location type

SELECT type, count, memory_usage,
       printf('%.1f%%', memory_usage * 100.0 / (SELECT SUM(memory_usage) FROM location_types_summary WHERE run_id = 1)) AS pct
FROM location_types_summary
WHERE run_id = 1
ORDER BY memory_usage DESC;

Top memory-consuming classes

SELECT class_name, count, memory_usage,
       CAST(memory_usage / count AS INTEGER) AS avg_size
FROM class_objects_summary
WHERE run_id = 1
ORDER BY memory_usage DESC
LIMIT 20;

Duplicate strings (candidates for interning)

SELECT string_value,
       COUNT(*) AS copies,
       SUM(size) AS total_bytes,
       size AS per_copy_bytes
FROM context_node_locations
WHERE run_id = 1
  AND location_type = 'ZendStringMemoryLocation'
  AND string_value IS NOT NULL
GROUP BY string_value, size
HAVING COUNT(*) > 1
ORDER BY total_bytes DESC
LIMIT 20;

High refcount values (widely shared structures)

SELECT location_type, class_name, printf('0x%x', address) AS hex_addr,
       size, refcount
FROM context_node_locations
WHERE run_id = 1 AND refcount IS NOT NULL
ORDER BY refcount DESC
LIMIT 20;

Memory distribution by tree depth

WITH RECURSIVE node_depth(node_id, depth) AS (
    SELECT child_node_id, 0
    FROM context_edges WHERE run_id = 1 AND parent_node_id IS NULL AND is_tree = 1
    UNION ALL
    SELECT e.child_node_id, nd.depth + 1
    FROM context_edges e
    JOIN node_depth nd ON e.parent_node_id = nd.node_id
    WHERE e.run_id = 1 AND e.is_tree = 1
)
SELECT depth,
       COUNT(DISTINCT nd.node_id) AS nodes,
       SUM(loc.size) AS total_bytes,
       printf('%.1f%%', SUM(loc.size) * 100.0 / (SELECT SUM(size) FROM context_node_locations WHERE run_id = 1)) AS pct
FROM node_depth nd
JOIN context_node_locations loc ON loc.run_id = 1 AND loc.node_id = nd.node_id
GROUP BY depth
ORDER BY depth
LIMIT 30;

Empty arrays (allocated but unused)

SELECT va.node_id, va.total_size, va.element_count, va.refcount,
       np.path
FROM v_arrays va
JOIN v_node_paths np ON np.run_id = va.run_id AND np.node_id = va.node_id
WHERE va.run_id = 1 AND va.element_count = 0
ORDER BY va.total_size DESC
LIMIT 20;

Back-references (shared or circular references)

Edges with is_tree = 0 are back-references to already-visited nodes in the DFS traversal. This includes both truly circular references (A -> B -> A) and shared references (A -> C, B -> C).

-- All back-references (shared + circular) in run 1
SELECT parent_node_id, child_node_id, link_name
FROM context_edges
WHERE run_id = 1 AND is_tree = 0
LIMIT 20;

Top root paths by memory consumption (depth <= 2)

WITH RECURSIVE subtree(node_id, root_path) AS (
    SELECT e.child_node_id, e.link_name
    FROM context_edges e WHERE e.run_id = 1 AND e.parent_node_id IS NULL AND e.is_tree = 1
    UNION ALL
    SELECT e.child_node_id,
           CASE WHEN st.root_path = '' THEN e.link_name
                ELSE st.root_path || ' -> ' || e.link_name END
    FROM context_edges e
    JOIN subtree st ON e.parent_node_id = st.node_id
    WHERE e.run_id = 1 AND e.is_tree = 1
),
root_two AS (
    SELECT node_id,
           CASE
             WHEN instr(root_path, ' -> ') > 0
               THEN substr(root_path, 1,
                    CASE WHEN instr(substr(root_path, instr(root_path, ' -> ') + 4), ' -> ') > 0
                         THEN instr(root_path, ' -> ') + 3 + instr(substr(root_path, instr(root_path, ' -> ') + 4), ' -> ') - 1
                         ELSE length(root_path) END)
             ELSE root_path
           END AS root_prefix
    FROM subtree
)
SELECT root_prefix,
       COUNT(DISTINCT rt.node_id) AS nodes,
       SUM(loc.size) AS total_bytes,
       printf('%.1f%%', SUM(loc.size) * 100.0 / (SELECT SUM(size) FROM context_node_locations WHERE run_id = 1)) AS pct
FROM root_two rt
JOIN context_node_locations loc ON loc.run_id = 1 AND loc.node_id = rt.node_id
GROUP BY root_prefix
ORDER BY total_bytes DESC
LIMIT 15;

Largest strings (log buffers, serialized data)

SELECT size, refcount,
       printf('0x%x', address) AS hex_addr,
       substr(string_value, 1, 80) AS preview
FROM context_node_locations
WHERE run_id = 1 AND location_type = 'ZendStringMemoryLocation'
ORDER BY size DESC
LIMIT 10;

Array table waste (over-allocated hash tables)

SELECT va.node_id, va.element_count, va.header_size, va.table_size, va.total_size,
       CASE WHEN va.element_count > 0 THEN va.table_size / va.element_count ELSE NULL END AS bytes_per_elem
FROM v_arrays va
WHERE va.run_id = 1 AND va.element_count > 0 AND va.table_size > 0
ORDER BY (CAST(va.table_size AS REAL) / va.element_count) DESC
LIMIT 15;

Region x location type matrix

SELECT region, location_type,
       COUNT(*) AS count,
       SUM(size) AS total_bytes
FROM context_node_locations
WHERE run_id = 1 AND region IS NOT NULL
GROUP BY region, location_type
ORDER BY total_bytes DESC
LIMIT 20;

Find all references to a specific node (equivalent to JSON #reference_node_id lookup)

-- Find all edges pointing to node 42 in run 1 (both tree and back-references)
SELECT e.parent_node_id, e.link_name, e.is_tree
FROM context_edges e
WHERE e.run_id = 1 AND e.child_node_id = 42;

All non-circular paths to a specific node

-- All paths from root to node 42 in run 1, avoiding cycles
WITH RECURSIVE cte(node_id, path, depth, visited) AS (
    SELECT child_node_id, link_name, 0, ',' || child_node_id || ','
    FROM context_edges WHERE run_id = 1 AND parent_node_id IS NULL
  UNION ALL
    SELECT e.child_node_id,
           cte.path || ' -> ' || e.link_name,
           cte.depth + 1,
           cte.visited || e.child_node_id || ','
    FROM context_edges e
    JOIN cte ON e.parent_node_id = cte.node_id
    WHERE e.run_id = 1 AND cte.visited NOT LIKE '%,' || e.child_node_id || ',%'
)
SELECT path, depth FROM cte
WHERE node_id = 42
LIMIT 100;

Comparing runs: memory growth by class

SELECT
    r1.class_name,
    r1.count AS count_run1,
    r2.count AS count_run2,
    r2.count - r1.count AS count_diff,
    r1.memory_usage AS mem_run1,
    r2.memory_usage AS mem_run2,
    r2.memory_usage - r1.memory_usage AS mem_diff
FROM class_objects_summary r1
JOIN class_objects_summary r2 ON r1.class_name = r2.class_name
WHERE r1.run_id = 1 AND r2.run_id = 2
ORDER BY mem_diff DESC
LIMIT 20;

Comparing runs: new classes in later run

SELECT r2.class_name, r2.count, r2.memory_usage
FROM class_objects_summary r2
LEFT JOIN class_objects_summary r1 ON r1.class_name = r2.class_name AND r1.run_id = 1
WHERE r2.run_id = 2 AND r1.class_name IS NULL
ORDER BY r2.memory_usage DESC;

Choosing an Output Format

AspectJSONSQLiteMySQL / PostgreSQL
SetupNoneNoneRequires a running server
Output speedBaselineFaster (deferred summaries)Faster (deferred summaries)
File sizeLarge for deep treesCompact (normalized)N/A (server-side)
Queryabilityjq (limited by depth)Full SQLFull SQL
Toolingjq, gojq, jjsqlite3 CLImysql / psql CLI, any SQL client
StreamingPipe to stdoutRequires -o file pathWrites to remote server
Remote useMust copy fileMust copy fileDirect write, no file transfer
Team sharingShare fileShare fileConnect to shared server
Multi-run comparisonSeparate filesSame DB, compare via run_idSame DB, compare via run_id
Path queriesjq path(...)v_node_paths / node_pathsv_node_paths / manual CTE
Reference lookupjq + #reference_node_idcontext_edges WHERE child_node_id = ?Same
Large snapshotsMay exceed jq depth limitHandles wellHandles well

Recommendations:

  • Local analysis: SQLite is the simplest — no setup, single file, portable
  • Remote servers: MySQL/PostgreSQL avoids copying large files off the server
  • Team collaboration: MySQL/PostgreSQL lets multiple people query the same results
  • Small snapshots: JSON + jq is fine for quick one-off checks

For snapshots with more than ~100K nodes, any database format is recommended over JSON.

Database Compatibility Notes

The schema and views are designed to work across all three databases. Key dialect differences are handled automatically:

FeatureSQLiteMySQLPostgreSQL
Duplicate-safe insertINSERT OR IGNOREINSERT IGNOREINSERT ... ON CONFLICT DO NOTHING
String concatenation||CONCAT()||
Auto-increment PKINTEGER PRIMARY KEYINTEGER PRIMARY KEY AUTO_INCREMENTSERIAL PRIMARY KEY
Bulk insert tuningPRAGMA journal_mode=OFF, etc.SET unique_checks=0, etc.SET synchronous_commit=off

Note: The example queries in this document use SQLite syntax (printf(), ||). When using MySQL, replace || with CONCAT() and printf() with FORMAT() or CONCAT() as appropriate. PostgreSQL uses || like SQLite but uses format() or to_char() instead of printf().