WAL Compaction and Truncation
May 18, 2026
Managing Write-Ahead Log growth in NornicDB
Last Updated: December 2025
Overview
NornicDB's Write-Ahead Log (WAL) supports automatic compaction to prevent unbounded growth. Without compaction, the WAL would grow indefinitely in long-running databases, consuming disk space and slowing recovery.
Problem Solved: WAL grows forever until manual snapshot + delete
Solution: Automatic periodic snapshots with WAL truncation
Features
1. Automatic Compaction (Recommended)
Automatic compaction is the recommended approach for production deployments. Enable it through configuration:
YAML configuration:
database:
  wal_dir: "data/wal"
  wal_sync_mode: "batch"
  wal_snapshot_interval: "1h" # Create snapshots hourly
  wal_auto_compaction_enabled: true # Enabled by default
  wal_snapshot_dir: "data/snapshots"
Environment variables:
export NORNICDB_WAL_SNAPSHOT_INTERVAL=1h
export NORNICDB_WAL_AUTO_COMPACTION_ENABLED=true
Behavior:
- Snapshots created at configured interval (default: 1 hour)
- WAL truncated after each successful snapshot
- Failures logged but don't crash the database
- Automatic retry on next interval
- Old snapshots saved to the snapshot directory as snapshot-<timestamp>.json
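The behavior listed above can be sketched as a single cycle that a timer calls once per interval. This is a minimal illustration, not NornicDB's implementation: `run_compaction_cycle`, `create_snapshot`, and `truncate_wal` are hypothetical names standing in for the database internals.

```python
import logging
from typing import Callable

log = logging.getLogger("wal_compaction_sketch")


def run_compaction_cycle(
    create_snapshot: Callable[[], str],
    truncate_wal: Callable[[str], None],
) -> bool:
    """Run one cycle: snapshot first, truncate only if the snapshot succeeded.

    Both callables are hypothetical stand-ins for database internals.
    A failure is logged and swallowed, so the caller's timer simply
    tries again at the next interval instead of crashing.
    """
    try:
        snapshot_path = create_snapshot()
        truncate_wal(snapshot_path)
        return True
    except Exception:
        log.exception("compaction cycle failed; will retry at next interval")
        return False
```

Because truncation is only reached after `create_snapshot` returns, a failed snapshot never shortens the WAL.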
2. Manual WAL Truncation
For development or special cases, you can trigger truncation manually. Create a snapshot and then truncate the WAL to remove all entries before the snapshot point.
Safety Guarantees:
- Atomic rename (crash-safe)
- Old WAL remains intact until truncation succeeds
- Can retry truncation if it fails
- Recovery works from partial truncations
3. Disable Automatic Compaction
YAML configuration:
database:
  wal_auto_compaction_enabled: false
Environment variable:
export NORNICDB_WAL_AUTO_COMPACTION_ENABLED=false
4. Retention Settings (Immutable Segments)
NornicDB stores WAL as immutable segments with a manifest. You can retain sealed segments for audit/ledger use cases.
YAML configuration:
database:
  wal_retention_max_segments: 24
  wal_retention_max_age: "168h" # 7 days
Environment variables:
export NORNICDB_WAL_RETENTION_MAX_SEGMENTS=24
export NORNICDB_WAL_RETENTION_MAX_AGE=168h
export NORNICDB_WAL_LEDGER_RETENTION_DEFAULTS=true
These settings retain sealed WAL segments after snapshots. Auto-compaction remains enabled by default to preserve existing behavior; retention is opt-in.
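To make the two retention limits concrete, here is a small sketch of how they might combine. It assumes a sealed segment is retained only if it is both among the newest `max_segments` and younger than `max_age`; how NornicDB actually combines the two settings is not stated here, and `select_retained` is a hypothetical helper.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Segment = Tuple[str, datetime]  # (segment name, time it was sealed)


def select_retained(
    sealed: List[Segment],
    now: datetime,
    max_segments: int = 24,
    max_age: timedelta = timedelta(hours=168),
) -> List[Segment]:
    """Return retained segments, newest first.

    Assumption: a segment must satisfy BOTH limits to be retained.
    """
    newest_first = sorted(sealed, key=lambda s: s[1], reverse=True)
    fresh = [s for s in newest_first if now - s[1] <= max_age]
    return fresh[:max_segments]
```

With segments sealed every 8 hours, the 168h age limit is the tighter bound (22 segments); with hourly sealing, the count limit of 24 would apply first.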
5. Txlog Query Procedures
You can query WAL entries directly via Cypher:
// Scan recent entries (no args = recent window)
CALL db.txlog.entries() YIELD txId, db, kind, seq, timestamp, payload
RETURN seq, kind, txId, timestamp, payload
ORDER BY seq;
// Read entries for a specific transaction
CALL db.txlog.byTxId('tx-123') YIELD txId, db, kind, seq, timestamp, payload
RETURN seq, kind, txId, timestamp, payload
ORDER BY seq;
db.txlog.entries accepts up to 4 optional positional args (filter parameters); pass none for a recent-window scan. db.txlog.byTxId takes a single transaction ID. The yield columns are fixed: txId, db, kind, seq, timestamp, payload.
How It Works
Compaction Process
When a compaction cycle runs, NornicDB flushes any pending writes, then creates a point-in-time snapshot of the current database state. Once the snapshot is safely persisted, the WAL is rewritten to contain only entries that arrived after the snapshot. The old WAL file is replaced via an atomic rename so the operation is crash-safe — at no point can a crash leave the WAL in a partial or corrupt state.
Crash Safety
The truncation process is crash-safe at every step:
- Before rename: Old WAL is intact
- During rename: Atomic operation (old or new, never partial)
- After rename: New WAL is complete and synced
If a crash occurs:
- Before rename: Old WAL used on recovery (full history)
- After rename: New WAL used on recovery (snapshot + delta)
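The write-then-rename pattern behind these guarantees can be sketched as follows. This is an illustration of the general technique rather than NornicDB's code; `rewrite_wal_atomically` and the temp-file prefix are hypothetical.

```python
import os
import tempfile
from typing import Iterable


def rewrite_wal_atomically(wal_path: str, kept_entries: Iterable[bytes]) -> None:
    """Replace wal_path with a file that holds only kept_entries."""
    directory = os.path.dirname(os.path.abspath(wal_path))
    # The temp file lives in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".wal-compact-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            for entry in kept_entries:
                tmp.write(entry)
            tmp.flush()
            os.fsync(tmp.fileno())  # new WAL is durable before it becomes visible
        os.replace(tmp_path, wal_path)  # atomic swap: readers see old or new, never partial
    except BaseException:
        # Any failure before the rename leaves the old WAL untouched.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    if os.name == "posix":
        # Sync the directory so the rename itself survives a power loss.
        dir_fd = os.open(directory, os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
```

If the process dies before `os.replace`, recovery finds the full old WAL plus a stray temp file; if it dies after, recovery finds the compacted WAL.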
Recovery
With auto-compaction enabled:
Recovery = Latest Snapshot + Post-Snapshot WAL Entries
Example timeline:
T=0: Database starts
T=1h: Snapshot 1 created (100 nodes), WAL truncated
T=2h: Snapshot 2 created (150 nodes), WAL truncated
T=2.5h: Crash occurs (170 nodes in database)
Recovery:
Load Snapshot 2 (150 nodes)
+ Replay WAL since T=2h (20 new nodes)
= 170 nodes recovered
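The recovery rule can be expressed as a few lines of code. This sketch models the database as a set of node IDs and assumes each WAL entry carries a sequence number, so entries already covered by the snapshot can be skipped; the entry layout and the `recover` function are illustrative assumptions, not NornicDB's on-disk format.

```python
from typing import Iterable, Set, Tuple

WalEntry = Tuple[int, str, str]  # (sequence number, operation, node id)


def recover(snapshot_nodes: Iterable[str], snapshot_seq: int,
            wal: Iterable[WalEntry]) -> Set[str]:
    """Rebuild state as: latest snapshot + WAL entries written after it."""
    nodes = set(snapshot_nodes)
    for seq, op, node_id in sorted(wal):
        if seq <= snapshot_seq:
            continue  # already reflected in the snapshot
        if op == "create":
            nodes.add(node_id)
        elif op == "delete":
            nodes.discard(node_id)
    return nodes


# Mirrors the timeline: Snapshot 2 holds 150 nodes, then 20 more are created.
snapshot = ["n%d" % i for i in range(150)]
post_snapshot_wal = [(150 + i, "create", "n%d" % (149 + i)) for i in range(1, 21)]
print(len(recover(snapshot, 150, post_snapshot_wal)))  # 170
```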
Performance Impact
Disk Space
Before compaction:
WAL size grows unbounded:
After 1 day: ~10GB
After 1 week: ~70GB
After 1 month: ~300GB
After compaction (hourly):
WAL size bounded by interval:
Maximum size: ~500MB (1 hour of writes)
Average size: ~250MB
Disk savings: 99%+
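The figures above follow from simple arithmetic on the write rate. The sketch below assumes a steady 10 GB/day, the rate implied by the "After 1 day" figure; `wal_size_gb` is a hypothetical helper.

```python
def wal_size_gb(write_rate_gb_per_day: float, hours: float) -> float:
    """WAL volume accumulated over `hours` at a steady write rate."""
    return write_rate_gb_per_day * hours / 24.0


rate = 10.0  # GB/day, assumed steady
unbounded_month = wal_size_gb(rate, 30 * 24)  # no compaction for 30 days
bounded_hourly = wal_size_gb(rate, 1)         # at most one interval of writes
savings = 1 - bounded_hourly / unbounded_month
print(f"{unbounded_month:.0f} GB vs {bounded_hourly * 1000:.0f} MB, savings {savings:.2%}")
# 300 GB vs 417 MB, savings 99.86%
```

The computed 417 MB sits under the ~500MB ceiling quoted above, which leaves headroom for bursts.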
Recovery Time
Before compaction:
Recovery time = O(total history)
1 day: ~30 seconds
1 week: ~3 minutes
1 month: ~15 minutes
After compaction:
Recovery time = Snapshot load + O(interval writes)
Load snapshot: ~2 seconds
Replay WAL: ~1 second
Total: ~3 seconds (independent of total history length)
Runtime Overhead
- Snapshot creation: ~2-5ms per 1000 nodes (async, doesn't block writes)
- WAL truncation: ~10-50ms (happens every hour, negligible amortized cost)
- Total overhead: on the order of 0.001% of runtime at the default hourly interval
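As a quick check on the amortized cost, the truncation pause can be divided by the interval length. This sketch covers truncation only, since snapshot creation is asynchronous; `amortized_overhead` is a hypothetical helper.

```python
def amortized_overhead(pause_ms: float, interval_s: float) -> float:
    """Fraction of wall-clock time spent in a pause that occurs once per interval."""
    return (pause_ms / 1000.0) / interval_s


low = amortized_overhead(10, 3600)   # 10 ms once per hour
high = amortized_overhead(50, 3600)  # 50 ms once per hour
print(f"{low:.4%} to {high:.4%}")  # 0.0003% to 0.0014%
```

Shortening the interval to 15 minutes multiplies these fractions by four, which is still far below any level that would matter for throughput.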
Configuration
WAL Settings
| Setting | Default | Description |
|---|---|---|
| wal_dir | data/wal | WAL directory |
| wal_sync_mode | batch | Sync mode: immediate, batch, or none |
| wal_batch_sync_interval | 100ms | Batch sync frequency |
| wal_max_file_size | 100MB | File rotation trigger (bytes) |
| wal_max_entries | 100000 | File rotation trigger (count) |
| wal_snapshot_interval | 1h | Auto-compaction frequency |
| wal_auto_compaction_enabled | true | Enable/disable auto-compaction |
Tuning Snapshot Interval
Aggressive (every 15 minutes):
- Minimal WAL size
- Faster recovery
- More snapshot overhead
- Good for: High-write, limited disk space
Moderate (every hour — default):
- Balanced disk usage
- Good recovery time
- Low overhead
- Good for: Most use cases
Conservative (every 6 hours):
- Larger WAL size
- Slower recovery
- Minimal overhead
- Good for: Low-write, plenty of disk space
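One way to choose among these presets is to pick the longest interval whose worst-case WAL size still fits a disk budget. The sketch below assumes a steady write rate and uses the three intervals above as candidates; `longest_interval_within_budget` is a hypothetical helper, not a NornicDB setting.

```python
from typing import Optional, Sequence


def longest_interval_within_budget(
    write_rate_mb_per_hour: float,
    wal_budget_mb: float,
    candidates_hours: Sequence[float] = (0.25, 1, 6),
) -> Optional[float]:
    """Return the longest candidate interval (hours) whose WAL fits the budget.

    Returns None when even the shortest candidate would exceed the budget.
    """
    fitting = [h for h in candidates_hours
               if write_rate_mb_per_hour * h <= wal_budget_mb]
    return max(fitting) if fitting else None


print(longest_interval_within_budget(500, 1000))  # 1
print(longest_interval_within_budget(500, 5000))  # 6
```

A longer interval means less snapshot overhead, so the function prefers it whenever disk allows; recovery-time targets may argue for a shorter one.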
Monitoring
NornicDB exposes compaction metrics that you can use to monitor WAL health:
- Total snapshots created — number of successful compaction cycles since startup
- Last snapshot time — timestamp of the most recent snapshot
- WAL entry count — current number of entries in the active WAL
- WAL bytes written — total bytes in the active WAL
These metrics are available through the server's diagnostics and can be monitored via the admin UI or log output.
Troubleshooting
Issue: WAL still growing despite auto-compaction
Check:
- Verify auto-compaction is enabled in your configuration (wal_auto_compaction_enabled: true)
- Check snapshot directory for recent files:
ls -lh data/snapshots/ # Should see snapshot-<timestamp>.json files
- Check WAL size:
ls -lh data/wal/wal.log
Issue: Truncation errors
Symptom: Logs show "failed to truncate WAL"
Causes:
- Disk full
- Permission issues
- WAL file locked by another process
Solution:
# Check disk space
df -h
# Check permissions
ls -l data/wal/
chmod 644 data/wal/wal.log
# Check for locks
lsof | grep wal.log
Issue: Slow recovery after crash
Check snapshot age:
ls -lt data/snapshots/ | head -2 # line 1 is the "total" summary, line 2 is the newest snapshot
If the newest snapshot is old, auto-compaction may not be running. Verify your configuration and check server logs for compaction errors.
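The same check can be scripted for alerting. This sketch looks only at file modification times in the snapshot directory and flags compaction as stalled when the newest snapshot is older than twice the configured interval; the function names and the 2x slack factor are assumptions chosen for illustration.

```python
import glob
import os
import time
from typing import Optional


def newest_snapshot_age_seconds(snapshot_dir: str,
                                now: Optional[float] = None) -> Optional[float]:
    """Age of the most recently modified snapshot-*.json, or None if there is none."""
    paths = glob.glob(os.path.join(snapshot_dir, "snapshot-*.json"))
    if not paths:
        return None
    newest_mtime = max(os.path.getmtime(p) for p in paths)
    return (time.time() if now is None else now) - newest_mtime


def compaction_looks_stalled(age_s: Optional[float],
                             interval_s: float = 3600,
                             slack: float = 2.0) -> bool:
    """True when no snapshot exists or the newest one is older than slack * interval."""
    return age_s is None or age_s > slack * interval_s
```

For example, `compaction_looks_stalled(newest_snapshot_age_seconds("data/snapshots"))` returns True on a server with the default hourly interval whose last snapshot is more than two hours old.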
Best Practices
- Always enable auto-compaction in production. This is the default and should not be disabled unless you have a specific reason.
- Monitor snapshot creation. Check server logs or metrics to confirm snapshots are being created at the expected interval.
- Rotate old snapshots to avoid filling disk with historical snapshots:
find data/snapshots -name "snapshot-*.json" -mtime +7 -delete
- Test recovery regularly. Periodically verify that your latest snapshot can be loaded and that the WAL replays correctly.
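A purely age-based rotation can delete every snapshot if compaction has been stalled for longer than the cutoff. The sketch below is a more defensive variant that always keeps the newest snapshot and defaults to a dry run; `prune_snapshots` and its parameters are hypothetical, and its age test is a plain seconds comparison rather than the day rounding that find applies to -mtime.

```python
import glob
import os
import time
from typing import List, Optional


def prune_snapshots(snapshot_dir: str,
                    max_age_days: float = 7.0,
                    keep_newest: int = 1,
                    now: Optional[float] = None,
                    dry_run: bool = True) -> List[str]:
    """Return (and, unless dry_run, delete) snapshots older than max_age_days.

    The newest `keep_newest` snapshots are never deleted, whatever their age.
    """
    paths = sorted(glob.glob(os.path.join(snapshot_dir, "snapshot-*.json")),
                   key=os.path.getmtime, reverse=True)
    cutoff = (time.time() if now is None else now) - max_age_days * 86400
    doomed = [p for p in paths[keep_newest:] if os.path.getmtime(p) < cutoff]
    if not dry_run:
        for p in doomed:
            os.remove(p)
    return doomed
```

Running it once with the default `dry_run=True` and reviewing the returned paths before passing `dry_run=False` is a cheap safeguard.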