cap-0062.md

September 4, 2025 · View on GitHub

CAP: 0062
Title: Soroban Live State Prioritization
Working Group:
    Owner: Garand Tyson <@SirTyson>
    Authors: Garand Tyson <@SirTyson>
    Consulted: Dmytro Kozhevin <@dmkozh>, Nicolas Barry <@MonsieurNicolas>
Status: Final
Created: 2024-12-02
Discussion: https://github.com/stellar/stellar-protocol/discussions/1575
Protocol version: 23

Simple Summary

This proposal allows the network to evict PERSISTENT entries from the live state BucketList to a separate BucketList containing archival state. Note that evicted entries are not deleted from validators.

Working Group

As specified in the Preamble.

Motivation

This is a first step towards full State Archival, which will lower the storage requirements of validators and decrease the growth of History Archives. Additionally, this opens the door for database optimizations of live Soroban state, increasing limits and throughput, as well as automatic entry restoration.

Goals Alignment

This change is aligned with the goal of lowering the cost and increasing the scale of the network.

Abstract

Currently both live and archived Soroban state are maintained in a single BucketList DB. This makes data prioritization difficult since all state exists in a singular data structure. This proposal separates archived state and live state into two separate DBs. While these DBs are still both persisted on disk by validators, optimizing live state access is significantly easier. In particular, live state can be entirely cached in memory, removing disk reads from the Soroban transaction execution path, greatly increasing read limits and throughput.

Each validator will maintain two BucketLists, the Live BucketList and the Hot Archive BucketList. PERSISTENT Soroban data entries and code entries that have expired will be "evicted" from the Live BucketList and moved to the Hot Archive BucketList. Both the Live BucketList and Hot Archive BucketList are persisted on disk and written to History Archives.

Note that this proposal is a subset of CAP-0057. The Hot Archive here is the same as CAP-0057, and this CAP is a strict subset of CAP-0057. The purpose is not to replace CAP-0057, but to offer a smaller first step implementation wise.

Specification

XDR changes

enum BucketListType
{
    LIVE = 0,
    HOT_ARCHIVE = 1,
};

enum HotArchiveBucketEntryType
{
    HOT_ARCHIVE_METAENTRY = -1, // Bucket metadata, should come first.
    HOT_ARCHIVE_ARCHIVED = 0,   // Entry is Archived
    HOT_ARCHIVE_LIVE = 1,       // Entry was previously HOT_ARCHIVE_ARCHIVED but
                                // has been added back to the live BucketList.
};

union HotArchiveBucketEntry switch (HotArchiveBucketEntryType type)
{
case HOT_ARCHIVE_ARCHIVED:
    LedgerEntry archivedEntry;
case HOT_ARCHIVE_LIVE:
    LedgerKey key;
case HOT_ARCHIVE_METAENTRY:
    BucketMetadata metaEntry;
};

struct LedgerCloseMetaV1
{
    ...
    // same as LedgerCloseMetaV1 up to here

    // Soroban code, data, and TTL keys that are being evicted at this ledger.
    // Note: This has been renamed from evictedTemporaryLedgerKeys in V1
    LedgerKey evictedLedgerKeys<>;

    ...
};

enum LedgerEntryChangeType
{
    LEDGER_ENTRY_CREATED = 0, // entry was added to the ledger
    LEDGER_ENTRY_UPDATED = 1, // entry was modified in the ledger
    LEDGER_ENTRY_REMOVED = 2, // entry was removed from the ledger
    LEDGER_ENTRY_STATE = 3,   // value of the entry
    LEDGER_ENTRY_RESTORE = 4  // archived entry was restored in the ledger
};

union LedgerEntryChange switch (LedgerEntryChangeType type)
{
case LEDGER_ENTRY_CREATED:
    LedgerEntry created;
case LEDGER_ENTRY_UPDATED:
    LedgerEntry updated;
case LEDGER_ENTRY_REMOVED:
    LedgerKey removed;
case LEDGER_ENTRY_STATE:
    LedgerEntry state;
case LEDGER_ENTRY_RESTORE:
    LedgerEntry restored;
};

Semantics

With this change, the current BucketList is divided into two separate BucketLists as follows.

Live State BucketList

The Live State BucketList most closely resembles the current BucketList. It will contain all “live” state of the ledger, including Stellar classic entries, live Soroban entries, network config settings, etc. The Live State BucketList is published to the History Archive on every checkpoint ledger via the "history" category. The Live BucketList functions identically to the current BucketList.

Hot Archive BucketList

The Hot Archive is a BucketList that stores evicted PERSISTENT entries. It contains HotArchiveBucketEntry type entries and is constructed as follows:

  1. Whenever a PERSISTENT entry is evicted, the entry is deleted from the Live State BucketList and added to the Hot Archive as a HOT_ARCHIVE_ARCHIVED entry. The corresponding TTLEntry is deleted and not stored in the Hot Archive.
  2. If an archived entry is restored and the entry currently exists in the Hot Archive, the HOT_ARCHIVE_ARCHIVED node previously stored in the Hot Archive is overwritten by a HOT_ARCHIVE_LIVE entry. Similar to DEADENTRY in the Live BucketList, HOT_ARCHIVE_LIVE are dropped when the bottom most Bucket merges.

For Bucket merges, the newest version of a given key is always taken. At the bottom level, HOT_ARCHIVE_LIVE entries are dropped. The HOT_ARCHIVE_LIVE state indicates that the given key currently exists in the Live BucketList. Thus, any Hot Archive reference is out of date and can be dropped.

The current Hot Archive is published to the History Archive via the "history" category on every checkpoint ledger.

Hot Archive BucketList Depth and Spill Schedule

The Hot Archive is guaranteed to experience less churn than the Live BucketList, given that the only events than can modify the Hot Archive BucketList are eviction and restoration. Thus, the spill schedule and depth of the Hot Archive BucketList should not match that of the Live BucketList. Long term, it is expected that the Hot Archive BucketList become significantly bigger than the Live BucketList (assuming State Archival for classic entry types and increased Soroban adoption).

Specific configurations will need to be investigated, but a slower spill schedule is most likely optimal. A deeper BucketList (to accommodate the expected size difference between Live and Hot Archive) may not be necessary, as the spill schedule impacts the maximum "capacity" of a BucketList along with the depth.

Changes to History Archives

The HistoryArchiveState will be extended as follows:

version: number identifying the file format version
server: an optional debugging string identifying the software that wrote the file
networkPassphrase: an optional string identifying the networkPassphrase
currentLedger: a number denoting the ledger this file describes the state of
currentBuckets: an array containing an encoding of the live state bucket list for this ledger
hotArchiveBuckets: an array containing an encoding of the hot archive bucket list for this ledger

Bucket files for the hotArchiveBuckets will be stored just as Bucket files are currently stored.

Changes to LedgerHeader

While the XDR structure of LedgerHeader remains unchanged, bucketListHash will be changed to the following:

header.bucketListHash = SHA256(liveStateBucketListHash, hotArchiveHash)

Meta

Whenever a PERSISTENT entry is evicted (i.e. removed from the Live State BucketList and added to the Hot Archive), the entry key and its associated TTL key will be emitted via evictedLedgerKeys (this field has been renamed and was previously named evictedTemporaryLedgerKeys).

Whenever an entry is restored via RestoreFootprintOp, the LedgerEntry being restored and its associated TTL will be emitted as a LedgerEntryChange of type LEDGER_ENTRY_RESTORE. Note that entries can be restored from both the Live State and Hot Archive BucketList, because archived entries continue to exist in Live State until they are evicted as part of natural BucketList growth. The emitted meta does not distinguish between these.

Default Values for Network Configs

sorobanLiveStateTargetSizeBytes (formerly bucketListTargetSizeBytes) will need to be changed to reflect the new size calculations. Additionally, the "fee curve" may need to be revisted now that classic state growth no longer affects Soroban fees.

contractMaxSizeBytes will need to be increased to account for the new size calculations.

These changes will be included in the protocol 23 upgrade and will not be subject to a separate Soroban settings upgrade.

TODO: Initial values for network configs.

Design Rationale

CAP-0062 vs. CAP-0057

This CAP is a strict subset of CAP-0057. The intention is not to replace CAP-0057, but to implement this CAP first as a step towards CAP-0057. At current usage levels, "full state archival" as specified by CAP-0057 (i.e. evicted entries are deleted from validators) is not a necessary or useful optimization. The space savings of validators and the History Archive would be minimal, and the additional complexity, both from a computational and UX standpoint, do not justify the minimal storage savings. Long term, should usage increase, the tradeoffs introduced by CAP-0057 will make more sense, but that is not the case currently or in the near term future.

The reasoning behind introducing CAP-0057 before it was strictly necessary was to prevent breakage of applications. If applications and users were not prepared for State Archival, it would be very difficult to enable CAP-0057. However, because CAP-0046-12 is implemented and State Archival semantics are already required, upgrading to CAP-0057 should not be breaking, especially since much of the complexity, like proof generation, is handled automatically by RPC endpoints (via RPC's captive-core backend).

While there are no immediate benefits to the "full" state archival introduced in CAP-0057, there are significant short term and long term gains from "partial state archival" introduced in this CAP. Specifically, this CAP will allow for full in-memory Soroban live state caching and automatic restoration of Soroban entries via InvokeHostFunctionOp, to be detailed in a future CAP.

Fees and Limits Changes

Not counting Archive State

Because ledger state is now split into two BucketLists, the BucketList size component for storage fees must be reconsidered. Currently, the total size of the BucketList is used for Soroban related fees. This is frustrating to end users, as classic state dominates the BucketList size such that changes unrelated to Soroban have the most impact towards Soroban related fees. The purpose of the fee is to give back pressure to new writes when the BucketList is large. This should discourage writes (or financially prevent DOS attacks) and give time for State Archival to evict state, reduce the size of the BucketList, thus reducing fees. The issue is, classic entries dominate storage and are not subject to State Archival, so this back pressure system is not effective. In practice, should BucketList size rapidly increase, it is most likely due to classic state, and the only way to lower fees is via network config upgrade.

This CAP changes the BucketList size used in fee calculations to Live Soroban State Size. This means that archived state (i.e. the Hot Archive size) does not impact fees. Long term, it is expected that CAP-0057 will be implemented and the Hot Archive will be dropped from validators. Thus from a network health standpoint, the size of the Hot Archive does not have a significant impact on network sustainability.

Prior to CAP-0057, large Hot Archives do impact network sustainability, specifically with respect to History Archive size, catchup time, and node disk requirements. However, a fee based on Hot Archive size does not seem to address this issue. First, there is currently no way to remove state from the Hot Archive. Only restore operations remove Hot Archive state, but this state is just added back to the Live BucketList. Thus fees would continually increase without any benefit to network sustainability, as no action or system is currently in place to reduce Hot Archive size.

While not improving network sustainability, Hot Archive fees could prevent state based DOS attacks, where state is rapidly added to the network. However, unlike the live BucketList, operations cannot directly write state to the Hot Archive (with the exception of restores, which always decrease Hot Archive size). The rate of eviction (i.e. the rate at which state is added to the Hot Archive) can already be controlled via network config settings. The rate at which state is added to the Live BucketList is also already controlled by network config settings. Thus, enough tools are already in place to prevent state based DOS attacks without the addition of a Hot Archive based fee.

Changes required for CONTRACT_WASM

CAP-0065 and CAP-0066 build on this work to cache all live Soroban state in memory, including CONTRACT_DATA, TTL, and instantiated contract code. These caches open potential DOS vectors with the way CONTRACT_CODE size is metered in protocol 22.

For CONTRACT_DATA and TTL, we cache all live LedgerEntry. This does not present a DOS angle. We meter these entry types based on LedgerEntry size such that we have a 1 to 1 ratio between the bytes that are metered and the bytes that are cached in memory.

For CONTRACT_CODE, this is no longer the case. CAP-0065 caches instantiated modules in memory, which in the worst case could be up to 40x the size of the CONTRACT_CODE LedgerEntry size. This is a significant OOM risk, as sorobanLiveStateTargetSizeBytes must bound the amount of state validators are required to cache in memory. If CONTRACT_CODE xdr size is used, an attacker could upload code that once instantiated could bloat the cache size to sorobanLiveStateTargetSizeBytes * 40, causing an OOM based DOS of the network. To prevent this, we use the instantiated module memory cost model size instead of the CONTRACT_CODE LedgerEntry size.

The only down side is that uploading, restoring, and rent bumping CONTRACT_CODE will be more expensive from both a fee and limits standpoint. The fee increase is not a significant concern. However, when uploading/restoring a new CONTRACT_CODE entry with the new size calculation, writeBytes will be significantly higher than what is actually being written to disk, as we charge for the (much larger) in-memory size instead of the actual on-disk size. This means that we might need to increase txMaxWriteBytes to support uploading larger/more complex contracts despite these write sources not actually being utilized.

In the short term, this does not seem to be an issue, as we can set max contract sizes modestly and have significant buffer wrt write bytes limits. However, this may become an issue in the future. A future protocol may explore splitting limits and fees on write bytes for CONTRACT_CODE. We could charge writeBytes fees based on instantiated size to prevent DOS, but use the actual on-disk size for txMaxWriteBytes and ledgerMaxWriteBytes limits to avoid increasing limits artificially due to under utilization. However, this does not appear necessary for protocol 23.

Expectations for Downstream Systems

Meta changed outlined in CAP-0066. Otherwise, there will be minimal changes with some XDR fields being renamed.

Security Concerns

This introduces no novel security concerns. All state is still maintained by validators and written to History Archives. Splitting state into separate DBs has no inherit risks.

CAP-0065 and CAP-0066 introduce potential DOS vectors wrt sorobanLiveStateTargetSizeBytes addressed above in the fees section.

Test Cases

TBD

Implementation

TBD