Overlaybd

March 24, 2026

Overlaybd (overlay block device) is a novel layered, block-level image format designed for containers and secure containers, and also applicable to virtual machines. It is an open-source implementation of the paper "DADI: Block-Level Image Service for Agile and Elastic Application Deployment", USENIX ATC'20.

Scaling up Without Slowing Down: Accelerating Pod Start Time. KubeCon+CloudNativeCon Europe 2024

Overlaybd is based on PhotonLibOS, which is a high-efficiency LibOS framework.

Overlaybd has two core components:

  • Overlaybd is a block-device-based image format that provides a merged view of a sequence of block-based layers as a virtual block device. The LBA lookup algorithm employs a linearized B+ tree and AVX-512, speeding up lookups by up to 10x (see Lookup Performance).

  • Zfile is a compressed file format that supports seekable online decompression.
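To illustrate what a "merged view" of layers means, here is a minimal Python sketch. It is not overlaybd's algorithm: it swaps the linearized B+ tree and AVX-512 search for a plain binary search per layer, and the `Layer` class and `lookup` function are hypothetical names used only for this example.

```python
from bisect import bisect_right


class Layer:
    """A layer described by non-overlapping (start_lba, length) segments."""

    def __init__(self, name, segments):
        self.name = name
        self.segments = sorted(segments)
        self.starts = [start for start, _ in self.segments]

    def contains(self, lba):
        # Find the last segment whose start is <= lba, then check its extent.
        i = bisect_right(self.starts, lba) - 1
        if i < 0:
            return False
        start, length = self.segments[i]
        return lba < start + length


def lookup(layers, lba):
    """`layers` is ordered bottom to top; the topmost layer that maps `lba` wins."""
    for layer in reversed(layers):
        if layer.contains(lba):
            return layer.name
    return None  # a hole: no layer holds data for this address


layers = [Layer("base", [(0, 100)]), Layer("top", [(10, 5)])]
print(lookup(layers, 12), lookup(layers, 50), lookup(layers, 200))
```

Blocks written by an upper layer hide the same blocks in the layers below it, which is the property the merged block device relies on.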

This repository is an implementation of overlaybd based on TCMU.

Overlaybd can be used as the storage backend of Accelerated Container Image, a remote container image solution that fetches image data on demand, without downloading and unpacking the whole image before the container starts.

Because block devices are universal, overlaybd is also a widely applicable image format for most runtimes, including qemu/kvm and any other runtime that supports a block or SCSI API.

Overlaybd is a non-core sub-project of containerd.

Setup

System Requirements

Overlaybd provides virtual block devices through TCMU, so the TCMU kernel module is required. TCMU is implemented in the Linux kernel and supported by most Linux distributions.

Check and load the target_core_user module.

modprobe target_core_user
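To confirm that the module is loaded, one option is to scan /proc/modules. The sketch below is a minimal check; `module_loaded` is a hypothetical helper name, and a module compiled into the kernel does not appear in /proc/modules, so a False result is not conclusive.

```python
def module_loaded(name, modules_file="/proc/modules"):
    """Return True if `name` is listed as a loaded module in `modules_file`."""
    try:
        with open(modules_file) as f:
            # Each line starts with the module name, followed by size, refcount, etc.
            return any(line.split()[0] == name for line in f if line.strip())
    except FileNotFoundError:
        return False


if __name__ == "__main__":
    print("target_core_user loaded:", module_loaded("target_core_user"))
```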

Install From RPM/DEB

You may download our RPM/DEB packages from the Release page and install them.

The binaries are installed to /opt/overlaybd/bin/.

Run /opt/overlaybd/bin/overlaybd-tcmu; its log is written to /var/log/overlaybd.log.

It is better to run overlaybd-tcmu as a service so that it can be restarted after unexpected crashes.

Build From Source

Requirements

To build overlaybd from source code, the following dependencies are required:

  • CMake >= 3.14

  • gcc/g++ >= 7

  • Libaio, libcurl, libnl3, glib2 and openssl runtime and development libraries.

    • CentOS 7/Fedora: sudo yum install libaio-devel libcurl-devel openssl-devel libnl3-devel libzstd-static e2fsprogs-devel
    • CentOS 8: sudo yum install libaio-devel libcurl-devel openssl-devel libnl3-devel libzstd-devel e2fsprogs-devel
    • Debian/Ubuntu: sudo apt install libcurl4-openssl-dev libssl-dev libaio-dev libnl-3-dev libnl-genl-3-dev libgflags-dev libzstd-dev libext2fs-dev pkg-config automake libtool # libgtest-dev // for test
    • Mariner/AzureLinux: sudo yum install libaio-devel libcurl-devel openssl-devel libnl3-devel e2fsprogs-devel glibc-devel libzstd-devel binutils ca-certificates-microsoft build-essential

Build

You need git to check out the source code:

git clone https://github.com/containerd/overlaybd.git
cd overlaybd
git submodule update --init

The whole project is managed by CMake. Binaries and resource files will be installed to /opt/overlaybd/.

mkdir build
cd build
cmake .. # -DCMAKE_BUILD_TYPE=Debug -DCMAKE_EXPORT_COMPILE_COMMANDS=true -DBUILD_TESTING=true
make -j
sudo make install

Some versions of libcurl and openssl have API changes. To build known-compatible versions of libcurl and openssl from source and link them into the executables as static libraries, use the option below.

Note that building libcurl and openssl depends on autoconf, automake and libtool.

cmake -D BUILD_CURL_FROM_SOURCE=1 ..

To use the original libext2fs instead of our customized libext2fs:

cmake -D ORIGIN_EXT2FS=1 ..

For more information about ORIGIN_EXT2FS go to USERSPACE_CONVERTOR.

To use DSA hardware to accelerate CRC calculation:

cmake -D ENABLE_DSA=1 ..

To use AVX-512 to accelerate CRC calculation:

cmake -D ENABLE_ISAL=1 ..

To use QAT to accelerate compression/decompression:

cmake -D ENABLE_QAT=1 ..

For more information go to overlaybd/src/overlaybd/zfile/README.md.

Finally, set up a systemd service for the overlaybd-tcmu backstore.

sudo systemctl enable /opt/overlaybd/overlaybd-tcmu.service
sudo systemctl start overlaybd-tcmu

Configuration

overlaybd config

The default configuration file overlaybd.json is installed to /etc/overlaybd/.

{
    "logConfig": {
        "logLevel": 1,
        "logPath": "/var/log/overlaybd.log"
    },
    "cacheConfig": {
        "cacheType": "file",
        "cacheDir": "/opt/overlaybd/registry_cache",
        "cacheSizeGB": 4
    },
    "gzipCacheConfig": {
        "enable": true,
        "cacheDir": "/opt/overlaybd/gzip_cache",
        "cacheSizeGB": 4
    },
    "credentialConfig": {
        "mode": "file",
        "path": "/opt/overlaybd/cred.json"
    },
    "ioEngine": 0,
    "download": {
        "enable": true,
        "delay": 600,
        "delayExtra": 30,
        "maxMBps": 100
    },
    "p2pConfig": {
        "enable": false,
        "address": "localhost:19145/dadip2p"
    },
    "exporterConfig": {
        "enable": false,
        "uriPrefix": "/metrics",
        "port": 9863,
        "updateInterval": 60000000
    },
    "enableAudit": true,
    "auditPath": "/var/log/overlaybd-audit.log",
    "serviceConfig": {
        "enable": false,
        "address": "http://127.0.0.1:9862"
    }
}

Field | Description
--- | ---
logConfig.logLevel | The log level for the log file: 0 - DEBUG, 1 - INFO, 2 - WARN, 3 - ERROR.
logConfig.logPath | The path of the log file. /var/log/overlaybd.log is the default value.
logConfig.logSizeMB | The size limit of the log file, in MB. 10 is the default (10 MB).
logConfig.logRotateNum | The number of rotated log files to keep. 3 is the default.
ioEngine | The IO engine used to open local files: psync 0, libaio 1, posix aio 2.
cacheConfig.cacheType | The cache type used. file, ocf and download are supported.
cacheConfig.cacheDir | The cache directory for remote image data.
cacheConfig.cacheSizeGB | The maximum size of the cache, in GB.
cacheConfig.refillSize | The refill size from the source, in bytes. 262144 is the default (256 KB).
gzipCacheConfig.enable | Whether the decompressed gzip file cache is enabled.
gzipCacheConfig.cacheDir | The cache directory for decompressed gzip data.
gzipCacheConfig.cacheSizeGB | The maximum size of the cache, in GB.
gzipCacheConfig.refillSize | The refill size from the source, in bytes. 262144 is the default (256 KB).
credentialFilePath (legacy) | The credential used for fetching images from the registry. /opt/overlaybd/cred.json is the default value.
credentialConfig.mode | Authentication mode for lazy-loading. file means reading the credential from credentialConfig.path; http means sending an HTTP request to credentialConfig.path; https means sending an HTTPS request to credentialConfig.path, with optional client certificate authentication and CA pinning.
credentialConfig.path | The credential file path or URL, depending on the mode.
credentialConfig.client_cert_path | Optional. Path to the client certificate file (https mode). May contain the private key in the same PEM file.
credentialConfig.client_key_path | Optional. Path to the client private key file (https mode). Only needed when the key is separate from the certificate.
credentialConfig.server_ca_path | Optional. Path to the CA certificate used to verify the server (https mode). If omitted, the system CA bundle is used. When set, only this CA file is trusted.
download.enable | Whether background downloading is enabled.
download.delay | The number of seconds to wait before starting the download task after the overlaybd device is launched.
download.delayExtra | A random extra delay added to delay, to avoid starting too many tasks at the same time.
download.maxMBps | The speed limit of a download task, in MB/s.
download.blockSize | The download block size from the source, in bytes. 262144 is the default (256 KB).
p2pConfig.enable | Whether the p2p proxy is enabled.
p2pConfig.address | The proxy for p2p download. The format is localhost:<P2PConfig.Port>/<P2PConfig.APIKey>, depending on dadip2p.yaml.
exporterConfig.enable | Whether to create a server that exposes Prometheus metrics.
exporterConfig.uriPrefix | The URI prefix for exported metrics.
exporterConfig.port | The port of the HTTP server that exposes metrics.
exporterConfig.updateInterval | The interval for updating metrics, in microseconds.
enableAudit | Whether audit is enabled.
enableThread | Whether the overlaybd device runs in a separate thread. Note that cacheType should be ocf. false is the default.
auditPath | The path of the audit file. /var/log/overlaybd-audit.log is the default value.
registryFsVersion | The registry client version: 'v1' is libcurl based, 'v2' is photon http based. 'v2' is the default value.
prefetchConfig.concurrency | The prefetch concurrency for reloading a trace. 16 is the default.
certConfig.certFile | The path of the SSL/TLS client certificate file.
certConfig.keyFile | The path of the SSL/TLS client key file.
userAgent | A customized userAgent to identify HTTP requests. The default value is the package version, like 'overlaybd/1.1.14-6c449832'.
serviceConfig.enable | Whether the live snapshot API service is enabled. false is the default.
serviceConfig.address | The listening address of the API service. The default is http://127.0.0.1:9862.

NOTE: download is the config for background downloading. After an overlaybd device is launched, a background task fetches the whole blobs into local directories. After downloading, I/O requests are directed to the local files. Unlike other options, the download config is reloaded each time a device is launched.
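The enumerated values in the field descriptions above can be checked before starting overlaybd-tcmu. The following is a minimal sketch, not a tool shipped with overlaybd: `check_config` is a hypothetical helper, it only covers the enumerations and the enableThread note listed above, and it assumes that absent fields fall back to overlaybd's own defaults.

```python
import json

LOG_LEVELS = (0, 1, 2, 3)                  # DEBUG, INFO, WARN, ERROR
IO_ENGINES = (0, 1, 2)                     # psync, libaio, posix aio
CACHE_TYPES = ("file", "ocf", "download")
CRED_MODES = ("file", "http", "https")


def check_config(cfg):
    """Return a list of problems found in a parsed overlaybd.json dict."""
    problems = []

    def check(path, allowed):
        node = cfg
        for key in path.split("."):
            if not isinstance(node, dict) or key not in node:
                return  # absent field: nothing to validate here
            node = node[key]
        if node not in allowed:
            problems.append(f"{path}: {node!r} is not one of {list(allowed)}")

    check("logConfig.logLevel", LOG_LEVELS)
    check("ioEngine", IO_ENGINES)
    check("cacheConfig.cacheType", CACHE_TYPES)
    check("credentialConfig.mode", CRED_MODES)
    if cfg.get("enableThread") and cfg.get("cacheConfig", {}).get("cacheType") != "ocf":
        problems.append("enableThread: cacheConfig.cacheType should be ocf")
    return problems


# Usage (path taken from this document):
#   with open("/etc/overlaybd/overlaybd.json") as f:
#       print(check_config(json.load(f)) or "no problems found")
```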

credential config

Important: The corresponding credential has to be set before launching devices, if the registry is not public.

Credentials are reloaded when authentication is required. If temporary credentials are used, they have to be updated before they expire; otherwise overlaybd keeps reloading until a valid credential is set.

Overlaybd supports several credential modes. Here are some example credentialConfig fields.

  • mode file

    credentialConfig.path should point to a file in a format similar to '.docker/config.json', like this:

#### /etc/overlaybd/config.json ####
{
  "logLevel": 1,
  "logPath": "/var/log/overlaybd.log",
  ...
  "credentialConfig": {
      "mode": "file",
      "path": "/opt/overlaybd/cred.json"
    },
  ...
}
#### /opt/overlaybd/cred.json ####
{
  "auths": {
    "hub.docker.com": {
      "username": "username",
      "password": "password"
    },
    "hub.docker.com/hello/world": {
      "auth": "dXNlcm5hbWU6cGFzc3dvcmQK"
    }
  }
}
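In the '.docker/config.json' convention, the auth value is the base64 encoding of "username:password". The sketch below shows how such a value can be produced; `docker_auth` is a hypothetical helper name. Note that the sample value above ends in "K" and decodes to "username:password" followed by a newline, which is what encoding with a trailing newline produces; whether overlaybd tolerates that newline is not stated here, so encoding without it is the safer assumption.

```python
import base64


def docker_auth(username, password):
    """Encode credentials as base64("username:password"), with no trailing newline."""
    return base64.b64encode(f"{username}:{password}".encode()).decode()


print(docker_auth("username", "password"))  # dXNlcm5hbWU6cGFzc3dvcmQ=
```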
  • mode http

    credentialConfig.path should be the listening address of a server, implemented by the developer, that replies with credential information.

#### /etc/overlaybd/config.json ####
{
  "logLevel": 1,
  "logPath": "/var/log/overlaybd.log",
  ...
  "credentialConfig": {
      "mode": "http",
      "path": "localhost:19876/auth"
    },
  ...
}

Overlaybd sends an HTTP request to the server with a remote_url parameter, like this:

GET "localhost:19876/auth?remote_url=https://hub.docker.com/v2/overlaybd/ubuntu/blobs/sha256:47e63559a8487efb55b2f1ccea9cfc04110a185c49785fdf1329d1ea462ce5f0"

The server response should be formatted as follows:

{
  "traceId": "${trace_id}",
  "success": true or false,
  "data": {
    "auths": {
      "hub.docker.com": {
        "username": "username",
        "password": "password"
      }
    }
  }
}

A sample HTTP server is provided in test/simple_auth_server.cpp.
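For quick experiments, a credential server with the response shape described above can also be sketched in Python. This is an illustration under assumptions, not a port of test/simple_auth_server.cpp: the `CREDENTIALS` table, the fixed trace ID, and the empty auths object returned on failure are all hypothetical choices.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# Hypothetical in-memory credential table keyed by registry host.
CREDENTIALS = {"hub.docker.com": {"username": "username", "password": "password"}}


def build_response(remote_url):
    """Build the reply body for the registry host named in remote_url."""
    host = urlparse(remote_url).netloc
    cred = CREDENTIALS.get(host)
    return {
        "traceId": "example-trace-id",  # assumption: any opaque string
        "success": cred is not None,
        "data": {"auths": {host: cred} if cred else {}},
    }


class AuthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Extract ?remote_url=... from the request line.
        query = parse_qs(urlparse(self.path).query)
        remote_url = query.get("remote_url", [""])[0]
        body = json.dumps(build_response(remote_url)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the example quiet


def make_server(port=19876):
    """Create a server bound to localhost; call .serve_forever() to run it."""
    return HTTPServer(("127.0.0.1", port), AuthHandler)
```

Calling `make_server().serve_forever()` serves requests on localhost:19876, matching the path used in the example config.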

  • mode https

    the credentialConfig.path should be an HTTPS server listening address. Unlike http mode, the https:// scheme prefix must be included in the path (e.g. https://localhost:19876/auth). The optional client_cert_path/client_key_path fields enable client certificate authentication, and server_ca_path pins trust to a specific CA. For a local auth server, providing all three fields secures communication exclusively with that server (mutual TLS).

#### /etc/overlaybd/config.json ####
{
  "logLevel": 1,
  "logPath": "/var/log/overlaybd.log",
  ...
  "credentialConfig": {
      "mode": "https",
      "path": "https://localhost:19876/auth",
      "client_cert_path": "/etc/overlaybd/client.crt",
      "client_key_path": "/etc/overlaybd/client.key",
      "server_ca_path": "/etc/overlaybd/ca.crt"
    },
  ...
}

Overlaybd sends an HTTPS request with mTLS to the server with a remote_url parameter, like this:

GET "https://localhost:19876/auth?remote_url=https://hub.docker.com/v2/overlaybd/ubuntu/blobs/sha256:47e63559a8487efb55b2f1ccea9cfc04110a185c49785fdf1329d1ea462ce5f0"

The server response format is the same as in http mode.

All three TLS fields are optional and independently configured:

  • client_cert_path sets the client certificate. If the PEM file also contains the private key, client_key_path can be omitted.
  • client_key_path sets the client private key. Only needed when the key is in a separate file from the certificate.
  • If server_ca_path is omitted, the system CA bundle is used to verify the server certificate. When server_ca_path is set, only the specified CA file is used — the system CA bundle is not consulted.
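The same three-field semantics can be reproduced with Python's standard ssl module, which is handy for testing an auth server before pointing overlaybd at it. This is an analogy only: overlaybd does not use Python, and `build_tls_context` is a hypothetical helper whose parameters merely mirror the config field names.

```python
import ssl


def build_tls_context(client_cert_path=None, client_key_path=None, server_ca_path=None):
    """Build a client-side TLS context mirroring the three optional fields."""
    # With cafile=None the system CA bundle is loaded; with a cafile,
    # only the certificates in that file are trusted.
    ctx = ssl.create_default_context(cafile=server_ca_path)
    if client_cert_path:
        # keyfile=None makes Python read the private key from the certificate PEM.
        ctx.load_cert_chain(certfile=client_cert_path, keyfile=client_key_path)
    return ctx


# Usage (paths taken from the example config above):
#   import urllib.request
#   ctx = build_tls_context("/etc/overlaybd/client.crt",
#                           "/etc/overlaybd/client.key",
#                           "/etc/overlaybd/ca.crt")
#   urllib.request.urlopen("https://localhost:19876/auth?remote_url=...", context=ctx)
```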

Usage

Use with containerd

Please install overlaybd and refer to Accelerated Container Image. Overlaybd is well integrated with containerd and easy to use.

Standalone Usage

For other scenarios, users can use overlaybd manually. Overlaybd works as a backing store of TCMU, so users can run an overlaybd image by interacting with configfs.

Config file

A config file is required to describe an overlaybd image. Only local images and registry images are supported. Here is a sample JSON config file:

{
      "repoBlobUrl": "https://obd.cr.aliyuncs.com/v2/overlaybd/sample/blobs",
      "lowers" : [
          {
              "file" : "/opt/overlaybd/layer0"
          },
          {
              "dir": "/var/lib/containerd/root/io.containerd.snapshotter.v1.overlayfs/snapshots/1000",
              "digest": "sha256:e3b0d67cfa3a37dfed187badc7766e3db64d492c4db2dc4260997b41af1b28f3",
              "size": 43446424
          }
      ],
      "resultFile": "/home/overlaybd/1/result"
}

Field | Description
--- | ---
repoBlobUrl | The URL of the repository blobs of the remote image. It is required for a registry image.
lowers | A list describing the lower layers of the image, in bottom-up order.
file | The corresponding layer is a local file. If a local file is used, other options are not needed.
dir | The corresponding layer will be stored in this directory after downloading.
digest and size | The digest and size of a remote layer. They are required for a remote layer.
resultFile | The file for saving the launch result. If a device is launched successfully, success is written into the file; otherwise, the failure reason is reported in this file.
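The field rules above can be checked mechanically before a config is handed to overlaybd. The sketch below is a hypothetical helper, `validate_image_config`, which encodes only the requirements stated in the field descriptions: a layer with file needs nothing else, a remote layer needs digest and size, and a registry image needs repoBlobUrl.

```python
import json


def validate_image_config(cfg):
    """Return a list of problems found in a parsed image config dict."""
    errors = []
    has_remote = False
    for i, layer in enumerate(cfg.get("lowers", [])):
        if "file" in layer:
            continue  # local layer: no other options are needed
        has_remote = True
        for key in ("digest", "size"):
            if key not in layer:
                errors.append(f"lowers[{i}]: remote layer is missing '{key}'")
    if has_remote and not cfg.get("repoBlobUrl"):
        errors.append("repoBlobUrl is required for a registry image")
    return errors


sample = json.loads("""
{
  "repoBlobUrl": "https://obd.cr.aliyuncs.com/v2/overlaybd/sample/blobs",
  "lowers": [
    {"file": "/opt/overlaybd/layer0"},
    {"dir": "/var/lib/containerd/root/io.containerd.snapshotter.v1.overlayfs/snapshots/1000",
     "digest": "sha256:e3b0d67cfa3a37dfed187badc7766e3db64d492c4db2dc4260997b41af1b28f3",
     "size": 43446424}
  ],
  "resultFile": "/home/overlaybd/1/result"
}
""")
print(validate_image_config(sample))  # []
```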

Start up

Here is an example to start up an overlaybd image. First, create the overlaybd tcmu device.

mkdir -p /sys/kernel/config/target/core/user_1/vol1
echo -n dev_config=overlaybd//root/config.v1.json > /sys/kernel/config/target/core/user_1/vol1/control
echo -n 1 > /sys/kernel/config/target/core/user_1/vol1/enable

Then, create a tcm loop device.

mkdir -p /sys/kernel/config/target/loopback/naa.123456789abcdef/tpgt_1/lun/lun_0
echo -n "naa.123456789abcdef" > /sys/kernel/config/target/loopback/naa.123456789abcdef/tpgt_1/nexus
ln -s /sys/kernel/config/target/core/user_1/vol1 /sys/kernel/config/target/loopback/naa.123456789abcdef/tpgt_1/lun/lun_0/vol1

A block device /dev/sdX is then generated, and the overlaybd image can be used locally. Furthermore, the overlaybd device can be used on remote hosts through iSCSI.

Clean up

Just remove the files and directories in configfs in reverse order.

Writable layer

Overlaybd provides a log-structured writable layer and a sparse-file writable layer. The log-structured layer is append-only and converts all writes into sequential writes, so the image build/convert process is usually faster. The sparse-file writable layer is more suitable for the container runtime.

Use overlaybd-create to create a writable layer.

  /opt/overlaybd/bin/overlaybd-create ${data_file} ${index_file} ${virtual size}

Use -s to create a sparse-file writable layer. The upper option in the overlaybd config file must be set to use a writable layer. Only one writable layer is available, and it always works as the top layer. Example:

{
    "repoBlobUrl": ...,
    "lowers" : [
        ...
    ],
    "upper": {
        "index": "${index_file}",
        "data": "${data_file}"
    },
    "resultFile": "/home/overlaybd/1/result"
}

If upper is set, the overlaybd device is launched as a writable device. The differences produced by writes are stored in the index and data files of upper.

After writing data and destroying the device, the overlaybd-commit command must be executed to commit the layer into a read-only layer, which can later be used as a lower layer.

/opt/overlaybd/bin/overlaybd-commit ${data_file} ${index_file} ${commit_file}

Finally, the committed layer can be compressed if needed.

/opt/overlaybd/bin/overlaybd-zfile ${commit_file} ${zfile}

The zfile can be used as lower layer with online decompression.

Live Snapshot

Overlaybd supports creating live snapshots without stopping the device. This feature allows you to capture the current state of a writable layer and stack a new writable layer on top.

Device ID

To use the live snapshot feature, you need to specify a device ID when creating the overlaybd device. The device ID is appended to the config path with a semicolon separator:

echo -n "dev_config=overlaybd//root/config.v1.json;123" > /sys/kernel/config/target/core/user_1/vol1/control

Enable API Service

Add the following to your overlaybd.json:

"serviceConfig": {
    "enable": true,
    "address": "http://127.0.0.1:9862"
}

Create Snapshot

Send an HTTP POST request to the /snapshot endpoint:

curl -X POST "http://127.0.0.1:9862/snapshot?dev_id=123&config=/path/to/new_config.json"

The response will be in JSON format:

{
    "success": true,
    "message": "Snapshot created successfully"
}
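The same request can be issued from Python with only the standard library. This is a minimal sketch: `snapshot_url` and `create_snapshot` are hypothetical helper names, the 30-second timeout is an arbitrary choice, and the error handling assumes that a failed snapshot is reported with "success": false in the JSON reply.

```python
import json
import urllib.request
from urllib.parse import urlencode


def snapshot_url(address, dev_id, config_path):
    """Build the /snapshot URL; '/' is kept unescaped to match the curl example."""
    query = urlencode({"dev_id": dev_id, "config": config_path}, safe="/")
    return f"{address.rstrip('/')}/snapshot?{query}"


def create_snapshot(address, dev_id, config_path, timeout=30):
    """POST to the snapshot endpoint and return the decoded JSON reply."""
    req = urllib.request.Request(
        snapshot_url(address, dev_id, config_path), data=b"", method="POST")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        reply = json.load(resp)
    if not reply.get("success"):
        raise RuntimeError(reply.get("message", "snapshot failed"))
    return reply


print(snapshot_url("http://127.0.0.1:9862", 123, "/path/to/new_config.json"))
```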

New Config Format

The new config file should include the current upper layer as the last lower layer:

{
    "lowers": [
        {
            "file": "/opt/overlaybd/layer0"
        },
        {
            "file": "/path/to/current_upper_data.lsmt"
        }
    ],
    "upper": {
        "index": "/path/to/new_upper_index.lsmt",
        "data": "/path/to/new_upper_data.lsmt"
    }
}

Note: The new upper layer must be different from the old upper layer.
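Deriving the new config from the current one is mechanical, so it can be scripted. The sketch below assumes, following the example above, that the old upper layer is referenced as a lower layer through its data file; `next_snapshot_config` is a hypothetical helper name.

```python
import copy


def next_snapshot_config(current, new_index, new_data):
    """Return a config where the current upper becomes the last lower layer."""
    old_upper = current["upper"]
    if new_index == old_upper["index"] or new_data == old_upper["data"]:
        raise ValueError("the new upper layer must be different from the old upper layer")
    new_cfg = copy.deepcopy(current)  # leave the caller's config untouched
    new_cfg["lowers"] = new_cfg.get("lowers", []) + [{"file": old_upper["data"]}]
    new_cfg["upper"] = {"index": new_index, "data": new_data}
    return new_cfg


current = {
    "lowers": [{"file": "/opt/overlaybd/layer0"}],
    "upper": {"index": "/path/to/current_upper_index.lsmt",
              "data": "/path/to/current_upper_data.lsmt"},
}
new_cfg = next_snapshot_config(current,
                               "/path/to/new_upper_index.lsmt",
                               "/path/to/new_upper_data.lsmt")
print(new_cfg["lowers"][-1])
```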

Kernel module

DADI_kmod is a kernel module for overlaybd. It can expose local overlaybd-format files as a loop device or a device-mapper target.

Contributing

Contributions are welcome! See CONTRIBUTING.

Licenses

Overlaybd is released under the Apache License, Version 2.0.