PocketLzma

January 15, 2023

PocketLzma is a cross-platform, single-header LZMA compression/decompression library for C++11. All you need is the single pocketlzma.hpp file, and you are good to go! The library can read data from both files and memory.

PocketLzma is designed to be modern yet portable: it relies only on C++11, so it can be used in projects where newer versions of C++ are unavailable or not permitted.

What is LZMA?

LZMA stands for Lempel–Ziv–Markov chain Algorithm, an algorithm for lossless data compression. It has been developed by Igor Pavlov since the late 1990s. Igor Pavlov's C implementation is in fact what provides the core functionality PocketLzma needs to work (with a few modifications to make it cross-platform and single-header compatible).

Documentation

Doxygen-generated documentation for PocketLzma can be found HERE

How to use PocketLzma

PocketLzma is first and foremost designed to be easy to use! For a fully working code example, take a look at the pocketlzma_program.cpp file in the root folder of this project. Otherwise, follow along for some short examples.

First, an important note! Since PocketLzma internally uses Igor Pavlov's C code, you are required to #define POCKETLZMA_LZMA_C_DEFINE ONCE (and only once) before including pocketlzma.hpp. This ensures that the implementations of the C code are only included once. In other words: if you include pocketlzma.hpp in several places in your project, exactly one of them must have the #define before the include, and the others must not.
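To illustrate why the define must appear exactly once, here is a minimal single-file sketch of the implementation-macro pattern that many single-header libraries follow. The header name tiny.hpp, the macro TINY_IMPLEMENTATION, and the function tiny_add are hypothetical stand-ins. Whether pocketlzma.hpp is organised exactly like this internally is an assumption.

```cpp
// Hypothetical miniature of a single-header library ("tiny.hpp"),
// inlined here so the sketch compiles as one file.

// Exactly one source file in a project defines the macro before the include.
#define TINY_IMPLEMENTATION

// ---- begin tiny.hpp (hypothetical) ----
#ifndef TINY_HPP
#define TINY_HPP

// Declarations are visible to every file that includes the header.
int tiny_add(int a, int b);

#ifdef TINY_IMPLEMENTATION
// Definitions are emitted only where the macro is defined. If two source
// files both defined the macro, the linker would see two copies of
// tiny_add and report a duplicate-symbol error.
int tiny_add(int a, int b)
{
    return a + b;
}
#endif // TINY_IMPLEMENTATION

#endif // TINY_HPP
// ---- end tiny.hpp ----
```

In a real project, every other source file would include the header without the define and rely on the one file that has it to provide the definitions at link time.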

The File API

PocketLzma has a very simple API for file communication. However, you are free to use something else if you want; PocketLzma doesn't care how you got your data. Example:

std::string path = "./yourFile.txt";

//When you are 100% sure your loading will never fail, you can use this
std::vector<uint8_t> data1 = plz::File::FromFile(path);

//However - I recommend to use this overload instead:
std::vector<uint8_t> data2;
plz::FileStatus fileStatus = plz::File::FromFile(path, data2);
//If something went wrong
if(fileStatus.status() != plz::FileStatus::Code::Ok)
{
    plz::FileStatus::Code statusCode = fileStatus.status(); //PocketLzma status code. Useful when an error occurs that does not throw an exception.
    
    //You may or may not have some error information here
    int code = fileStatus.code();                       //Code returned from the OS in cases where an exception is thrown
    std::string msg = fileStatus.message();             //Message from the OS in cases where an exception is thrown
    std::string exception = fileStatus.exception();     //Exception message from the OS in cases where an exception is thrown
    std::string category = fileStatus.category();       //Error category defined by the OS in cases where an exception is thrown
}

//PocketLzma can use in-memory data directly, but you can use this if you want to copy it into a byte vector.
std::vector<uint8_t> memoryData;
plz::File::FromMemory(memfiles::_JSON_TEST_OK_HEADER_LZMA, memfiles::_JSON_TEST_OK_HEADER_LZMA_SIZE, memoryData);

//Finally, you can write to files like this
std::string writePath = "./yourOutputFile.txt";
plz::FileStatus fileWriteStatus = plz::File::ToFile(writePath, data2);
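Since PocketLzma does not care where the bytes come from, the file helpers can be swapped for plain standard-library I/O. The following is a minimal sketch of that idea. The helper names readAllBytes and writeAllBytes are hypothetical and are not part of PocketLzma.

```cpp
#include <cstdint>
#include <fstream>
#include <ios>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper (not part of PocketLzma): reads a whole file into 'out'.
// Returns false if the file cannot be opened or a read error occurs.
bool readAllBytes(const std::string& path, std::vector<uint8_t>& out)
{
    std::ifstream in(path, std::ios::binary);
    if (!in)
        return false;
    out.assign(std::istreambuf_iterator<char>(in),
               std::istreambuf_iterator<char>());
    return !in.bad();
}

// Hypothetical helper (not part of PocketLzma): writes 'data' to 'path',
// replacing any existing file. Returns false on failure.
bool writeAllBytes(const std::string& path, const std::vector<uint8_t>& data)
{
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out)
        return false;
    if (!data.empty())
    {
        out.write(reinterpret_cast<const char*>(data.data()),
                  static_cast<std::streamsize>(data.size()));
    }
    out.flush();
    return out.good();
}
```

The resulting std::vector<uint8_t> has the same type as the vectors used in the examples in this document, so it can be handed to the compress and decompress calls in the same way.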

Compression

#define POCKETLZMA_LZMA_C_DEFINE
#include "pocketlzma.hpp"

int main()
{
    std::string path = "./../../content/to_compress/from/json_test.json";
    std::vector<uint8_t> data;
    std::vector<uint8_t> compressedData;
    plz::FileStatus fileStatus = plz::File::FromFile(path, data);
    if(fileStatus.status() == plz::FileStatus::Code::Ok)
    {
        plz::PocketLzma p;
        /*!
         *  Possibilities:
         *  Default
         *  Fastest
         *  Fast
         *  GoodCompression
         *  BestCompression
         */
        p.usePreset(plz::Preset::GoodCompression); //Default is used when preset is not set.
        plz::StatusCode status = p.compress(data, compressedData);
        if(status == plz::StatusCode::Ok)
        {
            std::string outputPath = "./../../content/to_compress/to/j.lzma";
            plz::FileStatus writeStatus = plz::File::ToFile(outputPath, compressedData);
            if(writeStatus.status() == plz::FileStatus::Code::Ok)
            {
                //Process completed successfully!
            }
        }
    }
    return 0;
}

Compression (Advanced)

If you are familiar with LZMA, you can use your own compression settings instead of presets. This makes it possible to tune every parameter to find your own balance of speed and compression ratio.

#define POCKETLZMA_LZMA_C_DEFINE
#include "pocketlzma.hpp"

int main()
{
    std::string path = "./../../content/to_compress/from/json_test.json";
    std::vector<uint8_t> data;
    std::vector<uint8_t> compressedData;
    plz::FileStatus fileStatus = plz::File::FromFile(path, data);
    if(fileStatus.status() == plz::FileStatus::Code::Ok)
    {
        plz::Settings settings;
        // These are the actual default values used by the Default preset,
        // shown here to illustrate the parameters that can be tuned
        settings.level                = 5;
        settings.dictionarySize       = 1 << 24;
        settings.literalContextBits   = 3;
        settings.literalPositionBits  = 0;
        settings.positionBits         = 2;
        settings.fastBytes            = 32;

        plz::PocketLzma p {settings}; //You can alternatively use: p.setSettings(settings);
        plz::StatusCode status = p.compress(data, compressedData);
        if(status == plz::StatusCode::Ok)
        {
            std::string outputPath = "./../../content/to_compress/to/j.lzma";
            plz::FileStatus writeStatus = plz::File::ToFile(outputPath, compressedData);
            if(writeStatus.status() == plz::FileStatus::Code::Ok)
            {
                //Process completed successfully!
            }
        }
    }
    return 0;
}
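For readers curious how three of these parameters end up in the compressed stream: the LZMA SDK packs the literal context bits (lc), literal position bits (lp) and position bits (pb) into a single properties byte. The sketch below reproduces that packing. The helper names and the LzmaProps struct are hypothetical. It is an assumption that the settings fields literalContextBits, literalPositionBits and positionBits correspond to lc, lp and pb, and that PocketLzma leaves this encoding to the underlying SDK code.

```cpp
#include <cstdint>

// Hypothetical helper: packs lc/lp/pb into one byte the way the LZMA SDK
// does. Returns false when a value is outside the range the SDK accepts
// (lc <= 8, lp <= 4, pb <= 4).
bool encodeLzmaProps(unsigned lc, unsigned lp, unsigned pb, uint8_t& out)
{
    if (lc > 8 || lp > 4 || pb > 4)
        return false;
    out = static_cast<uint8_t>((pb * 5 + lp) * 9 + lc);
    return true;
}

// Hypothetical container for the unpacked values.
struct LzmaProps
{
    unsigned lc; // literal context bits
    unsigned lp; // literal position bits
    unsigned pb; // position bits
};

// Hypothetical helper: the inverse of encodeLzmaProps.
// Returns false for bytes that cannot be a valid properties byte.
bool decodeLzmaProps(uint8_t byte, LzmaProps& out)
{
    if (byte >= 9 * 5 * 5)
        return false;
    out.lc = byte % 9;
    const unsigned rest = byte / 9;
    out.lp = rest % 5;
    out.pb = rest / 5;
    return true;
}
```

With the default values shown above (3, 0 and 2), the packed byte is (2 * 5 + 0) * 9 + 3 = 93, or 0x5D, which is the first byte commonly seen in .lzma files.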

Decompression

#define POCKETLZMA_LZMA_C_DEFINE
#include "pocketlzma.hpp"

int main()
{
    std::string path = "./../../content/to_compress/to/j.lzma";
    std::vector<uint8_t> data;
    std::vector<uint8_t> decompressedData;
    plz::FileStatus fileStatus = plz::File::FromFile(path, data);
    if(fileStatus.status() == plz::FileStatus::Code::Ok)
    {
        //No settings / presets are used during decompression!
        plz::PocketLzma p;
        plz::StatusCode status = p.decompress(data, decompressedData);
        if(status == plz::StatusCode::Ok)
        {
            std::string outputPath = "./../../content/to_compress/from/j.json";
            plz::FileStatus writeStatus = plz::File::ToFile(outputPath, decompressedData);
            if(writeStatus.status() == plz::FileStatus::Code::Ok)
            {
                //Process completed successfully!
            }
        }
    }
    return 0;
}

Decompression using data from memory

You can use data directly from memory (both for compression and decompression), if you please. If you need a program to generate in-memory files, you can use my f2src program to do that job for you.

plz::PocketLzma p;

//Alternative 1
std::vector<uint8_t> decompressed;
plz::StatusCode status = p.decompress(memfiles::_JSON_TEST_OK_HEADER_LZMA, 
                                      memfiles::_JSON_TEST_OK_HEADER_LZMA_SIZE, 
                                      decompressed);

...

//Alternative 2
std::vector<uint8_t> bytes = plz::File::FromMemory(memfiles::_JSON_TEST_OK_HEADER_LZMA, 
                                                   memfiles::_JSON_TEST_OK_HEADER_LZMA_SIZE);
status = p.decompress(bytes, decompressed);

...

//Alternative 3
std::vector<uint8_t> bytes;
plz::File::FromMemory(memfiles::_JSON_TEST_OK_HEADER_LZMA, 
                      memfiles::_JSON_TEST_OK_HEADER_LZMA_SIZE, 
                      bytes);
status = p.decompress(bytes, decompressed);
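The memfiles symbols used above are a pointer to embedded bytes plus a size. As a rough illustration of that shape, the sketch below declares a small embedded array and copies it into a byte vector. The names kExampleBlob, kExampleBlobSize and bytesFromMemory are hypothetical, the array contents are placeholder bytes rather than a valid LZMA stream, and the actual output format of f2src is not shown here.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical embedded "file": placeholder bytes, not a valid LZMA stream.
static const uint8_t kExampleBlob[] = { 0x5D, 0x00, 0x00, 0x00, 0x01 };
static const std::size_t kExampleBlobSize = sizeof(kExampleBlob);

// Hypothetical helper: copies a raw (pointer, size) buffer into a byte vector.
// A null pointer or a zero size yields an empty vector.
std::vector<uint8_t> bytesFromMemory(const uint8_t* data, std::size_t size)
{
    if (data == nullptr || size == 0)
        return std::vector<uint8_t>();
    return std::vector<uint8_t>(data, data + size);
}
```

Copying costs one extra allocation, so passing the pointer and size directly (Alternative 1) avoids that when the data is only needed once.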

Benchmarks

Benchmark 1 (v1.0)

Specs:

  • CPU: Intel Core i7-6700 @ 8x 4GHz
  • GPU: GeForce GTX 1080
  • RAM: 15969MiB
  • OS: Linux (Manjaro 20.2 Nibia) - Kernel: x86_64 Linux 4.14.209-1-MANJARO

.json compression benchmark (20 runs)

| Preset | Size before | Size after | Average time | Min. time | Max. time |
|---|---|---|---|---|---|
| Fastest | 70230 bytes | 3364 bytes | 2.25587 ms | 1.99016 ms | 3.17568 ms |
| Fast | 70230 bytes | 3283 bytes | 2.29653 ms | 2.02276 ms | 3.54542 ms |
| Default | 70230 bytes | 2693 bytes | 20.7517 ms | 19.3443 ms | 25.8784 ms |
| GoodCompression | 70230 bytes | 2485 bytes | 30.6193 ms | 28.5697 ms | 38.3485 ms |
| BestCompression | 70230 bytes | 2451 bytes | 45.8248 ms | 43.4147 ms | 57.934 ms |

.json decompression benchmark (20 runs)

| Preset (when compressed) | Size before | Size after | Average time | Min. time | Max. time |
|---|---|---|---|---|---|
| Fastest | 3364 bytes | 70230 bytes | 0.502977 ms | 0.497585 ms | 0.536352 ms |
| Fast | 3283 bytes | 70230 bytes | 0.492965 ms | 0.488002 ms | 0.50396 ms |
| Default | 2693 bytes | 70230 bytes | 0.42965 ms | 0.423914 ms | 0.448674 ms |
| GoodCompression | 2485 bytes | 70230 bytes | 0.405111 ms | 0.402765 ms | 0.419498 ms |
| BestCompression | 2451 bytes | 70230 bytes | 0.401403 ms | 0.400253 ms | 0.406537 ms |

.slp (binary file) compression benchmark (5 runs)

| Preset | Size before | Size after | Average time | Min. time | Max. time |
|---|---|---|---|---|---|
| Fastest | 4145823 bytes | 702789 bytes | 253.102 ms | 232.259 ms | 330.66 ms |
| Fast | 4145823 bytes | 677754 bytes | 355.459 ms | 321.867 ms | 455.138 ms |
| Default | 4145823 bytes | 572742 bytes | 1889.8 ms | 1852.6 ms | 1993.94 ms |
| GoodCompression | 4145823 bytes | 521168 bytes | 2888.82 ms | 2879.99 ms | 2915.07 ms |
| BestCompression | 4145823 bytes | 520358 bytes | 3140.49 ms | 3107.47 ms | 3168.36 ms |

.slp (binary file) decompression benchmark (5 runs)

| Preset (when compressed) | Size before | Size after | Average time | Min. time | Max. time |
|---|---|---|---|---|---|
| Fastest | 702789 bytes | 4145823 bytes | 85.5723 ms | 84.871 ms | 87.2708 ms |
| Fast | 677754 bytes | 4145823 bytes | 81.7042 ms | 81.4525 ms | 82.052 ms |
| Default | 572742 bytes | 4145823 bytes | 78.0439 ms | 77.7491 ms | 78.7667 ms |
| GoodCompression | 521168 bytes | 4145823 bytes | 74.8225 ms | 74.1976 ms | 76.2949 ms |
| BestCompression | 520358 bytes | 4145823 bytes | 74.3353 ms | 74.1313 ms | 74.5187 ms |

Benchmark 2 (v1.0)

Specs:

  • CPU: AMD Ryzen 9 5900X 12-Core @ 24x 3.7GHz
  • GPU: NVIDIA GeForce RTX 3070
  • RAM: 64216MiB
  • OS: Manjaro 22.0.0 Sikaris - Kernel: x86_64 Linux 6.1.1-1-MANJARO

.json compression benchmark (20 runs)

| Preset | Size before | Size after | Average time | Min. time | Max. time |
|---|---|---|---|---|---|
| Fastest | 70230 bytes | 3364 bytes | 0.541419 ms | 0.534969 ms | 0.570288 ms |
| Fast | 70230 bytes | 3283 bytes | 0.552374 ms | 0.548396 ms | 0.559747 ms |
| Default | 70230 bytes | 2693 bytes | 6.0338 ms | 5.98933 ms | 6.20445 ms |
| GoodCompression | 70230 bytes | 2485 bytes | 9.04424 ms | 8.9897 ms | 9.08903 ms |
| BestCompression | 70230 bytes | 2451 bytes | 13.1645 ms | 13.1083 ms | 13.2443 ms |

.json decompression benchmark (20 runs)

| Preset (when compressed) | Size before | Size after | Average time | Min. time | Max. time |
|---|---|---|---|---|---|
| Fastest | 3364 bytes | 70230 bytes | 0.182297 ms | 0.179499 ms | 0.199928 ms |
| Fast | 3283 bytes | 70230 bytes | 0.176741 ms | 0.17497 ms | 0.187393 ms |
| Default | 2693 bytes | 70230 bytes | 0.154774 ms | 0.153528 ms | 0.165992 ms |
| GoodCompression | 2485 bytes | 70230 bytes | 0.144384 ms | 0.142778 ms | 0.149791 ms |
| BestCompression | 2451 bytes | 70230 bytes | 0.14039 ms | 0.13909 ms | 0.146114 ms |

.slp (binary file) compression benchmark (5 runs)

| Preset | Size before | Size after | Average time | Min. time | Max. time |
|---|---|---|---|---|---|
| Fastest | 4145823 bytes | 702789 bytes | 83.8579 ms | 82.4795 ms | 87.221 ms |
| Fast | 4145823 bytes | 677754 bytes | 125.257 ms | 115.714 ms | 146.505 ms |
| Default | 4145823 bytes | 572742 bytes | 813.151 ms | 720.423 ms | 972.822 ms |
| GoodCompression | 4145823 bytes | 521168 bytes | 1237.8 ms | 1158.11 ms | 1471.32 ms |
| BestCompression | 4145823 bytes | 520358 bytes | 1313.7 ms | 1241.61 ms | 1510.28 ms |

.slp (binary file) decompression benchmark (5 runs)

| Preset (when compressed) | Size before | Size after | Average time | Min. time | Max. time |
|---|---|---|---|---|---|
| Fastest | 702789 bytes | 4145823 bytes | 34.8025 ms | 34.5862 ms | 35.515 ms |
| Fast | 677754 bytes | 4145823 bytes | 32.8216 ms | 32.7728 ms | 32.9863 ms |
| Default | 572742 bytes | 4145823 bytes | 31.0607 ms | 30.862 ms | 31.3336 ms |
| GoodCompression | 521168 bytes | 4145823 bytes | 29.4726 ms | 29.4448 ms | 29.51 ms |
| BestCompression | 520358 bytes | 4145823 bytes | 29.5008 ms | 29.468 ms | 29.6048 ms |
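The benchmark figures are easier to compare once they are converted into a compression ratio and a throughput. The helpers below are a small sketch of that arithmetic. Their names are hypothetical, and they are not part of PocketLzma or of the benchmark program that produced the numbers.

```cpp
#include <cstdint>

// Hypothetical helper: original size divided by compressed size
// (higher is better). Returns 0 when the compressed size is 0.
double compressionRatio(uint64_t originalBytes, uint64_t compressedBytes)
{
    if (compressedBytes == 0)
        return 0.0;
    return static_cast<double>(originalBytes) /
           static_cast<double>(compressedBytes);
}

// Hypothetical helper: space saved as a percentage of the original size.
double spaceSavedPercent(uint64_t originalBytes, uint64_t compressedBytes)
{
    if (originalBytes == 0)
        return 0.0;
    return 100.0 * (1.0 - static_cast<double>(compressedBytes) /
                          static_cast<double>(originalBytes));
}

// Hypothetical helper: throughput in MiB per second for a given
// number of input bytes processed in the given time.
double throughputMiBPerSecond(uint64_t bytes, double milliseconds)
{
    if (milliseconds <= 0.0)
        return 0.0;
    const double mebibytes = static_cast<double>(bytes) / (1024.0 * 1024.0);
    return mebibytes / (milliseconds / 1000.0);
}
```

For example, the Default preset on the .json file (70230 bytes down to 2693 bytes) gives a ratio of about 26, or roughly 96% space saved. On the .slp file in Benchmark 2, the Default preset's average of 813.151 ms for 4145823 bytes works out to a little under 5 MiB/s.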

Credits

All credit goes to Igor Pavlov, the genius behind the LZMA compression algorithm. He has released all his work into the Public Domain for anyone to use. PocketLzma uses parts of Igor Pavlov's LZMA-related C code from LZMA SDK v19.00.

PocketLzma itself is released under the still very permissive BSD-2-Clause License. In addition, I've created an optional cross-platform amalgamated lzma_c.hpp, which contains a slightly altered version of Igor Pavlov's LZMA implementation for C and is released under the same Public Domain license to honor Igor's work. It can be found in the extras folder, but keep in mind that this file is not supported in any way and may not be able to co-exist with PocketLzma.