Simple C++ Serialization & Reflection.

December 27, 2022

Cista++ is a simple, open-source (MIT license), C++17-compatible way of (de-)serializing C++ data structures.

Single header - no dependencies. No macros. No source code generation.

  • Raw performance - use your native structs. Supports modification/resizing of deserialized data!
  • Supports complex data structures, including cyclic references and recursive data structures.
  • Save 50% memory: serialize directly to the filesystem if needed, no intermediate buffer required.
  • Fuzzing-checked through continuous fuzzing using LLVM's LibFuzzer.
  • Comes with a serializable high-performance hash map and hash set implementation based on Google's Swiss Table.
  • Reduce boilerplate code: automatic derivation of hash and equality functions.
  • Built-in optional automatic data structure versioning through recursive type hashing.
  • Optional checksum to prevent deserialization of corrupt data.
  • Compatible with Clang, GCC, and MSVC.
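
To illustrate the optional checksum mentioned above, here is a minimal, self-contained sketch of the general idea: store a hash in front of the payload and refuse to load the data if the hash no longer matches. It is not Cista++'s implementation; the FNV-1a hash, the buffer layout, and the names `seal` and `unseal` are assumptions made for this example. In Cista++ itself the check is enabled with `cista::mode::WITH_INTEGRITY`, as used in the examples in this README.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <vector>

// Hypothetical helpers for illustration; not part of the Cista++ API.
namespace sketch {

using buffer = std::vector<unsigned char>;

// 64-bit FNV-1a hash over a byte range (assumed hash, chosen for brevity).
inline std::uint64_t fnv1a(unsigned char const* p, std::size_t n) {
  std::uint64_t h = 14695981039346656037ULL;
  for (std::size_t i = 0; i != n; ++i) {
    h ^= p[i];
    h *= 1099511628211ULL;
  }
  return h;
}

// Returns [8-byte checksum][payload].
inline buffer seal(buffer const& payload) {
  auto const sum = fnv1a(payload.data(), payload.size());
  buffer out(sizeof sum + payload.size());
  std::memcpy(out.data(), &sum, sizeof sum);
  if (!payload.empty()) {
    std::memcpy(out.data() + sizeof sum, payload.data(), payload.size());
  }
  return out;
}

// Returns the payload, or std::nullopt if the data is truncated or corrupt.
inline std::optional<buffer> unseal(buffer const& sealed) {
  std::uint64_t stored;
  if (sealed.size() < sizeof stored) {
    return std::nullopt;
  }
  std::memcpy(&stored, sealed.data(), sizeof stored);
  buffer payload(sealed.begin() + static_cast<std::ptrdiff_t>(sizeof stored),
                 sealed.end());
  if (fnv1a(payload.data(), payload.size()) != stored) {
    return std::nullopt;
  }
  return payload;
}

}  // namespace sketch
```

A caller would pass the serialized bytes through `sketch::seal` before storing them and through `sketch::unseal` before deserializing, treating an empty result as corrupt input.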

The underlying reflection mechanism can be used in other ways, too!
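
As a rough idea of how macro-free reflection over plain structs can work in C++17, the following sketch counts the fields of an aggregate by probing brace initialization and exposes them as a tuple through structured bindings. This is an illustration only, not Cista++'s code: the `sketch` namespace and all names in it are hypothetical, and the sketch handles at most three fields.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical helpers for illustration; not part of the Cista++ API.
namespace sketch {

// Converts to any type; only ever used in unevaluated contexts.
struct any_type {
  template <typename T>
  operator T() const;
};

// Detects whether T{any_type{}, ...} with sizeof...(I) initializers compiles.
template <typename T, typename Seq, typename = void>
struct braces_constructible : std::false_type {};

template <typename T, std::size_t... I>
struct braces_constructible<
    T, std::index_sequence<I...>,
    std::void_t<decltype(T{(static_cast<void>(I), any_type{})...})>>
    : std::true_type {};

// Number of fields: the largest initializer count the aggregate accepts.
template <typename T, std::size_t N = 0>
constexpr std::size_t arity() {
  if constexpr (braces_constructible<T,
                                     std::make_index_sequence<N + 1>>::value) {
    return arity<T, N + 1>();
  } else {
    return N;
  }
}

// Tuple of references to the fields of t (up to three in this sketch).
template <typename T>
auto to_tuple(T& t) {
  constexpr auto n = arity<T>();
  if constexpr (n == 1) {
    auto& [a] = t;
    return std::tie(a);
  } else if constexpr (n == 2) {
    auto& [a, b] = t;
    return std::tie(a, b);
  } else if constexpr (n == 3) {
    auto& [a, b, c] = t;
    return std::tie(a, b, c);
  } else {
    static_assert(n == 0, "extend the sketch for more fields");
    return std::tuple<>{};
  }
}

// Calls fn once for every field of t, in declaration order.
template <typename T, typename Fn>
void for_each_field(T& t, Fn&& fn) {
  std::apply([&](auto&... field) { (fn(field), ...); }, to_tuple(t));
}

// Example aggregate used to try out the helpers.
struct sample {
  int id;
  double weight;
};

}  // namespace sketch
```

With these helpers, `sketch::for_each_field(s, fn)` calls `fn` once per member of `s`, which is the kind of building block that generic serialization, hashing, or printing can be layered on.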

Examples:

Download the latest release and try it out.

Simple example writing to a buffer:

namespace data = cista::raw;
struct my_struct {  // Define your struct.
  int a_{0};
  struct inner {
    data::string b_;
  } j;
};

std::vector<unsigned char> buf;
{  // Serialize.
  my_struct obj{1, {data::string{"test"}}};
  buf = cista::serialize(obj);
}

// Deserialize.
auto deserialized = cista::deserialize<my_struct>(buf);
assert(deserialized->j.b_ == data::string{"test"});

Advanced example writing a hash map to a memory mapped file:

namespace data = cista::offset;
constexpr auto const MODE =  // opt. versioning + check sum
    cista::mode::WITH_VERSION | cista::mode::WITH_INTEGRITY;

struct pos { int x, y; };
using pos_map =  // Automatic deduction of hash & equality
    data::hash_map<data::vector<pos>,
                   data::hash_set<data::string>>;

{  // Serialize.
  auto positions =
      pos_map{{{{1, 2}, {3, 4}}, {"hello", "cista"}},
              {{{5, 6}, {7, 8}}, {"hello", "world"}}};
  cista::buf mmap{cista::mmap{"data"}};
  cista::serialize<MODE>(mmap, positions);
}

// Deserialize.
auto b = cista::mmap("data", cista::mmap::protection::READ);
auto positions = cista::deserialize<pos_map, MODE>(b);

Advanced example showing support for non-aggregate types like derived classes or classes with custom constructors:

namespace data = cista::offset;
constexpr auto MODE = cista::mode::WITH_VERSION;

struct parent {
  parent() = default;
  explicit parent(int a) : x_{a}, y_{a} {}
  auto cista_members() { return std::tie(x_, y_); }
  int x_, y_;
};
struct child : parent {
  child() = default;
  explicit child(int a) : parent{a}, z_{a} {}
  auto cista_members() {
    return std::tie(*static_cast<parent*>(this), z_);
  }
  int z_;
};

/*
 * Automatically defaulted for you:
 *   - de/serialization
 *   - hashing (use child in hash containers)
 *   - equality comparison
 *   - data structure version ("type hash")
 */
using t = data::hash_map<child, int>;

// ... usage, serialization as in the previous examples
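
The following sketch shows, in plain C++17 without Cista++, how a member list like the one returned by `cista_members()` can be enough to derive equality and hashing generically. The names `members`, `equal`, `hash_of`, and `hash_combine` are hypothetical, the hash mixing is an arbitrary choice, and unlike the example above the base-class members are flattened with `std::tuple_cat` to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <tuple>
#include <type_traits>

// Hypothetical helpers for illustration; not part of the Cista++ API.
namespace sketch {

// Mixes one hash value into a running seed (arbitrary mixing function).
inline std::size_t hash_combine(std::size_t seed, std::size_t h) {
  return seed ^ (h + 0x9e3779b9u + (seed << 6) + (seed >> 2));
}

struct parent {
  int x_{0}, y_{0};
  // Member list as a tuple of references.
  auto members() const { return std::tie(x_, y_); }
};

struct child : parent {
  int z_{0};
  // Base members first, then the members added by the derived class.
  auto members() const {
    return std::tuple_cat(parent::members(), std::tie(z_));
  }
};

// Equality derived from the member list: compares member by member.
template <typename T>
bool equal(T const& a, T const& b) {
  return a.members() == b.members();
}

// Hash derived from the member list: folds std::hash over all members.
template <typename T>
std::size_t hash_of(T const& v) {
  return std::apply(
      [](auto const&... m) {
        std::size_t seed = 0;
        ((seed = hash_combine(
              seed, std::hash<std::decay_t<decltype(m)>>{}(m))),
         ...);
        return seed;
      },
      v.members());
}

}  // namespace sketch
```

Two `sketch::child` objects with the same `x_`, `y_`, and `z_` compare equal and produce the same hash, which is what a hash container needs from its key type.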

Benchmarks

Have a look at the benchmark repository for more details.

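| Library        | Serialize | Deserialize | Fast Deserialize | Traverse | Deserialize & Traverse | Size   |
|----------------|-----------|-------------|------------------|----------|------------------------|--------|
| Cap’n Proto    | 105 ms    | 0.002 ms    | 0.0 ms           | 356 ms   | 353 ms                 | 50.5M  |
| cereal         | 239 ms    | 197.000 ms  | -                | 125 ms   | 322 ms                 | 37.8M  |
| Cista++ offset | 72 ms     | 0.053 ms    | 0.0 ms           | 132 ms   | 132 ms                 | 25.3M  |
| Cista++ raw    | 3555 ms   | 68.900 ms   | 21.5 ms          | 112 ms   | 133 ms                 | 176.4M |
| Flatbuffers    | 2349 ms   | 15.400 ms   | 0.0 ms           | 136 ms   | 133 ms                 | 63.0M  |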

Use Cases

Reader and writer should have the same pointer width. Loading data on systems with a different byte order (endianness) is supported. Example use cases:

  • Asset loading for all kinds of applications (e.g. game assets, GIS data, large graphs, etc.)
  • Transferring data over a network
  • Shared memory applications
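
Supporting a different byte order essentially means swapping the bytes of multi-byte values whenever the byte order of the data differs from that of the host. A minimal sketch of that idea follows; the helper names are hypothetical and this is not how Cista++ implements it internally.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration; not part of the Cista++ API.
namespace sketch {

// True if this machine stores the least significant byte first.
inline bool host_is_little_endian() {
  std::uint16_t const probe = 1;
  unsigned char first;
  std::memcpy(&first, &probe, 1);
  return first == 1;
}

// Reverses the byte order of a 32-bit value.
inline std::uint32_t byte_swap(std::uint32_t v) {
  return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
         ((v & 0x00FF0000u) >> 8) | ((v & 0xFF000000u) >> 24);
}

// Reads a 32-bit value from bytes written with the given byte order,
// swapping only if that order differs from the host's.
inline std::uint32_t load_u32(unsigned char const* p,
                              bool data_is_little_endian) {
  std::uint32_t v;
  std::memcpy(&v, p, sizeof v);
  return data_is_little_endian == host_is_little_endian() ? v : byte_swap(v);
}

}  // namespace sketch
```

The same bytes therefore decode to the same number on either kind of host, as long as the reader knows which byte order the writer used.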

Currently, only C++17 software can read/write data. But it should be possible to generate accessors for other programming languages, too.

Alternatives

If you need compatibility with other programming languages or require protocol evolution (backward compatibility), you should look for another solution.

Documentation

Contribute

Feel free to contribute (bug reports, pull requests, etc.)!