(Concepts-enabled) Functional Abstraction Layer for C++

April 25, 2020 ยท View on GitHub

License Release

(Concepts-enabled) Functional Abstraction Layer for C++

Cefal is a C++20 header-only library with abstractions over basic functional programming concepts (and using C++20 concepts).

It is more a research pet project than a production-ready library (especially keeping in mind it compiles only on GCC/master for now).

Tests exist though and benchmarks as well.

See examples for general idea about what it looks like or check src/dummy.cpp.

Dependencies

  • C++20: Also requires concepts library as well
  • CMake >= 3.13.0

Available typeclasses

Monoid

Has empty and append functions. For sake of performance helpers::SingletonFrom exists that can be used to wrap single element of monoidal container and pass it as right operand to append to avoid extra memory allocations.

Instances

  • basic_types - integral types and std::string
  • std_containers - single socket std:: containers
  • std_optional - std::optional
  • with_functions - any type that has empty and append methods

Foldable

Has foldLeft function.

Instances

  • std_containers - single socket std:: containers
  • std_ranges - std::ranges::views
  • with_functions - any type that has foldLeft or fold_left method

Functor

Has unit and map functions. Also provides innerMap function for Functor of Functors.

Instances

  • from_foldable - types that have instances for Monoid and Foldable
  • std_optional - std::optional
  • std_ranges - std::ranges::views
    • with_functions - any type that has unit and map methods

Monad

Has flatMap function and also is a Functor. Also provides innerFlatMap function for Functor of Monads.

Instances

  • from_foldable - types that have instances for Monoid and Foldable
  • std_optional - std::optional
  • with_functions - any type that has flatMap or flat_map method and also is a Functor

Filterable

Has filter function. Also provides innerFilter function for Functor of Filterables.

Instances

  • from_foldable - types that have instances for Monoid and Foldable. Either SingletonFrom helper or Functor is also required.
  • std_optional - std::optional
  • std_ranges - std::ranges::views
  • with_functions - any type that has filter method

Converter

Has as function. Allows to modify the shape.

Instances

  • from_self - Converter to same type. Doesn't do anything, just returns the same object.
  • from_std_containers - from std::range (i.e. std::containers and range views) to std::containers
  • from_std_optional - from std::optional to any functor+monoid

Usage

All typeclasses can be loaded with cefal/cefal header. No instances are loaded automatically, they need to be loaded on one-by-one basis (cefal/everything.h exists though with all the instances added, but is not recommended to use).

All concepts are in cefal::concepts namespace.

All instances should be implemented in cefal::instances namespace.

All operations are in cefal::ops namespace and can be used either through pipe operator or with currying.

Lvalue vs rvalue

All operations on lvalue operands expect constref arguments of functions, passed to them (except accumulator for foldLeft, which is rvalue).

All operations on rvalue operands can work with rvalue as well.

The only exception is operations on ranges. They are done in compliance with how ranges work and on both lvalue and rvalue expect either constref or ref (from where it is possible to move). Be aware though that moving from ranges operation sometimes can be more expensive than copying due to extensive optimizations compilers could do on ranges. For example, check mutable vs immutable benchmarks for map() on ranges for case when it works with Expensive<int> and converts it to different type. On -O3 level immutable benchmark is faster roughly 2-3 times.

Performance

Due to cefal being mostly a wrapper around std or user implementations - overhead should be minimal.

For std::containers and map/filter operations few non-pure optimizations are in place to provide performance similar to using std algorithms. Cefal also contains Catch2-based benchmarks for std::containers as for something that can be both heavy enough to process and comparable with other implementation (std algorithms).

Ranges-based benchmarks are available as well, but their numbers are not presented below due to being the same across std::ranges::views and cefal::ops.

For benchmarks we use next value types:

  • int - as an example of lightweight type without any extra memory allocations
  • Expensive - custom type that has memory allocation performed in constructor and copy constructor, but can be cheaply moved
  • For Maps of expensive type we use int as key and Expensive as value. There is one exception - extra map2 in map() for Expensive as key and int as value.

Container sizes are not the same for different containers (otherwise it would either take too much time for slow ones or too less for fast ones), so different containers can't be compared, but containers used for cefal and std are the same size:

  • std::vector: 10kk for int and 25k for Expensive
  • std::list: 1kk for int and 25k for Expensive
  • std::deque: 10kk for int and 25k for Expensive
  • std::set: 100k for int and 25k for Expensive
  • std::unordered_set: 100k for int and 25k for Expensive
  • std::map: 100k for int and 25k for Expensive
  • std::unordered_map: 100k for int and 25k for Expensive

Multi versions of sets are also benchmarked, but are similar to single-entry sets and are omitted in results below for brevity.

There are two types of benchmarks:

  • Immutable - initial container is taken by lvalue and is not modified
  • Mutable - Initial container is taken by rvalue and can be modified

Std references:

  • For map() we use std::transform (either to new or to same container). If it is vector and immutable - we reserve() destination as well
  • For filter() we use std::erase_if either on copy of container or on initial container

Map benchmarks transform container to same type (to make it similar between lvalue and rvalue).

Filter benchmarks are also divided by percentage of items that are accepted.

All values are from mean section of catch2 benchmarks in milliseconds (captured on MBP15'2014 with i7).

Std library is from GCC/master (commit 73dd051894b8293d35ea1c436fa408c404b80813, April 1 2020) and all benchmarks are built with -O3.

map() for int

Typevectorlistdequesetunordered_setmapunordered_map
Immutable cefal23.028183.42558.21229.58521.12728.19321.389
Immutable std27.566182.48557.71426.92219.71627.41818.953
Mutable cefal2.9984.1916.87510.8576.68910.9126.491
Mutable std3.0074.2589.540N/AN/AN/AN/A

map() for Expensive

Typevectorlistdequesetunordered_setmapmap 2unordered_map
Immutable cefal60.76665.03261.07772.30774.97876.55576.55374.536
Immutable std67.38365.15161.07675.57374.24277.11390.50472.470
Mutable cefal0.0520.1020.0342.0511.9542.3782.6211.958
Mutable std15.69216.07416.077N/AN/AN/AN/AN/A

filter() for int

Type10%25%50%75%90%
vector
Immutable cefal40.74443.75149.75453.39955.552
Immutable std45.55145.62348.69748.26148.471
Mutable cefal37.46637.47537.41937.51437.774
Mutable std37.24737.44640.52340.47940.574
list
Immutable cefal21.10847.12994.010139.376164.344
Immutable std185.323191.391196.062191.282184.193
Mutable cefal98.67583.86959.21830.94114.719
Mutable std88.46379.78161.18224.53810.800
deque
Immutable cefal47.50553.60664.65776.67079.651
Immutable std88.11386.78986.09786.73883.507
Mutable cefal59.53956.14251.80146.97243.073
Mutable std59.42556.51851.88247.13143.188
set
Immutable cefal4.1427.35313.85620.53825.234
Immutable std24.65423.56322.44821.92220.884
Mutable cefal11.6889.4446.1593.2572.109
Mutable std11.3979.3756.3323.2492.075
unordered_set
Immutable cefal4.4976.82310.94514.87319.134
Immutable std17.76717.24216.95217.33116.105
Mutable cefal9.1917.5834.2942.4111.219
Mutable std9.2117.5894.3372.5311.173
map
Immutable cefal4.3808.13614.58821.91226.258
Immutable std24.81323.84422.64222.02921.045
Mutable cefal11.6429.6346.4363.5912.157
Mutable std11.6739.4766.2783.3452.078
unordered_map
Immutable cefal4.8317.40311.79516.06919.637
Immutable std18.38318.71017.37016.85416.475
Mutable cefal9.7167.9784.6542.5551.308
Mutable std9.6078.0184.5512.4991.230

filter() for Expensive

Type10%25%50%75%90%
vector
Immutable cefal11.62929.25458.91589.071106.378
Immutable std68.323114.938283.82697.74664.673
Mutable cefal34.50056.911157.36638.7978.523
Mutable std34.47956.636156.91038.9538.597
list
Immutable cefal6.20815.70631.76947.63257.195
Immutable std65.37166.36567.93866.41465.897
Mutable cefal40.17381.944241.91146.6289.073
Mutable std32.85328.31720.10610.9785.090
deque
Immutable cefal6.01415.28829.77645.08453.986
Immutable std69.170118.254286.198100.66865.932
Mutable cefal39.75980.341240.85247.4818.812
Mutable std39.69978.960244.60147.2008.718
set
Immutable cefal21.83431.49748.26665.36776.619
Immutable std68.91468.63167.96967.64467.901
Mutable cefal33.82129.16020.65211.2345.362
Mutable std33.87929.12520.67211.2025.342
unordered_set
Immutable cefal21.80231.02147.71778.14577.942
Immutable std68.04767.80168.69469.27967.614
Mutable cefal33.65228.70420.57011.3675.485
Mutable std33.44028.77821.76511.2255.457
map
Immutable cefal7.17618.08834.95453.14563.102
Immutable std83.86687.06082.87183.90685.545
Mutable cefal33.76830.11821.74011.4705.764
Mutable std48.07143.94035.72225.03119.477
unordered_map
Immutable cefal7.50217.51034.28363.61965.634
Immutable std84.98385.64785.59784.45783.882
Mutable cefal35.04330.76521.95511.5995.544
Mutable std50.34644.92636.23325.99919.650

Observations on benchmark results

  • Immutable map() is on par with std::transform
  • For vector-like (vector, list, deque) containers mutable map() for ints is on par, but for move-efficient types it is A LOT faster than std::transform (numbers in table above are not a typo)
  • There is no mutable "single call" std::transform for set-like or associative containers, but mutable map() for all inner types is much faster than immutable versions of both cefal and std
  • For maps - performance is similar as for set-like containers. There is one extra interesting point - std::transform for std::map of Expensive as key works slower than for case when Expensive is value. It doesn't happen for cefal::map() and not reproducible on std::unordered_map or on any filter/erase_if operations.
  • filter() is worse than std::erase_if for mutable lists of both ints and Expensive and for vectors of Expensive in case when almost whole container is accepted
  • filter() is on par with std::erase_if in case of immutable vector of small types and in case of all other mutable containers not mentioned above
  • For immutable containers except vector filter() is either on par or better than std::erase_if. Less elements are accepted - bigger the gap for in favor of filter() (up to 10x in case of 10% elements accepted)
  • Benchmarks for mapping to another inner type also exist in source code (not added here for brevity).
    • For immutable containers they show pretty much the same results (i.e. almost equal between cefal and std).
    • For mutable containers of ints it is also on the same level.
    • Mutable cefal benchmarks for move-effective types though shows better performance than immutable cefal/std on all containers except unordered_set (where it is on par).

As a general conclusion: there are for sure few cases where cefal shows itself worse than direct usage of std algorithm (not tremendously though), but there are also a lot of cases where cefal works faster by 1-2 orders of magnitude (especially in case of move-efficient types) and in remaining cases it is on par with std.

Cefal lacks laziness, but it can be achieved with std::ranges (cefal has partial support for them as Foldable, Functor and Filterable).

Examples

Piped form

std::list<std::vector<int>> result =
    cefal::unit<std::vector>(3) | cefal::ops::map          ([](int x) { return ops::unit<std::vector>(x); })
                                | cefal::ops::innerFilter  ([](int x) { return x % 2; })
                                | cefal::ops::innerFlatMap ([](int x) { return std::vector{x + 1, x + 2}; })
                                | cefal::ops::innerMap     ([](int x) { return x * 3; })
                                | cefal::ops::as<std::list>();
auto rawToResult = cefal::ops::flatMap([](RawResult&& raw){ return maybeGetResult(std::move(raw)); };
std::optional<int> result = maybeGetRawResult() | rawToResult
                                                | cefal::ops::map([](Result&& x) { return x.value(); });

Curried form

auto mapper = cefal::ops::map([](int x) { return x * 3; });
auto result = mapper(cefal::unit<std::vector>(3));
auto left = cefal::unit<std::vector>(3);
auto anotherResult = cefal::ops::map([](int x) { return x * 3; })(left);

Ranges materialization

std::map<int, std::string> source = /*...*/;

std::unordered_map<int, std::string> mapResult =
    source | std::views::filter([](const auto& x) {return x.first % 2; })
           | cefal::ops::as<std::unordered_map>();

std::vector<std::pair<std::string, std::string>> vectorResult =
    source | std::views::transform([](const auto& x) {return std::make_pair(std::to_string(x.first), x.second); })
           | cefal::ops::as<std::vector>();

Custom class support

template <typename T>
class MyClass {
  // ...
};

namespace cefal::instances {
template <typename T>
struct Functor<MyClass<T>> {
  static MyClass<T> unit(T x) {
    MyClass<T> result;
    result.setValue(std::move(x));
    return result;
  }

  static auto map(const MyClass<T>& src, Func&& func) {
    using U = std::invoke_result_T<Func, T>;
    MyClass<U> result;
    result.setValue(func(src.value()));
    return result;
  }
};
}

MyClass<int> from = cefal::ops::unit<MyClass>(42);
MyClass<double> result = from | cefal::ops::map([](int x) -> double { return x * 2.0; });

Custom class support through class methods

template <typename T>
class MyClass {
  static MyClass<T> unit(T x) {
    MyClass<T> result;
    result.setValue(std::move(x));
    return result;
  }

  auto map(Func&& func) {
    using U = std::invoke_result_T<Func, T>;
    MyClass<U> result;
    result.setValue(func(value()));
    return result;
  }
  // ...
};

MyClass<int> from = cefal::ops::unit<MyClass>(42);
MyClass<double> result = from | cefal::ops::map([](int x) -> double { return x * 2.0; });