
March 5, 2025

Benchmarks

TODO: needs update

Benchmarks have been implemented with BenchmarkDotNet.

All CacheManager instances used in the benchmarks have only one cache handle configured: the Dictionary, System.Runtime, Microsoft.Extensions.Caching.Memory (MsMemory in the tables below), or Redis handle.
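As a rough sketch of how such a single-handle instance can be created (the cache name is illustrative, not the exact benchmark setup):

```csharp
// Requires the CacheManager.Core package (plus the package for the chosen handle).
using CacheManager.Core;

// One cache handle only, as in the benchmarks; here the in-memory Dictionary handle.
var cache = CacheFactory.Build<string>("benchCache", settings =>
    settings.WithDictionaryHandle());
```

The other benchmark variants swap in a different handle, e.g. `WithSystemRuntimeCacheHandle()`, `WithMicrosoftMemoryCacheHandle()`, or `WithRedisConfiguration(...)` together with `WithRedisCacheHandle(...)`.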

We use the same configuration for all benchmarks and run two jobs each, one for x86 and one for x64. Comparing the platforms, the conclusion is unsurprising: x64 is consistently faster than x86, at the cost of slightly higher memory consumption.

```
BenchmarkDotNet v0.14.0, Windows 11 (10.0.26100.3194)
12th Gen Intel Core i7-12700KF, 1 CPU, 20 logical and 12 physical cores
.NET SDK 9.0.200
  [Host]     : .NET 8.0.13 (8.0.1325.6609), X64 RyuJIT AVX2
  Job-BIBDFC : .NET 8.0.13 (8.0.1325.6609), X64 RyuJIT AVX2

IterationCount=10  LaunchCount=1  WarmupCount=2
```

Add

Adding one item per run

Redis is a lot slower in this scenario because CacheManager has to wait for the response to return a bool indicating whether the key was added. In general, it is good to see how fast the Dictionary handle is compared to the System.Runtime one. What the numbers don't show directly is that the memory footprint of the Dictionary handle is also much lower.

| Method | Mean | Error | StdDev | Ratio | RatioSD | Gen0 | Gen1 | Allocated | Alloc Ratio |
|--------|-----:|------:|-------:|------:|--------:|-----:|-----:|----------:|------------:|
| Dictionary | 121.0 ns | 2.00 ns | 1.04 ns | 1.00 | 0.01 | 0.0153 | - | 200 B | 1.00 |
| Runtime | 637.2 ns | 28.46 ns | 16.94 ns | 5.27 | 0.14 | 0.2384 | 0.0010 | 3120 B | 15.60 |
| MsMemory | 198.8 ns | 4.59 ns | 3.04 ns | 1.64 | 0.03 | 0.0260 | - | 340 B | 1.70 |
| Redis | 105,395.6 ns | 3,758.09 ns | 2,485.74 ns | 871.20 | 20.86 | - | - | 1256 B | 6.28 |
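The Redis latency comes from the synchronous return value of Add. A minimal usage sketch (assuming `cache` is an existing `ICacheManager<int>` instance; key and value are illustrative):

```csharp
// Add only succeeds if the key does not exist yet, so CacheManager
// must wait for the backing store's response to produce the bool result.
bool added = cache.Add("counter", 1);
if (!added)
{
    // The key already existed; nothing was written.
}
```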

Adding one item per run with using region

| Method | Mean | Error | StdDev | Ratio | RatioSD | Gen0 | Allocated | Alloc Ratio |
|--------|-----:|------:|-------:|------:|--------:|-----:|----------:|------------:|
| Dictionary | 144.2 ns | 4.65 ns | 3.07 ns | 1.00 | 0.03 | 0.0231 | 304 B | 1.00 |
| Runtime | 342.6 ns | 18.01 ns | 11.91 ns | 2.38 | 0.09 | 0.0610 | 800 B | 2.63 |
| MsMemory | 164.1 ns | 5.74 ns | 3.80 ns | 1.14 | 0.03 | 0.0231 | 304 B | 1.00 |
| Redis | 139,200.0 ns | 4,863.92 ns | 2,894.44 ns | 966.01 | 27.25 | - | 1528 B | 5.03 |

Put

Putting one item per run

Redis comes much closer to the in-memory handles in this scenario because CacheManager uses fire-and-forget for these operations. For Put, it doesn't matter whether the item was added or updated, so there is no need to wait for a response.

| Method | Mean | Error | StdDev | Ratio | RatioSD | Gen0 | Gen1 | Allocated | Alloc Ratio |
|--------|-----:|------:|-------:|------:|--------:|-----:|-----:|----------:|------------:|
| Dictionary | 95.01 ns | 1.888 ns | 1.249 ns | 1.00 | 0.02 | 0.0122 | - | 160 B | 1.00 |
| Runtime | 887.35 ns | 19.929 ns | 13.181 ns | 9.34 | 0.18 | 0.4263 | 0.0095 | 5576 B | 34.85 |
| MsMemory | 172.40 ns | 5.030 ns | 3.327 ns | 1.81 | 0.04 | 0.0336 | - | 440 B | 2.75 |
| Redis | 4,136.37 ns | 301.420 ns | 199.371 ns | 43.54 | 2.07 | 0.0839 | 0.0610 | 1095 B | 6.84 |
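In contrast to Add, Put overwrites unconditionally, which is why it can be sent fire-and-forget (again assuming `cache` is an existing `ICacheManager<int>` instance):

```csharp
// Put always writes the value, whether or not the key already exists,
// so CacheManager does not need to wait for the backing store's response.
cache.Put("counter", 42);
```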

Get

Getting one item per run

With Get operations we can clearly see how much faster an in-memory cache is compared to the distributed variant. That's why it makes so much sense to use CacheManager with a first (in-memory) and a secondary (distributed) cache layer.

| Method | Mean | Error | StdDev | Ratio | RatioSD | Gen0 | Allocated | Alloc Ratio |
|--------|-----:|------:|-------:|------:|--------:|-----:|----------:|------------:|
| Dictionary | 34.97 ns | 0.532 ns | 0.317 ns | 1.00 | 0.01 | - | - | NA |
| Runtime | 179.45 ns | 2.104 ns | 1.392 ns | 5.13 | 0.06 | 0.0153 | 200 B | NA |
| MsMemory | 75.12 ns | 0.338 ns | 0.201 ns | 2.15 | 0.02 | - | - | NA |
| Redis | 75,534.71 ns | 5,303.351 ns | 3,507.838 ns | 2,160.12 | 97.47 | 0.1221 | 2008 B | NA |
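A two-layer setup of that kind could be configured roughly like this (cache and configuration names are illustrative, and the Redis endpoint is an assumption):

```csharp
using CacheManager.Core;

// First layer: fast in-process handle; second layer: distributed Redis handle.
var cache = CacheFactory.Build<string>("layeredCache", settings =>
    settings
        .WithDictionaryHandle()
        .And
        .WithRedisConfiguration("redis", config =>
            config.WithEndpoint("localhost", 6379)
                  .WithDatabase(0))
        // true marks the Redis handle as the backplane source.
        .WithRedisCacheHandle("redis", true));
```

Gets are then served from the in-memory layer whenever possible, and only fall back to Redis on a local miss.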

Serializer comparison

For this comparison, I only used the bare serializers without the cache layer overhead (e.g. Redis), which could otherwise skew the results. Each performance run does 1000 iterations of serializing and deserializing the same object. The object structure was the following:

```json
{
	"L" : 1671986962,
	"S" : "1625c0a0-86ce-4fd5-9047-cf2fb1d145b2",
	"SList" : ["98a62a89-f3e9-49d7-93ad-a4295b21c1a1", "47a86f42-64b0-4e6d-9f18-ecb20abff2a3", "7de26dfc-57a5-4f16-b421-8999b73c9afb", "e29a8f8a-feb8-4f3f-9825-78c067215339", "5b2e1923-8a76-4f39-9366-4700c7d0d408", "febea78f-ca5e-49d6-99c9-18738e4fb36f", "7c87b429-e931-4f1a-a59a-433504c87a1c", "bf288ff7-e6c0-4df1-bfcf-677ff31cdf45", "9b7fcd6c-45ee-4584-98b6-b30d32e52f72", "2729610c-d6ce-4960-b83b-b5fd4230cc7e"],
	"OList" : [{
			"Id" : 1210151618,
			"Val" : "6d2871c9-c5f8-44b1-bad9-4eba68683510"
		}, {
			"Id" : 1171177179,
			"Val" : "6b12cd3f-2726-4bf9-a25c-35533de3910c"
		}, {
			"Id" : 1676910093,
			"Val" : "66f52534-92f3-4ef4-b555-48a993a9df7a"
		}, {
			"Id" : 977965209,
			"Val" : "80a20081-a2a5-4dcc-8d07-162f697588b4"
		}, {
			"Id" : 2075961031,
			"Val" : "35f8710a-64e5-481d-9f18-899c65abd675"
		}, {
			"Id" : 328057441,
			"Val" : "d17277e2-ca25-42b1-a4b4-efc00deef358"
		}, {
			"Id" : 2046696720,
			"Val" : "4fa32d5e-f770-4d44-a55b-f6479633839c"
		}, {
			"Id" : 422544189,
			"Val" : "de39c21e-8cb3-4f5c-bf5c-a3d228bc4c25"
		}, {
			"Id" : 1887998603,
			"Val" : "22b00459-7820-46a6-8514-10e901810bbd"
		}, {
			"Id" : 852015288,
			"Val" : "09cc3bd8-da23-42cb-b700-02ec461beb3f"
		}
	]
}
```

The values are randomly generated; the object has one list of strings (GUIDs) and a list of child objects, each with an integer and a string. Pretty simple, but large enough to analyze the performance.
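Mapped to C#, the benchmarked object could look roughly like this (the type names, and any property names beyond the JSON keys, are assumptions for illustration):

```csharp
using System.Collections.Generic;

public class TestObject
{
    public long L { get; set; }                  // random number
    public string S { get; set; }                // random GUID string
    public List<string> SList { get; set; }      // list of GUID strings
    public List<ChildObject> OList { get; set; } // list of child objects
}

public class ChildObject
{
    public int Id { get; set; }
    public string Val { get; set; }
}
```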

Results:

| Method | Mean | Error | StdDev | Ratio | RatioSD | Gen0 | Gen1 | Allocated | Alloc Ratio |
|--------|-----:|------:|-------:|------:|--------:|-----:|-----:|----------:|------------:|
| JsonSerializer | 83.21 us | 1.163 us | 0.769 us | 1.00 | 0.01 | 14.8926 | 2.4414 | 191.16 KB | 1.00 |
| JsonGzSerializer | 346.37 us | 4.785 us | 3.165 us | 4.16 | 0.05 | 21.9727 | 2.9297 | 280.75 KB | 1.47 |
| ProtoBufSerializer | 39.18 us | 0.701 us | 0.463 us | 0.47 | 0.01 | 8.6060 | 1.0986 | 110.7 KB | 0.58 |
| BondBinarySerializer | 19.03 us | 0.398 us | 0.263 us | 0.23 | 0.00 | 4.7302 | 0.6714 | 60.53 KB | 0.32 |
| BondFastBinarySerializer | 19.42 us | 0.431 us | 0.285 us | 0.23 | 0.00 | 4.7607 | 0.7324 | 60.84 KB | 0.32 |
| BondSimpleJsonSerializer | 63.56 us | 1.132 us | 0.748 us | 0.76 | 0.01 | 11.9629 | 1.9531 | 153.35 KB | 0.80 |

As expected, the binary serializers outperform the JSON-based ones by a huge margin, with the Bond binary variants even ahead of Protobuf. The compression overhead of the JsonGz serializer is quite large and may have some potential for optimization.