Benchmarks

December 4, 2022

These are benchmark tests for sake, pyinfra, and Ansible.

I benchmark the following:

  1. Running a raw shell command
  2. Running commands using modules

Summary

sake is between 6 and 8 times faster than pyinfra and between 4 and 18 times faster than Ansible, depending on the number of hosts.

Ping

[Figures: elapsed time]

Modules

[Figures: elapsed time]


Implementation

Implementation details for each software can be found here:

I've made the following optimizations:

  • host key checking is disabled for all three tools
  • all three tools run in parallel with the maximum number of forks
  • pipelining is set to true for Ansible
  • gather facts is disabled for Ansible

Requirements

Instructions

# Start 500 containers
./mock-ssh.sh

# Start 5 containers
# ./mock-ssh.sh 5

# Run benchmark tests, this will generate csv and images in the results directory
./benchmark.sh

# Run once per test and target only sake
# ./benchmark.sh -r 1 -s sake

# Generate graphs and open browser page with graphs
# python3 graph.py --show

# Update this README with the latest results
./update-readme.sh

Note that the mock-ssh.sh script starts 500 containers locally and maps them to ports in the range 10000 to 10500. If you get a port collision, find another range with 500 available ports (run netstat -ntlp to see which ports are already in use) and update all relevant files.
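If netstat output is hard to scan, a small script can look for a free block of ports instead. The following is a minimal sketch and not part of this repository: the function names port_is_free and find_free_range are hypothetical, and it assumes that a port counts as free when a TCP socket can be bound to it on all interfaces.

```python
import socket


def port_is_free(port: int, host: str = "") -> bool:
    """Return True if a TCP socket can be bound to (host, port).

    An empty host means "all interfaces", which is an assumption about
    how the containers publish their ports.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


def find_free_range(count: int, start: int = 10000, end: int = 65535):
    """Return the first port p where p .. p + count - 1 are all free, or None."""
    run_start, run_length = None, 0
    for port in range(start, end + 1):
        if port_is_free(port):
            if run_length == 0:
                run_start = port
            run_length += 1
            if run_length == count:
                return run_start
        else:
            run_length = 0  # the block was interrupted, start over
    return None


if __name__ == "__main__":
    # 500 matches the number of containers started by mock-ssh.sh
    print(find_free_range(500))
```

The check is only a snapshot: another process can still grab a port between the scan and the moment the containers start.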

Information

I ran the tests with the following machine:

  • OS: Debian GNU/Linux bookworm/sid x86_64
  • CPU: Intel i9-9900K (16) @ 5.000GHz
  • Kernel: 6.0.0-2-amd64
  • Memory: 32GB
  • Shell: zsh 5.9

The software I used:

  • sake: 0.12.1
  • pyinfra: 2.5.1
  • ansible: core 2.13.4, python 3.10.7
  • ssh: OpenSSH_9.0p1 Debian-1+b1, OpenSSL 3.0.5 5 Jul 2022
  • docker: 20.10.17
  • datamash: 1.7
  • gnu time: 1.9-0.2
  • bash: 5.1.16(1)
  • python: 3.10.7

The benchmarks are generated by running each test 10 times and taking an average.

  • time: The mean elapsed real (wall clock) time (seconds) used by the process
  • cpu: Percentage of the CPU that this job got, computed as (user + sys time) / total elapsed time
  • mem: Maximum memory usage (megabytes) of the process during its lifetime
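To make the three metrics concrete, here is a minimal sketch of how they could be derived from repeated runs. It is not the code used by benchmark.sh: the function names and the sample numbers are made up for illustration, and it assumes peak memory is recorded in kilobytes (as GNU time reports it) and converted by dividing by 1024.

```python
from statistics import mean


def cpu_percent(user_s: float, sys_s: float, elapsed_s: float) -> float:
    """CPU share as (user + sys) / elapsed, expressed as a percentage."""
    return 100.0 * (user_s + sys_s) / elapsed_s


def summarize(runs):
    """Average a list of (user_s, sys_s, elapsed_s, max_rss_kb) tuples."""
    return {
        "time": mean(elapsed for _, _, elapsed, _ in runs),
        "cpu": mean(cpu_percent(u, s, e) for u, s, e, _ in runs),
        # Assumption: kilobytes to megabytes with a factor of 1024
        "mem": mean(rss_kb for _, _, _, rss_kb in runs) / 1024,
    }


if __name__ == "__main__":
    # Hypothetical measurements, not taken from the benchmark results
    sample_runs = [(1.5, 0.5, 1.0, 20480), (1.0, 1.0, 1.0, 20480)]
    print(summarize(sample_runs))
```

A cpu value above 100 means the process kept more than one core busy on average, which is why the parallel tools can show values well above 100 in the tables.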

Notes

Before we look at the results, it's important to consider the following:

  • Ansible and pyinfra use the Python runtime whereas sake uses Go
  • The Go standard SSH library is faster than OpenSSH (which Ansible uses) and Paramiko (which pyinfra uses)
  • Ansible and pyinfra provide idempotency for modules and thus have to execute a lot more code (both on the control plane and on the hosts)

To understand how each software works let's look at program execution:

  1. sake

    1. Parse config files
    2. Establish connection to hosts in parallel
    3. Send out shell commands to be executed on host in parallel
  2. pyinfra:

    1. Parse config files
    2. Connect to servers serially and gather facts
    3. Do work locally using the facts and figure out which operations to perform on hosts
    4. Perform operations on hosts in parallel by sending out shell commands
  3. Ansible: do work on a remote machine, get back info, send out shell command

    1. Parse config files
    2. Connect to servers in parallel
    3. Send over Ansible modules to hosts in parallel
    4. Execute (via Python) Ansible modules on hosts in parallel
    5. Remove previously copied Ansible resources
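The difference between connecting serially and connecting in parallel can be illustrated with a deliberately simplified cost model. This is a sketch under strong assumptions: every connection takes the same time, batches of connections run back to back, and all other overhead is ignored. The function names and the 10 ms figure are hypothetical and not measured from any of the tools.

```python
import math


def serial_connect_time(hosts: int, per_host_s: float) -> float:
    """Total time when hosts are connected one after another."""
    return hosts * per_host_s


def parallel_connect_time(hosts: int, per_host_s: float, forks: int) -> float:
    """Total time when up to `forks` hosts are connected at once."""
    if forks <= 0:
        raise ValueError("forks must be positive")
    batches = math.ceil(hosts / forks)  # number of back-to-back batches
    return batches * per_host_s


if __name__ == "__main__":
    per_host = 0.01  # hypothetical 10 ms per connection
    for n in (1, 50, 500):
        print(n, serial_connect_time(n, per_host),
              parallel_connect_time(n, per_host, forks=500))
```

In this model the serial cost grows linearly with the number of hosts, while the parallel cost stays flat as long as the number of forks is at least the number of hosts. Real runs are bounded by CPU, network, and SSH handshake costs, so the measured curves will not be this clean.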

The big difference between Ansible and pyinfra is that pyinfra does its work on the local machine and sends out shell commands to configure servers, whereas Ansible does the work on the host machines using Python. Thus Ansible requires Python (for modules) to be installed on the remote hosts, whereas pyinfra only requires Python on the control plane.

Note that you could speed up Ansible by using Mitogen, but Mitogen's Ansible support is limited to Ansible < 2.10 (the latest stable Ansible is 2.13.4).

Results

Complete benchmark results can be found here.

Test Case 1

This is the test case where we run 1 raw shell command.

[Figures: elapsed time]

Elapsed Time (seconds)

hosts   sake    pyinfra   ansible
1       0.143   0.888     0.602
3       0.125   0.948     0.621
5       0.131   0.951     0.637
8       0.157   0.972     0.671
10      0.137   0.968     0.701
25      0.158   1.117     0.957
50      0.175   1.364     1.419
100     0.320   1.879     2.463
200     0.559   2.914     4.308
300     0.826   4.050     6.240
400     1.112   5.148     8.137
500     1.400   6.332     10.152
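The relative speed of the tools can be derived directly from the elapsed times. The sketch below copies the 1-host and 500-host rows from the table above and divides each tool's time by sake's time; the names ELAPSED and speedup are only illustrative.

```python
# Elapsed times in seconds, copied from the Test Case 1 table above
ELAPSED = {
    1: {"sake": 0.143, "pyinfra": 0.888, "ansible": 0.602},
    500: {"sake": 1.400, "pyinfra": 6.332, "ansible": 10.152},
}


def speedup(hosts: int, tool: str) -> float:
    """How many times faster sake is than `tool` for a given host count."""
    row = ELAPSED[hosts]
    return row[tool] / row["sake"]


if __name__ == "__main__":
    for hosts in ELAPSED:
        for tool in ("pyinfra", "ansible"):
            print(f"{hosts} hosts: sake is {speedup(hosts, tool):.1f}x faster than {tool}")
```

For this test case the ratio against pyinfra shrinks as the host count grows, while the ratio against Ansible grows.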

CPU (%)

hosts   sake    pyinfra   ansible
1       16      89        86
3       19      87        95
5       28      87        103
8       34      90        113
10      34      92        120
25      48      96        183
50      48      103       236
100     53      113       311
200     52      126       332
300     51      136       357
400     49      144       361
500     49      152       377

Memory (MB)

hosts   sake    pyinfra   ansible
1       15      56        55
3       16      56        55
5       18      56        55
8       21      58        56
10      21      58        55
25      25      60        55
50      27      61        56
100     29      63        58
200     40      68        61
300     48      72        65
400     59      76        70
500     66      80        76

Test Case 2

This is the test case where we run the following commands:

  1. Install htop
  2. Add a user
  3. Add a file
  4. Copy a file

Note the following:

  • After the first run, the subsequent runs won't change anything since the user and files already exist, so all the tasks are idempotent (even for sake)
  • Ansible and pyinfra provide robust modules that handle a lot more edge cases (and are prettier), whereas the ad-hoc sake tasks only handle the basic cases (if not existing, add)

[Figures: elapsed time]

Elapsed Time (seconds)

hosts   sake    pyinfra   ansible
1       0.156   1.568     1.174
3       0.159   1.487     1.210
5       0.175   1.456     1.268
8       0.203   1.489     1.436
10      0.246   1.489     1.535
25      0.274   2.180     2.742
50      0.439   2.982     4.717
100     0.664   4.711     8.947
200     1.006   8.371     17.410
300     1.425   12.108    26.110
400     1.911   15.797    34.888
500     2.443   19.418    44.075

CPU (%)

hosts   sake    pyinfra   ansible
1       14      54        67
3       27      65        86
5       39      70        104
8       53      74        124
10      61      76        138
25      70      77        236
50      111     86        258
100     168     95        305
200     147     101       324
300     138     105       339
400     135     109       352
500     157     112       365

Memory (MB)

hosts   sake    pyinfra   ansible
1       15      55        55
3       19      57        55
5       20      57        56
8       21      58        55
10      22      59        55
25      25      61        56
50      29      63        56
100     34      67        59
200     45      74        63
300     55      80        68
400     66      89        73
500     76      96        79