README.org

July 18, 2017 · View on GitHub

basho_bench

Build Status: [[https://travis-ci.org/mrallen1/basho_bench.svg]]

** Changes from upstream

All of the old built in drivers are gone (which makes it nice and clean to benchmark your not-riak project.)
You need to use rebar3 to compile the project.

** Overview

[[http://docs.basho.com/riak/latest/ops/building/benchmarking/][Additional documentation on docs.basho.com]]

Basho Bench is a benchmarking tool created to conduct accurate and repeatable performance tests and stress tests, and produce performance graphs.

Originally developed to benchmark Riak, it exposes a pluggable driver interface and has been extended to serve as a benchmarking tool across a variety of projects.

Basho Bench focuses on two metrics of performance:

Throughput: number of operations performed in a timeframe, captured in aggregate across all operation types
Latency: time to complete single operations, captured in quantiles per-operation

** Quick Start

You must have [[http://erlang.org/download.html][Erlang/OTP 19]] or later to build and run Basho Bench, and [[http://www.r-project.org/][R]] to generate graphs of your benchmarks. A sane GNU-style build system is also required if you want to use =make= to build the project.

#+BEGIN_SRC shell git clone git://github.com/mrallen1/basho_bench.git cd basho_bench make all #+END_SRC

This will build an executable script, =basho_bench=, which you can use to run one of the existing benchmark configurations from the =examples/= directory. You will likely have to make some minor directory changes to the configs in order to get the examples running (see, e.g., the source of the bitcask and innostore benchmark config files for direction).

#+BEGIN_SRC shell $ ./basho_bench examples/riakc_pb.config INFO: Est. data size: 95.37 MB INFO: Using target ip {127,0,0,1} for worker 1 INFO: Starting max worker: <0.55.0> #+END_SRC

At the end of the benchmark, results will be available in CSV format in the =tests/current/= directory. Now you can generate a graph:

#+BEGIN_SRC shell $make results priv/summary.r -i tests/current Loading required package: proto Loading required package: reshape Loading required package: plyr Loading required package: digest null device 1$ open tests/current/summary.png #+END_SRC

** Running as an Application

You can also run basho_bench as an included app rather than an escript. In order to compile and run basho_bench as an included app, you can use =make run=. You can also include basho_bench as a dependency in another app. basho_bench can be started with =basho_bench:start()= or from the command-line with =erl -s basho_bench=. Once you have a shell with basho_bench running, you can run benchmarks with the following:

#+BEGIN_SRC erlang basho_bench:setup_benchmark([]). basho_bench:run_benchmark(["/path/to/config"]). #+END_SRC

=basho_bench:setup_benchmark/1= accepts a proplist of configuration options. These options are the same as the options which are passed to basho_bench via the command line when compiled as an escript.

** Troubleshooting Graph Generation

If make results fails with the error =/usr/bin/env: Rscript --vanilla: No such file or directory= please edit priv/summary.r and replace the first line with the full path to the Rscript binary on your system

If you receive the error message =Warning: unable to access index for repository http://lib.stat.cmu.edu/R/CRAN/src/contrib= it means the default R repo for installing additional packages is broken, you can change it as follows:

#+BEGIN_SRC shell $ R

chooseCRANmirror() Selection: 69 quit() make results #+END_SRC

** Customizing your Benchmark Basho Bench has many drivers, each with its own configuration, and a number of key and value generators that you can use to customize your benchmark. It is also straightforward -- with less than 200 lines of Erlang code -- to create custom drivers that can exercise other systems or perform custom operations. These are covered more in detail in the [[http://docs.basho.com/riak/latest/ops/building/benchmarking/][documentation]].

*** Duration vs Op-Count termination Benchmarks by default are time-duration limited, meaning you specify a number of minutes for the benchmark to run using the duration configuration parameter. Additionally you may specify the op_counter configuration parameter, which is set up as an iteration-limited Key Generator, such as the partitioned_sequential_int. Once this op_counter has been exhausted for a particular worker, that worker will quit, waiting for other workers to complete before shutting down basho_bench.

Example Configuration:

{op_counter, {int_to_str, {partitioned_sequential_int, 100}}}.

#+BEGIN_SRC shell %% {escript_emu_args, "%%! -name bb@127.0.0.1 -setcookie YOUR_ERLANG_COOKIE\n"}. #+END_SRC

** Alternative Graph Generation by gnuplot You can generate graphs using gnuplot.

#+BEGIN_SRC shell $ ./priv/gp_throughput.sh #+END_SRC

#+BEGIN_SRC shell $ ./priv/gp_latencies.sh #+END_SRC

By passing =-h= option to each script, help messages are shown.

Some of options for these scripts are:

=-d TEST_DIR= : comma separated list of directories which include test result CSV files
=-t TERMINAL_TYPE= : gnuplot terminal type
=-P= : just print gnuplot script without drawing graph

For example, you can draw graphs with ASCII characters by the option =-t dumb=, which is useful in non-graphical environment or quick sharing of result in chat.

Also, you can plot multiple test runs on a single plot by using "-d" switch.

** Benchmarking Erlang cluster

A typical benchmark scenario is that Basho Bench spawn Erlang VM and executes the driver inside. However, there is needs to catch performance metrics from an application executed remotely within dedicated environment (e.g. probe performance from live system; benchmark an application inside C or Java node, etc). Bash Bench implements a generic =basho_bench_driver_cluster= that acts as proxy. It uses Erlang distribution to delegate benchmark responsibility to remote actor, which is randomly selected from configured pool.

Basho Bench do not define how the actors are spawned within SUT. It only defined a communication protocol. The actor is responsible to handle the message:

={pid(), atom(), key(), val()}=

=pid()= : request originator, actor shall respond to this process
=atom()= : id of operation to execute as defined in config file
=key()= : materialized key value as defined by key generator function
=val()= : materialized value as defined by value generator function

The actor executes the request, measures performance and respond to originator process =pid()= with one of the message ={ok, microsecond()}= or ={error, reason()}=

See cluster.config example for details. Use following command to spawn benchmark

#+BEGIN_SRC shell ./basho_bench -C nocookie -N bb@127.0.0.1 -J erlang@127.0.0.1 examples/cluster.config #+END_SRC