Class-Parallel Tsetlin Machine

October 27, 2022

Feedback decision tree visualised:
tsetlin_decision_tree-combined.pdf

Usage

./tm [options]
| option | type | description |
|---|---|---|
| -step-size | int | number of inputs per training step |
| -steps | int | total number of training steps |
| -s | float | learning rate s |
| -boost-pos | 0 or 1 | boost positive feedback; default is 0 |
| -t | float | threshold T |
| -ts | float,float,... | comma-separated threshold list; sets a different T value for each class |
| -tnorm | 0 or 1 | T values are normalized; default is 1 |
| -rand-seed | int | use a specific random seed, or a timer-based seed if 0 (default) |
| -acc-eval-train | int | currently disabled, does nothing |
| -acc-eval-test | int | use the test dataset to evaluate accuracy: 0 = don't evaluate, -1 = evaluate once at the end of training, n = evaluate every n steps |
| -log-tastates | | if specified, enable logging of the TA state spectrum |
| -log-status | | if specified, enable logging of the TM status variables and events |
| -log-acc | | if specified, enable logging of the accuracy evaluation results |
| -log-append | | if specified, append to the existing log files; default is rewrite |
| -load-state | string | if specified, load a previously saved state of the TM; the string value is a path format, where %d is replaced by a class index |
| -save-state | string | if specified, save the state of the TM after training; the string value is a path format, where %d is replaced by a class index |
| -train-mask | string | binary mask to enable training per class; default is 11111..., i.e., every class is training |
| -par | 0 or 1 | enable parallel execution; default is 1 |
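The -ts and -train-mask options both carry one entry per class. The Python sketch below shows one plausible reading of that mapping; it is not the parser used by tm. The helper names parse_thresholds and parse_train_mask are hypothetical, and the sketch assumes that entry i applies to class i and that exactly one entry per class is supplied.

```python
CLASSES = 10  # matches the default CLASSES value in TsetlinOptions.h


def parse_thresholds(ts_arg, classes=CLASSES):
    """Split a '-ts' style value such as '15,15,20' into one float per class.

    Assumption: entry i applies to class i, one entry per class is required.
    """
    values = [float(v) for v in ts_arg.split(",")]
    if len(values) != classes:
        raise ValueError("expected %d thresholds, got %d" % (classes, len(values)))
    return values


def parse_train_mask(mask_arg, classes=CLASSES):
    """Turn a '-train-mask' style value such as '1111100000' into booleans.

    Assumption: character i controls class i; '1' enables training, '0' disables it.
    """
    if len(mask_arg) != classes or any(c not in "01" for c in mask_arg):
        raise ValueError("mask must be %d characters of 0 or 1" % classes)
    return [c == "1" for c in mask_arg]


if __name__ == "__main__":
    # Hypothetical values: a higher T for class 5, training only classes 0..4.
    print(parse_thresholds("10,10,10,10,10,12.5,10,10,10,10"))
    print(parse_train_mask("1111100000"))
```

How tm itself treats a list or mask whose length differs from CLASSES is not documented here, so the sketch simply rejects such input.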

Example:

./tm -step-size 12000 -steps 5 -acc-eval-test 1 -log-acc

Train MNIST for 5 epochs, evaluate accuracy after each epoch (step) and log it.
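The three modes of -acc-eval-test can be sketched as a small scheduling function. This is an illustration in Python, not code from tm; the function name should_evaluate is hypothetical, and counting steps from 1 is an assumption.

```python
def should_evaluate(step, total_steps, acc_eval_test):
    """Return True if accuracy would be evaluated after the given step.

    step is 1-based (an assumption of this sketch). acc_eval_test follows the
    documented values: 0 = never, -1 = once at the end, n > 0 = every n steps.
    """
    if acc_eval_test == 0:
        return False
    if acc_eval_test == -1:
        return step == total_steps
    if acc_eval_test < 0:
        raise ValueError("only 0, -1 and positive values are documented")
    return step % acc_eval_test == 0


if __name__ == "__main__":
    # Mirrors the example above: -steps 5 -acc-eval-test 1
    print([s for s in range(1, 6) if should_evaluate(s, 5, 1)])
```

With the example settings, every one of the 5 steps triggers an evaluation; with -acc-eval-test -1, only the last step would.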

TsetlinOptions.h

Some parameters are hard-coded in TsetlinOptions.h and require recompilation when changed.

| option | type | value | description |
|---|---|---|---|
| FEATURES | int | (28*28) | number of input features; default is set for MNIST |
| CLASSES | int | 10 | number of classes; default is 10 for MNIST |
| CLAUSES | int | 200 | number of clauses per class |
| NUM_STATES | int | 100 | number of TA states per decision; exclude states are (-NUM_STATES+1) .. 0, include states are 1 .. NUM_STATES |
| LIT_LIMIT | 0 or 1 | 0 | toggle literal-limiting feedback algorithm |
| INPUT_DATA_PATH | char* | "pkbits" | path to input data directory |
| TRAIN_DATA_FMT | char* | "/mnist-train-cls%d.bin" | format of the train data input file (per-class, %d is replaced with the class index); the file is in pkbits format |
| TEST_DATA | char* | "/mnist-test.bin" | test data file name; the file is in pkbits format |
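The NUM_STATES state ranges can be made concrete with a short Python sketch. It only illustrates the numbering described in the table and is not taken from the project sources. The names MIN_STATE, MAX_STATE, is_include and step_state are hypothetical, and clamping at the ends of the range is an assumption.

```python
NUM_STATES = 100  # default from TsetlinOptions.h

MIN_STATE = -NUM_STATES + 1  # deepest exclude state (-99 by default)
MAX_STATE = NUM_STATES       # deepest include state (100 by default)


def is_include(state):
    """A TA includes its literal when its state lies in 1 .. NUM_STATES."""
    return state > 0


def step_state(state, delta):
    """Move a TA state by delta, keeping it inside MIN_STATE .. MAX_STATE.

    Clamping at the boundaries is an assumption of this sketch.
    """
    return max(MIN_STATE, min(MAX_STATE, state + delta))


if __name__ == "__main__":
    print(MAX_STATE - MIN_STATE + 1)      # total states per TA
    print(is_include(0), is_include(1))   # 0 is the last exclude state
    print(step_state(MAX_STATE, +1))      # stays at the upper bound
```

With the default value, each TA has 2 * NUM_STATES = 200 states, split evenly between exclude (-99 .. 0) and include (1 .. 100).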

MNIST training and test data is included in pkbits.zip.
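With the default values, the per-class training files and the test file would resolve as in the Python sketch below. It assumes the directory and file name strings are simply concatenated, which the leading slash in the defaults suggests but the table does not state. The helper names are hypothetical.

```python
INPUT_DATA_PATH = "pkbits"
TRAIN_DATA_FMT = "/mnist-train-cls%d.bin"
TEST_DATA = "/mnist-test.bin"
CLASSES = 10


def train_data_path(class_index):
    """Expand %d with the class index, as described for TRAIN_DATA_FMT.

    Assumption: the directory and the file format are concatenated as-is.
    """
    return INPUT_DATA_PATH + (TRAIN_DATA_FMT % class_index)


def eval_data_path():
    """Path of the single test data file under the same assumption."""
    return INPUT_DATA_PATH + TEST_DATA


if __name__ == "__main__":
    for cls in range(CLASSES):
        print(train_data_path(cls))
    print(eval_data_path())
```

The -load-state and -save-state options use the same %d convention for the class index in their path formats.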

Build instructions

Using precompiled logger files

Logger headers TsetlinLogger.h and TsetlinLoggerDefs.h were pre-built for the logger.xml configuration. If you don't need to change logger functionality, you can use these precompiled files.

To build:

make quick

To clean:

make clean

Recompile logger files

Do not change TsetlinLogger.h and TsetlinLoggerDefs.h directly. If you need to modify the logger, edit logger.xml and rebuild the logger files.

To build:

make all

Requirements:

Edit the GEN_LOGGER variable in the makefile to set the path to AuxTsetlinTools.

To clean including TsetlinLogger.h and TsetlinLoggerDefs.h:

make cleanall

More information on logger generator and logger.xml specification: Logger XML specification

Plotting diagrams

Logged TM status variables can be plotted as SVG images using Jython scripts in the plots folder.

Requirements:

Download the required JAR files (such as jython-standalone-2.7.2.jar, used in the command below) and put them into the plots folder. Modify the dataPath variable in the All.py script to point to the location of the acc.csv and *-status.csv files generated by the TM logging.

To plot diagrams, use:

java -jar jython-standalone-2.7.2.jar All.py > All.svg

There is a separate tool for drawing TA spectrum diagrams: TA State Spectrogram