Class-Parallel Tsetlin Machine
October 27, 2022
Feedback decision tree visualised:
tsetlin_decision_tree-combined.pdf
Usage
./tm [options]
| option | type | description |
|---|---|---|
| -step-size | int | number of inputs per training step |
| -steps | int | total number of training steps |
| -s | float | learning rate s |
| -boost-pos | 0 or 1 | boost positive feedback; default is 0 |
| -t | float | threshold T |
| -ts | float,float,... | comma-separated threshold list; this way you can set different T values for each class |
| -tnorm | 0 or 1 | T values are normalized; default is 1 |
| -rand-seed | int | use a specific random seed, or the timer if 0 (default) |
| -acc-eval-train | int | currently disabled, does nothing |
| -acc-eval-test | int | use the test dataset to evaluate accuracy: 0 - don't evaluate, -1 - evaluate once at the end of training, n - evaluate every n steps |
| -log-tastates | | if specified, enable logging of the TA state spectrum |
| -log-status | | if specified, enable logging of the TM status variables and events |
| -log-acc | | if specified, enable logging of the accuracy evaluation results |
| -log-append | | if specified, append to the existing log files; default is rewrite |
| -load-state | string | if specified, load a previously saved state of the TM; the string value is a path format, where %d is replaced by the class index |
| -save-state | string | if specified, the state of the TM is saved after training; the string value is a path format, where %d is replaced by the class index |
| -train-mask | string | binary mask to enable training per class; default is 11111..., i.e., every class is training |
| -par | 0 or 1 | enable parallel execution; default is 1 |
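The path-format and mask options above can be read roughly as in the sketch below. It is an illustration only, not the tool's actual code: the helper names statePath and isClassTrained are hypothetical, and two points are assumptions not stated in the table, namely that the leftmost mask character corresponds to class 0 and that classes beyond the end of the mask default to training (suggested by the all-ones default).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: expand a path format such as "state/cls%d.bin"
// by replacing the first "%d" with the class index.
std::string statePath(const std::string& fmt, int classIndex) {
    assert(classIndex >= 0);
    std::string path = fmt;
    const std::size_t pos = path.find("%d");
    if (pos != std::string::npos) {
        path.replace(pos, 2, std::to_string(classIndex));
    }
    return path;
}

// Hypothetical helper: decide whether a class is trained under a binary mask.
// Assumption: mask[i] controls class i, and '1' enables training.
// Assumption: classes past the end of the mask are enabled.
bool isClassTrained(const std::string& mask, std::size_t classIndex) {
    if (classIndex >= mask.size()) {
        return true;
    }
    assert(mask[classIndex] == '0' || mask[classIndex] == '1');
    return mask[classIndex] == '1';
}
```

Under these assumptions, a mask of 1101 would freeze class 2 while the other classes keep training, and a format of state/cls%d.bin would yield one file per class.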
Example:
./tm -step-size 12000 -steps 5 -acc-eval-test 1 -log-acc
Train MNIST for 5 epochs, evaluate accuracy after each epoch (step) and log it.
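The evaluation schedule selected by -acc-eval-test can be sketched as follows. This is a hedged illustration of the documented values (0, -1, n), not code taken from the project: the function names are hypothetical, steps are assumed to be counted from 1, and "every n steps" is assumed to mean every step whose index is a multiple of n.

```cpp
#include <cassert>

// Hypothetical helper mirroring the documented -acc-eval-test values:
//    0 -> never evaluate
//   -1 -> evaluate once, after the final step
//    n -> evaluate after every n-th step (assumed: step index divisible by n)
// `step` is the 1-based index of the step that has just finished.
bool shouldEvaluate(int accEvalTest, int step, int totalSteps) {
    assert(accEvalTest >= -1);
    assert(step >= 1 && step <= totalSteps);
    if (accEvalTest == 0) {
        return false;
    }
    if (accEvalTest == -1) {
        return step == totalSteps;
    }
    return step % accEvalTest == 0;
}

// Count how many evaluations a whole run would perform.
int countEvaluations(int accEvalTest, int totalSteps) {
    int count = 0;
    for (int step = 1; step <= totalSteps; ++step) {
        if (shouldEvaluate(accEvalTest, step, totalSteps)) {
            ++count;
        }
    }
    return count;
}
```

With the example command (-steps 5 -acc-eval-test 1), this schedule gives one evaluation after each of the five steps.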
TsetlinOptions.h
Some parameters are hard-coded in TsetlinOptions.h and require recompilation when changed.
| option | type | value | description |
|---|---|---|---|
| FEATURES | int | (28*28) | number of input features; defaulted to MNIST |
| CLASSES | int | 10 | number of classes; defaulted to 10 for MNIST |
| CLAUSES | int | 200 | number of clauses per class |
| NUM_STATES | int | 100 | number of TA states per decision; exclude states are (-NUM_STATES+1) .. 0, include states are 1 .. NUM_STATES |
| LIT_LIMIT | 0 or 1 | 0 | toggle literal-limiting feedback algorithm |
| INPUT_DATA_PATH | char* | "pkbits" | path to input data directory |
| TRAIN_DATA_FMT | char* | "/mnist-train-cls%d.bin" | format of the train data input file (per-class, %d is replaced with the class index); the file is in pkbits format |
| TEST_DATA | char* | "/mnist-test.bin" | test data file name; the file is in pkbits format |
MNIST training and test data are included in pkbits.zip.
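The NUM_STATES state range described in the table can be illustrated with a short sketch. The include/exclude boundary follows the table; everything else is an assumption for illustration: the function names are hypothetical, NUM_STATES is redeclared here as a constant rather than taken from TsetlinOptions.h, and saturating at the ends of the range is the usual Tsetlin automaton behaviour rather than something this README confirms.

```cpp
#include <cassert>

// Value copied from the table; the real definition lives in TsetlinOptions.h.
constexpr int NUM_STATES = 100;

// States 1 .. NUM_STATES mean "include the literal";
// states (-NUM_STATES+1) .. 0 mean "exclude the literal".
bool includesLiteral(int state) {
    assert(state >= -NUM_STATES + 1 && state <= NUM_STATES);
    return state > 0;
}

// Hypothetical update: move one step toward include (+1) or exclude (-1).
// Assumption: the state saturates at both ends of the range.
int stepState(int state, int direction) {
    assert(direction == 1 || direction == -1);
    int next = state + direction;
    if (next > NUM_STATES) {
        next = NUM_STATES;
    }
    if (next < -NUM_STATES + 1) {
        next = -NUM_STATES + 1;
    }
    return next;
}
```

With this layout, each decision has 2 * NUM_STATES states in total, split evenly between exclude and include, and a single step from state 0 to state 1 flips the decision.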
Build instructions
Using precompiled logger files
Logger headers TsetlinLogger.h and TsetlinLoggerDefs.h were pre-built for the logger.xml configuration. If you don't need to change logger functionality, you can use these precompiled files.
To build:
make quick
To clean:
make clean
Recompile logger files
Do not change TsetlinLogger.h and TsetlinLoggerDefs.h directly. If you need to modify the logger, edit logger.xml and rebuild the logger files.
To build:
make all
Requirements:
- Java 8 (JDK); install with: sudo apt install openjdk-8-jdk
- Logger generator
Edit the GEN_LOGGER variable in the makefile to set the path to AuxTsetlinTools.
To clean, including TsetlinLogger.h and TsetlinLoggerDefs.h:
make cleanall
More information on the logger generator and the logger.xml specification: Logger XML specification
Plotting diagrams
Logged TM status variables can be plotted as SVG images using Jython scripts in the plots folder.
Requirements:
Download the required JAR files and put them into the plots folder. Modify the dataPath variable in the All.py script to point to the location of the acc.csv and *-status.csv files generated by the TM logging.
To plot diagrams, use:
java -jar jython-standalone-2.7.2.jar All.py > All.svg
There is a separate tool for drawing TA state spectrum diagrams: TA State Spectrogram