MIT License
> *"Speed is the most important feature."* — Fred Wilson
The Tsetlin Machine library, with zero external dependencies, performs blazingly fast: over 180 million MNIST predictions per second, with a throughput of 17 GB/s, achieved on a desktop CPU.
Here is a quick "Hello, World!" example of a typical use case.
Importing the necessary functions and the MNIST dataset:

```julia
using MLDatasets: MNIST
using .Tsetlin: TMInput, TMClassifier, train!, predict, accuracy, save, load, unzip, booleanize

x_train, y_train = unzip([MNIST(:train)...])
x_test, y_test = unzip([MNIST(:test)...])
```
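Here, `unzip` splits a vector of (image, label) pairs into a vector of images and a vector of labels. A minimal sketch of such a helper (illustrative only, not the library's implementation):

```julia
# Illustrative sketch: split a vector of (x, y) pairs into two vectors,
# one holding all first elements and one holding all second elements.
unzip_sketch(pairs) = (first.(pairs), last.(pairs))

xs, ys = unzip_sketch([(0.1, 0), (0.9, 1)])
# xs == [0.1, 0.9], ys == [0, 1]
```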
Booleanizing the input data (2 bits per pixel):

```julia
x_train = [booleanize(x, 0, 0.5) for x in x_train]
x_test = [booleanize(x, 0, 0.5) for x in x_test]
```
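A Tsetlin Machine consumes Boolean inputs, so each grayscale pixel must be turned into bits. As an illustration of threshold-based booleanization (a hypothetical sketch, not the library's actual `booleanize` implementation), comparing every pixel against two thresholds yields 2 bits per pixel:

```julia
# Hypothetical sketch of threshold-based booleanization: each pixel is
# compared against every threshold, producing one Boolean feature per
# (threshold, pixel) pair -- two thresholds give 2 bits per pixel.
function booleanize_sketch(x::AbstractArray, thresholds::Real...)
    return [pixel > t for t in thresholds for pixel in x]
end

bits = booleanize_sketch([0.0, 0.3, 0.7], 0, 0.5)
# 2 thresholds x 3 pixels = 6 Boolean features
```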
Compared to the Vanilla Tsetlin Machine, a few hyperparameters are different. The hyperparameter `R` is a float in the range `0.0` to `1.0`; to obtain `R` from the Vanilla `S` parameter, use the formula `R = S / (S + 1)`. The hyperparameter `L` limits the number of literals included in a clause. `best_tms_size` is the number of best TM models collected during the training process. After training, you can save this ensemble of models to your drive, or increase accuracy by applying Binomial Combinatorial Merge with the `combine()` function.
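As a quick sanity check of the conversion between the two parameterizations (a sketch in plain Julia; the helper names are mine, not part of the library):

```julia
# Convert the Vanilla Tsetlin Machine's S parameter to this library's R,
# and back. For example, S = 10.0 gives R = 10 / 11 ≈ 0.909.
to_R(S) = S / (S + 1)
to_S(R) = R / (1 - R)  # inverse of the formula above, assuming R < 1.0
```

Solving `R = S / (S + 1)` for `S` gives the inverse `S = R / (1 - R)`, which maps the `R = 0.94` used below back to roughly `S ≈ 15.7`.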
```julia
const EPOCHS = 1000
const CLAUSES = 2048
const T = 32
const R = 0.94
const L = 12
const best_tms_size = 500
```
Training the Tsetlin Machine over 1000 epochs and saving the best TM model to disk:

```julia
tm = TMClassifier{eltype(y_test)}(CLAUSES, T, R, L=L, states_num=256, include_limit=128)
tms = train!(tm, x_train, y_train, x_test, y_test, EPOCHS, best_tms_size=best_tms_size, best_tms_compile=true, shuffle=true, batch=true)
save(tms[1][1], "/tmp/tm_best.tm")
```
Loading the best Tsetlin Machine model and calculating the actual test accuracy:

```julia
tm = load("/tmp/tm_best.tm")
println(accuracy(predict(tm, x_test), y_test))
```
How to run the MNIST training example:

```shell
cd ./examples
julia --project=. -O3 -t 32 --gcthreads=32,1 mnist_simple.jl
```

where `32` is the number of your logical CPU cores.

The maximum MNIST inference speed achieved is 186 million predictions per second (with a throughput of 17 GB/s) in batch mode on a Ryzen 7950X3D desktop CPU, utilizing 32 threads.
Trained and optimized models can be found in `./examples/models/`.
How to run the MNIST inference benchmark:

```shell
cd ./examples
julia --project=. -O3 -t 32 mnist_benchmark_inference.jl
```

where `32` is the number of your logical CPU cores.