animus

Minimalistic framework to run machine learning experiments.

APACHE-2.0 License


Animus

One framework to rule them all.

Animus is a "write it yourself"-based machine learning framework. Please see examples/ for more information. Framework architecture is mainly inspired by Catalyst.

FAQ

Animus is a general-purpose, for-loop-based experiment wrapper. It divides an ML experiment into nested loops with straightforward logic:

def run(experiment):
    for epoch in experiment.epochs:
        for dataset in epoch.datasets:
            for batch in dataset.batches:
                handle_batch(batch)

Each for loop is wrapped with on_{for}_start, run_{for}, and on_{for}_end for customisation purposes. Moreover, each for loop has its own metrics storage: {for}_metrics (batch_metrics, dataset_metrics, epoch_metrics, experiment_metrics).
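The pattern described above can be sketched in pure Python. This is only an illustration of the on_{for}_start / run_{for} / on_{for}_end idea, not the actual Animus API: the class name ToyExperiment, its constructor arguments, and the "sum" metric are all hypothetical and chosen for brevity.

```python
class ToyExperiment:
    """Hypothetical illustration of the nested-loop pattern; NOT the Animus API."""

    def __init__(self, datasets, num_epochs):
        self.datasets = datasets      # assumed shape: {dataset name: list of batches}
        self.num_epochs = num_epochs
        # One metrics storage per loop level.
        self.batch_metrics = {}
        self.dataset_metrics = {}
        self.epoch_metrics = {}
        self.experiment_metrics = {}

    # ----- batch level -----
    def on_batch_start(self, batch):
        self.batch_metrics = {}

    def run_batch(self, batch):
        # Stand-in for real work (forward pass, loss, optimizer step, ...).
        self.batch_metrics["sum"] = sum(batch)

    def on_batch_end(self, batch):
        self.dataset_metrics["sum"] += self.batch_metrics["sum"]

    # ----- dataset level -----
    def on_dataset_start(self, name):
        self.dataset_metrics = {"sum": 0}

    def run_dataset(self, name):
        for batch in self.datasets[name]:
            self.on_batch_start(batch)
            self.run_batch(batch)
            self.on_batch_end(batch)

    def on_dataset_end(self, name):
        self.epoch_metrics[name] = dict(self.dataset_metrics)

    # ----- epoch level -----
    def on_epoch_start(self, epoch):
        self.epoch_metrics = {}

    def run_epoch(self, epoch):
        for name in self.datasets:
            self.on_dataset_start(name)
            self.run_dataset(name)
            self.on_dataset_end(name)

    def on_epoch_end(self, epoch):
        self.experiment_metrics[epoch] = dict(self.epoch_metrics)

    # ----- experiment level -----
    def on_experiment_start(self):
        self.experiment_metrics = {}

    def run_experiment(self):
        for epoch in range(1, self.num_epochs + 1):
            self.on_epoch_start(epoch)
            self.run_epoch(epoch)
            self.on_epoch_end(epoch)

    def on_experiment_end(self):
        pass

    def run(self):
        self.on_experiment_start()
        self.run_experiment()
        self.on_experiment_end()
        return self


exp = ToyExperiment({"train": [[1, 2], [3]], "valid": [[4]]}, num_epochs=2).run()
print(exp.experiment_metrics)
```

Customisation in this sketch means subclassing and overriding a single hook, for example run_batch, while the loop structure stays untouched.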

Any high-level ML/DL libraries, like Catalyst, Ignite, FastAI, Keras, etc.

Although I find high-level DL frameworks an essential step for the community and the spread of Deep Learning (I have written one myself), they have a few weaknesses.

First of all, they are usually tightly bound to a single "low-level" DL framework (JAX, PyTorch, TensorFlow). While "low-level" frameworks grow closer to each other every year, high-level frameworks introduce different syntactic sugar, which makes a fair comparison, or complementary use, of "low-level" frameworks impossible.

Secondly, high-level frameworks introduce high-level abstractions, which:

  • are built with some assumptions in mind, which could be wrong in your case,
  • can cause additional bugs - even "low-level" frameworks have quite a lot of them,
  • are really hard to debug/extend because of "user-friendly" interfaces and extra integrations.

While these issues may seem unimportant in common cases, like supervised learning with (features, targets), they become more and more important during research and heavy pipeline customization (e.g. privacy-aware multi-node distributed training with custom backpropagation).

Thirdly, many high-level frameworks try to divide the ML pipeline into data, hardware, model, etc. layers, making it easier for practitioners to start ML experiments and giving teams a tool to split pipeline responsibility between members. However, while this speeds up the creation of ML pipelines, it disregards that ML experiment results are heavily conditioned on the model hyperparameters, the data preprocessing/transformations/sampling, and the hardware setup together. I found this to be the main reason why ML experiments fail: you have to focus on the whole data transformation pipeline simultaneously, from raw data through the training process to distributed inference, which is quite hard. That is why Animus has the Experiment abstraction (Catalyst analog: IRunner), which connects all parts of the experiment: hardware backend, data transformations, model training, and validation/inference logic.
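To make the "everything in one place" idea concrete, here is a small pure-Python sketch. It is not taken from Animus: the class OneFileExperiment, its method names, the placeholder backend string, and the toy model y = weight * x are all assumptions made for illustration. The point is only that the data transformation, the hyperparameters, the training step, and the validation/inference logic sit in one object, so a change to any of them is visible next to the others.

```python
class OneFileExperiment:
    """Hypothetical sketch: every choice that affects the result lives in one object."""

    def __init__(self, raw_data, lr=0.5, num_epochs=20):
        self.backend = "cpu"          # hardware setup (placeholder, unused here)
        self.lr = lr                  # model hyperparameter
        self.num_epochs = num_epochs
        self.weight = 0.0             # the whole "model": y = weight * x
        self.raw_data = raw_data      # list of (x, y) pairs
        self.scale = 1.0              # set by the data transformation

    def setup_data(self):
        # Data transformation: scale features by their max absolute value.
        self.scale = max(abs(x) for x, _ in self.raw_data)
        data = [(x / self.scale, y) for x, y in self.raw_data]
        return data[:-1], data[-1:]   # last sample is held out for validation

    def train_step(self, x, y):
        # One SGD step on the squared error (weight * x - y) ** 2.
        grad = 2 * (self.weight * x - y) * x
        self.weight -= self.lr * grad

    def validate(self, valid):
        return sum((self.weight * x - y) ** 2 for x, y in valid) / len(valid)

    def predict(self, raw_x):
        # Inference must reuse the same transformation as training.
        return self.weight * (raw_x / self.scale)

    def run(self):
        train, valid = self.setup_data()
        for _ in range(self.num_epochs):
            for x, y in train:
                self.train_step(x, y)
        return self.validate(valid)


experiment = OneFileExperiment([(1, 2), (2, 4), (3, 6), (4, 8)])
valid_loss = experiment.run()
print(valid_loss, experiment.predict(5))
```

Note how predict depends on the scale computed in setup_data: if preprocessing lived in a separate layer, that coupling would be easy to miss.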

Highlight common "breakpoints" in ML experiments and provide a unified interface for them.

Research experiments, where you have to define everything on your own to get the results right.

No. The core uses only pure Python libraries. PyTorch and Keras could be used for extensions.

No. Animus core is about 300 lines of code, so it's much easier to read than 3000 lines of documentation.
