sklearn2docker

Convert your trained scikit-learn classifier to a Docker container with a pre-configured API

Installation

The easiest way to install sklearn2docker with all its dependencies is through pip:

pip install git+https://github.com/KhaledSharif/sklearn2docker.git

Getting started

First, train a scikit-learn classifier. This example uses the Iris dataset.

from pandas import DataFrame
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

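# Load the Iris data and wrap the features in a DataFrame with named columns
iris = load_iris()
input_df = DataFrame(data=iris['data'], columns=iris['feature_names'])
# Train a shallow decision tree on the raw feature values
clf = DecisionTreeClassifier(max_depth=2)
clf.fit(input_df.values, iris['target'])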

Second, import the Sklearn2Docker class and use it to build your container.

from sklearn2docker.constructor import Sklearn2Docker

s2d = Sklearn2Docker(
    classifier=clf,
    feature_names=iris['feature_names'],
    class_names=iris['target_names'].tolist()
)
s2d.save(name="classifier", tag="iris")

The name and tag arguments passed to the save function become the name and tag of the Docker image that was just built (see: docker tag). Below is an example of the output printed by the s2d.save() call above.

Now attempting to run the command: 
[docker build --file /tmp/tmpywbu3_ad/Dockerfile 
 --tag classifier:iris /tmp/tmpywbu3_ad]
=====================================================================
> Sending build context to Docker daemon
> Step 1/6 : FROM python:3.6
> ---> c1e459c00dc3
... output truncated ...
> Step 6/6 : ENTRYPOINT python /code/api.py
> ---> Running in bd61983358d9
> Removing intermediate container bd61983358d9
> ---> fa2041ac6d60
> Successfully built fa2041ac6d60
> Successfully tagged classifier:iris
=====================================================================
Success! You can now run your Docker container using the following command:
	 docker run -d -p 5000:5000 classifier:iris

You can now test your container by asking it to predict the same Iris dataset and return the predicted probabilities (see: predict_proba) as a DataFrame.

from os import system
system("docker run -d -p 5000:5000 classifier:iris && sleep 5")

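from io import StringIO
from requests import post
from pandas import read_json

# POST the Iris features to the container as split-oriented JSON
response = post("http://localhost:5000/predict_proba/split", json=input_df.to_json(orient="split"))
# Wrap the body in StringIO: recent pandas versions deprecate passing a raw JSON string
result = read_json(StringIO(response.content.decode()), orient="split")
print(result.head())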

   setosa  versicolor  virginica
0       1         0.0        0.0
1       1         0.0        0.0
2       1         0.0        0.0
3       1         0.0        0.0
4       1         0.0        0.0
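
The fixed `sleep 5` in the example above is a guess at how long the container needs to start. A minimal sketch of an alternative is to poll the API until it answers. The helper name `wait_for_http` and its parameters are illustrative and not part of sklearn2docker. It treats any HTTP response, including an error status, as a sign that the server is up.

```python
import http.client
import time
import urllib.error
import urllib.request


def wait_for_http(url, timeout=30.0, interval=0.5):
    """Return True once `url` yields any HTTP response, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            urllib.request.urlopen(url, timeout=2.0).close()
            return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status such as 404 or 405
            return True
        except (OSError, http.client.HTTPException):
            # Refused or reset connections and timeouts: the server is not ready yet
            time.sleep(interval)
    return False
```

Under these assumptions, calling `wait_for_http("http://localhost:5000/predict/split")` after `docker run` could replace the fixed sleep. The exact status code the container returns for a GET request is not documented here, which is why the sketch accepts any response.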

You can also request plain class predictions (see: predict). The URL of your container has the following format:

http://[a]:[b]/[c]/[d]

a: the hostname of the container, defaults to `localhost`
b: the port of the container, defaults to 5000
c: one of `predict` or `predict_proba`, mirroring the scikit-learn API
d: the orient used for the pandas DataFrame JSON conversion, defaults to `split`*

(*: see this documentation article for more information about pandas orients, and this GitHub issue for a comparison; most of the time, setting the orient to split should do just fine)
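
The URL scheme above can be sketched as a small helper. The function name `endpoint_url` is hypothetical and not part of sklearn2docker. The defaults follow the values listed above, and the check on `method` reflects the two endpoints named there.

```python
def endpoint_url(method, host="localhost", port=5000, orient="split"):
    """Build http://[host]:[port]/[method]/[orient] for the container's API."""
    if method not in ("predict", "predict_proba"):
        raise ValueError(f"method must be 'predict' or 'predict_proba', got {method!r}")
    return f"http://{host}:{port}/{method}/{orient}"


print(endpoint_url("predict"))  # http://localhost:5000/predict/split
```

The result can be passed straight to `post` in place of a hard-coded URL string.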

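from io import StringIO

response = post(
    "http://localhost:5000/predict/split",
    json=input_df.to_json(orient="split")
)
result = read_json(StringIO(response.content.decode()), orient="split")
print(result.head())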

  prediction
0     setosa
1     setosa
2     setosa
3     setosa
4     setosa
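
A client without pandas can still read the responses. The sketch below assumes the response body is a split-oriented JSON object with `columns`, `index`, and `data` keys, as the `read_json(..., orient="split")` call earlier implies. The helper name `split_to_records` and the sample payload are illustrative only.

```python
import json


def split_to_records(payload):
    """Convert split-oriented JSON text into a list of {column: value} dicts."""
    table = json.loads(payload)
    return [dict(zip(table["columns"], row)) for row in table["data"]]


# Hand-written sample in pandas' "split" layout, mirroring the output above
sample = '{"columns": ["prediction"], "index": [0, 1], "data": [["setosa"], ["setosa"]]}'
print(split_to_records(sample))
# [{'prediction': 'setosa'}, {'prediction': 'setosa'}]
```

The sketch discards the `index` entries because the examples here use a default integer index.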
