MIT License
Many photographers take pictures of birds and wonder what kind of bird they are actually looking at.
A bunch of data scientists have built a model to help them out.
While the model* performs well, a lot of corners were cut to get it to production**, and the service could certainly use some love from a software engineer.
Your task is to:
You can change all parts of the code as you see fit, however:
By the end of this task we would like to see what good-looking code is in your opinion and how much you can optimize latency.
Feel free to play around with the code as much as you like, but in the end we want to see:
Bonus
python classifier.py <url1> <url2>
The task doesn't have a fixed time constraint, but we certainly don't expect you to spend more than 8h.
pip install -r requirements.txt
python classifier.py
gl;hf
* The model: The sample model is taken from TensorFlow Hub: https://tfhub.dev/google/aiy/vision/classifier/birds_V1/1
The labels for model outputs can be found here: https://www.gstatic.com/aihub/tfhub/labelmaps/aiy_birds_V1_labelmap.csv
The model has been verified to run with TensorFlow 2.
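As a sketch of how the model and labels fit together (the `id,name` column layout of the labelmap and the helper names here are assumptions, not taken from the repo):

```python
import csv
import io

LABELMAP_URL = "https://www.gstatic.com/aihub/tfhub/labelmaps/aiy_birds_V1_labelmap.csv"
MODEL_URL = "https://tfhub.dev/google/aiy/vision/classifier/birds_V1/1"

def parse_labelmap(text: str) -> dict:
    """Parse the labelmap CSV (assumed 'id,name' header) into an index -> species dict."""
    reader = csv.DictReader(io.StringIO(text))
    return {int(row["id"]): row["name"] for row in reader}

def load_model():
    """Load the bird classifier from TF Hub (requires tensorflow + tensorflow_hub)."""
    import tensorflow_hub as hub  # heavy import kept local so the parser stays dependency-free
    # The layer expects a float32 batch of 224x224 RGB images scaled to [0, 1];
    # the argmax over its output indexes into the labelmap above.
    return hub.KerasLayer(MODEL_URL)
```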
** Production: The code was deployed as a python service using Docker with Kubernetes for the infrastructure layer.
There is a CLI application that can be run with python classifier.py
The application can be passed a list of URLS to images to classify. If no URLs are passed, the application will use a default list of URLs.
The application uses multiprocessing to download the images and classify them in parallel.
It accepts the following arguments:
Options:
  --spawn / --no-spawn    Spawn a new process for each image. [default: no-spawn]
  --workers INTEGER       Number of workers. [default: half the available cores]
  --install-completion [bash|zsh|fish|powershell|pwsh]
                          Install completion for the specified shell.
  --show-completion [bash|zsh|fish|powershell|pwsh]
                          Show completion for the specified shell, to copy it or customize the installation.
  --help                  Show the help message and exit.
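The parallel download-and-classify flow described above could look roughly like this (the function names and the exact default-worker formula are assumptions based on the option descriptions, not the actual implementation):

```python
import os
from multiprocessing import get_context

def default_workers() -> int:
    """Half the available cores, at least one (matches the --workers default)."""
    return max(1, (os.cpu_count() or 2) // 2)

def download_and_classify(url: str) -> str:
    """Placeholder: download the image at `url` and return the predicted species."""
    raise NotImplementedError

def classify_all(urls, workers=None, spawn=False):
    """Fan the URLs out over a process pool; --spawn switches the start method."""
    ctx = get_context("spawn" if spawn else None)
    with ctx.Pool(workers or default_workers()) as pool:
        return pool.map(download_and_classify, urls)
```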
There is also a webapp. It can be run with docker-compose build, docker-compose push and docker-compose up. It will be available at http://localhost:8000.
The interactive API docs can be found at http://127.0.0.1:8000/docs.
The webapp uses RQ workers to process the images; the number of workers can be configured in the docker-compose.yml file. Each image is classified by a separate worker. The webapp and the workers run in separate Docker containers and communicate via Redis. All images are tagged with a batch_id, which can be used to fetch the results for the whole batch.
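A minimal sketch of the batch pattern, assuming a worker module exposing a `classify_url` function (the helper names and the job-id format here are hypothetical, not from the repo):

```python
import uuid

def new_batch_id() -> str:
    """Opaque id that ties a set of image jobs together."""
    return uuid.uuid4().hex

def job_id_for(batch_id: str, index: int) -> str:
    """Deterministic per-image job id, so results can be fetched by batch."""
    return f"{batch_id}:{index}"

def enqueue_batch(urls, queue):
    """Enqueue one classification job per URL on an RQ queue; return the batch id.

    `queue` is an rq.Queue backed by Redis; results are later collected by
    fetching jobs whose ids share the batch prefix.
    """
    batch_id = new_batch_id()
    for i, url in enumerate(urls):
        queue.enqueue("worker.classify_url", url, job_id=job_id_for(batch_id, i))
    return batch_id
```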
The webapp tests can be run with python -m pytest . --ignore=data.