Self-Supervised Learning for Fine-Grained Image Categorization
MIT License
This repository contains the implementation of the project "Self-Supervised Learning for Fine-Grained Categorization". The project examines the effectiveness of various self-supervised learning (SSL) methods on a fine-grained visual categorization (FGVC) problem: self-supervision is implemented as an auxiliary task alongside a baseline FGVC model. Specifically, the repository provides implementations of rotation prediction [1], pretext-invariant representation learning (PIRL) [2], and destruction and construction learning (DCL) [3] as auxiliary tasks for the baseline model.
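As a minimal illustration (a generic sketch, not the repository's implementation), the rotation pretext task [1] can be expressed as follows: each image is rotated by one of four angles, and the auxiliary head is trained to predict the rotation index.

```python
import numpy as np

def make_rotation_batch(images):
    """Expand a batch for the rotation pretext task: every image is
    rotated by 0/90/180/270 degrees, and the rotation index (0-3)
    serves as the self-supervised label.

    Assumes square images so all rotations have the same shape."""
    rotated, labels = [], []
    for img in images:
        for k in range(4):
            rotated.append(np.rot90(img, k))  # k quarter-turns
            labels.append(k)
    return np.stack(rotated), np.array(labels)
```

The auxiliary rotation head is then trained with an ordinary 4-way cross-entropy loss on these labels, alongside the baseline FGVC classification loss.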
The list of implemented model architectures can be found here.
All the functionality of this repository can be accessed through a .yml configuration file. Details of the configuration parameters can be found here. We also provide sample configuration files at ./config/* for each implemented method, as listed below.
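For orientation only, such a pipeline configuration typically looks like the sketch below; the keys shown here are hypothetical and should be checked against the sample files under ./config/.

```yml
# Hypothetical sketch of a pipeline configuration -- the actual keys
# are defined by the sample files under ./config/.
model:
  name: resnet50          # backbone architecture
  num_classes: 200
ssl_task: rotation        # one of: rotation, pirl, dcl
train:
  epochs: 100
  batch_size: 16
  lr: 0.001
dataset:
  root: ./data
```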
It is recommended to create a new conda environment for this project. The installation steps are as follows:
$ conda create --name=ssl_for_fgvc python=3.8
$ conda activate ssl_for_fgvc
$ pip install -r requirements.txt
All the pretrained models can be found at click_me.
To evaluate a model, download the model checkpoints from the link and use the scripts/evaluate.py script to evaluate the model on the test set.
$ cd scripts
$ python evaluate.py --config_path=<path to the corresponding configuration '.yml' file.> \
--model_checkpoints=<path to the downloaded model checkpoints> \
--root_dataset_path=<path to the dataset root directory>
If the --root_dataset_path command-line parameter is not provided to the evaluate.py script, it will download the dataset automatically and then perform the testing. Downloading the data may take some time depending on network speed and stability. For more information run,
$ python evaluate.py --help
For example, to evaluate the DCL model, download the corresponding checkpoints (say, into the scripts directory as ssl_dcl/best_checkpoints.pth) and run the following commands.
$ cd scripts
$ python evaluate.py --config_path=../config/ssl_dcl.yml --model_checkpoints=./ssl_dcl/best_checkpoints.pth
The expected outputs after running the command are given below.
The end-to-end training functionality can be accessed using the main.py script. The script takes a pipeline configuration (.yml) file as a command-line parameter and initiates the corresponding training.
$ python main.py --config_path=<path to the corresponding configuration '.yml' file.>
For more information run,
$ python main.py --help
For example, to train a DCL model run,
$ python main.py --config_path=./config/ssl_dcl.yml
The expected outputs after running the command are given below.
The repository also provides functionality to generate class activation maps (CAMs) for a trained model on the whole test set. This functionality is exposed by the scripts/cam_visualizations.py script. Run the following commands to generate CAMs for a trained model.
$ cd scripts
$ python cam_visualizations.py --config_path=<path to the corresponding configuration '.yml' file.> \
--model_checkpoints=<path to the downloaded model checkpoints> \
--root_dataset_path=<path to the dataset root directory> \
--output_directory=<path to output directory to save the visualizations>
If the --root_dataset_path parameter is not provided, the program will automatically download the dataset and then generate the visualizations. For more information run,
$ python cam_visualizations.py --help
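As background (a generic sketch, not the repository's implementation), a CAM is the class-specific weighted sum of the final convolutional feature maps, weighted by the corresponding row of the classifier's fully connected layer:

```python
import numpy as np

def class_activation_map(feature_maps, fc_weights, class_idx):
    """Compute a CAM for one class.

    feature_maps: (C, H, W) activations of the last conv layer
    fc_weights:   (num_classes, C) weights of the final FC layer
    Returns an (H, W) map normalized to [0, 1] for visualization."""
    # Contract the channel axis: sum_c w[class, c] * feature_maps[c]
    cam = np.tensordot(fc_weights[class_idx], feature_maps, axes=1)
    cam -= cam.min()
    if cam.max() > 0:
        cam /= cam.max()
    return cam
```

In practice the resulting map is upsampled to the input resolution and overlaid on the image as a heatmap.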
We also provide a Dockerfile for containerization and a docker-compose.yml file for running the training as a service. Follow the steps below to run the training as a service,
$ cd scripts
$ bash install_docker_dependencies.sh
$ docker build -t ssl_for_fgvc:v1.0 .
where ssl_for_fgvc:v1.0 is the Docker image name and tag.
$ docker-compose up -d
$ docker-compose logs -f ssl_for_fgvc
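For reference, a docker-compose.yml for this setup would look roughly like the sketch below. The image and service names are taken from the commands above; the command and volume mount are hypothetical and should be checked against the repository's docker-compose.yml.

```yml
# Rough sketch -- consult the repository's docker-compose.yml for the
# actual service definition.
version: "3"
services:
  ssl_for_fgvc:               # service name used in `docker-compose logs`
    image: ssl_for_fgvc:v1.0  # image built in the step above
    command: python main.py --config_path=./config/ssl_dcl.yml  # hypothetical
    volumes:
      - ./:/workspace         # hypothetical project-directory mount
```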
[1] Gidaris, Spyros, et al. "Boosting few-shot visual learning with self-supervision."
[2] Misra, Ishan, and Laurens van der Maaten. "Self-supervised learning of pretext-invariant representations."
[3] Chen, Yue, et al. "Destruction and construction learning for fine-grained image recognition."
[4] Sun, Guolei, et al. "Fine-grained recognition: Accounting for subtle differences between similar classes."