Learnable-Image-Resizing

TensorFlow 2 implementation of Learning to Resize Images for Computer Vision Tasks by Talebi et al.

Accompanying blog post on keras.io: Learning to Resize in Computer Vision.

The above-mentioned paper proposes a simple framework for learning optimal image representations for a given network architecture and a given image resolution (such as 224x224). The authors find that representations that are more coherent with the human perception system do not always improve the performance of vision models. Instead, optimizing the representations so that they are better suited to the models can substantially improve performance.

The diagram presents the proposed learnable resizer module (source: original paper):

Here's what the resized images look like after being passed through a learned resizer:

On the left-hand side, we see the outputs of an untrained learnable resizer. On the right, the outputs are from the same learnable resizer after 10 epochs of training. The images may not make sense to our eyes in terms of perceptual quality, but they help improve the recognition performance of vision models.

About the notebooks

Standard_Training.ipynb: Shows how to train a DenseNet-121 on the Cats and Dogs dataset with bilinear resizing (150 x 150).
Learnable_Resizer.ipynb: Shows how to train the same network with the learnable resizing module included. Here, the inputs are first resized to 300 x 300 and then the learnable resizer module helps learn optimal representations for 150 x 150.
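
The resizer used in the second notebook can be sketched in Keras roughly as follows. This is a minimal sketch loosely following the module described in the paper, not necessarily the notebook's exact code: the function names (`conv_block`, `res_block`, `build_learnable_resizer`), the filter count of 16, the single residual block, and the LeakyReLU slope of 0.2 are all assumptions made for illustration.

```python
import tensorflow as tf

layers = tf.keras.layers

TARGET_SIZE = (150, 150)  # output resolution, as stated for the notebooks


def conv_block(x, filters, kernel_size, activation=True):
    # Conv -> BatchNorm -> optional LeakyReLU (assumed ordering).
    x = layers.Conv2D(filters, kernel_size, padding="same", use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    if activation:
        x = layers.LeakyReLU(0.2)(x)
    return x


def res_block(x, filters):
    # Two conv blocks wrapped in an identity skip connection.
    skip = x
    x = conv_block(x, filters, 3)
    x = conv_block(x, filters, 3, activation=False)
    return layers.Add()([skip, x])


def build_learnable_resizer(filters=16, num_res_blocks=1, interpolation="bilinear"):
    # Spatial dims are left open so any input resolution (e.g. 300 x 300) works.
    inputs = layers.Input(shape=(None, None, 3))

    # Branch 1: plain (non-learned) resize of the raw input.
    naive = layers.Resizing(*TARGET_SIZE, interpolation=interpolation)(inputs)

    # Branch 2: extract features at the input resolution, then resize them.
    x = layers.Conv2D(filters, 7, padding="same")(inputs)
    x = layers.LeakyReLU(0.2)(x)
    x = layers.Conv2D(filters, 1, padding="same")(x)
    x = layers.LeakyReLU(0.2)(x)
    x = layers.BatchNormalization()(x)
    bottleneck = layers.Resizing(*TARGET_SIZE, interpolation=interpolation)(x)

    # Residual processing at the target resolution.
    x = bottleneck
    for _ in range(num_res_blocks):
        x = res_block(x, filters)
    x = conv_block(x, filters, 3, activation=False)
    x = layers.Add()([bottleneck, x])

    # Project back to 3 channels and add the learned residual to the naive resize.
    x = layers.Conv2D(3, 7, padding="same")(x)
    outputs = layers.Add()([naive, x])
    return tf.keras.Model(inputs, outputs, name="learnable_resizer")
```

Calling `build_learnable_resizer()(images)` on a batch of 300 x 300 images yields a 150 x 150 batch that can be fed to the classifier. With the assumed sizes above, the module has 12,163 parameters, which equals the gap between the two rows in the Results table.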

Both notebooks incorporate mixed-precision training along with distributed training.
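
Combining mixed precision with distributed training in TensorFlow 2 typically looks like the sketch below. It is an illustration under assumptions rather than the notebooks' exact setup: the `mixed_float16` policy, `MirroredStrategy`, the Adam optimizer, random initialization (`weights=None`), and the helper name `build_model` are all choices made here to keep the example self-contained.

```python
import tensorflow as tf

# Compute in float16 while keeping variables in float32.
tf.keras.mixed_precision.set_global_policy("mixed_float16")

# Mirrors the model across all visible GPUs; with none available it runs
# on a single device.
strategy = tf.distribute.MirroredStrategy()


def build_model(num_classes=2, image_size=150):
    # Model creation and compilation must happen inside the strategy scope
    # so that variables are created as mirrored variables.
    with strategy.scope():
        backbone = tf.keras.applications.DenseNet121(
            include_top=False,
            weights=None,  # assumption: random init, avoids a weight download
            pooling="avg",
            input_shape=(image_size, image_size, 3),
        )
        # Keep the classification head in float32 for numerical stability.
        logits = tf.keras.layers.Dense(num_classes, dtype="float32")(backbone.output)
        model = tf.keras.Model(backbone.input, logits)
        model.compile(
            optimizer="adam",
            loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
            metrics=["accuracy"],
        )
    return model
```

Keeping the final layer in float32 is the usual recommendation under a float16 policy, since the loss is computed from its output.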

Results

Model	Number of parameters (Million)	Top-1 accuracy
With learnable resizer	7.051717	67.67%
Without learnable resizer	7.039554	60.19%

Both models were trained for only 10 epochs from the same initial checkpoint.
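
To put the table in perspective, the short calculation below works out the parameter overhead of the resizer and the accuracy gain. The numbers are taken directly from the table; the variable names are only illustrative.

```python
# Values copied from the Results table (parameters converted from millions).
params_with_resizer = 7_051_717
params_without_resizer = 7_039_554
top1_with_resizer = 67.67     # percent
top1_without_resizer = 60.19  # percent

# Extra parameters introduced by the learnable resizer.
extra_params = params_with_resizer - params_without_resizer

# Overhead relative to the baseline model, in percent.
overhead_pct = 100 * extra_params / params_without_resizer

# Absolute gain in top-1 accuracy, in percentage points.
accuracy_gain = round(top1_with_resizer - top1_without_resizer, 2)

print(f"extra parameters: {extra_params}")
print(f"relative overhead: {overhead_pct:.2f}%")
print(f"top-1 gain: {accuracy_gain} points")
```

In other words, the resizer adds roughly 12 thousand parameters (under 0.2% of the model) for a gain of about 7.5 percentage points in this 10-epoch setting.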

You can reproduce these results with the model weights provided here.

Paper citation

@InProceedings{Talebi_2021_ICCV,
    author    = {Talebi, Hossein and Milanfar, Peyman},
    title     = {Learning To Resize Images for Computer Vision Tasks},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {497-506}
}

Acknowledgements

ML-GDE program for providing GCP credit support.
Mark Doust (of Google) for feedback.
