torchdistill

A coding-free framework built on PyTorch for reproducible deep learning studies. 🏆 25 knowledge distillation methods presented at CVPR, ICLR, ECCV, NeurIPS, ICCV, etc. are implemented so far. 🎁 Trained models, training logs, and configurations are available to ensure reproducibility and benchmarking.
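For a rough sense of what a knowledge distillation method computes, here is the vanilla KD loss (Hinton et al.) written in plain PyTorch. This is an illustrative sketch, not torchdistill's own API, and the temperature/alpha values are arbitrary.

```python
import torch.nn.functional as F

def vanilla_kd_loss(student_logits, teacher_logits, targets, temperature=4.0, alpha=0.9):
    """Weighted sum of the softened KL term and standard cross entropy.
    Hyperparameter values here are illustrative, not torchdistill defaults."""
    soft_targets = F.softmax(teacher_logits / temperature, dim=1)
    log_probs = F.log_softmax(student_logits / temperature, dim=1)
    kd_term = F.kl_div(log_probs, soft_targets, reduction='batchmean') * temperature ** 2
    ce_term = F.cross_entropy(student_logits, targets)
    return alpha * kd_term + (1.0 - alpha) * ce_term
```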

MIT License · 1.6K downloads · 1.4K stars · 3 committers


torchdistill - Support more detailed training configs and update official configs

Published by yoshitomo-matsubara over 3 years ago

Updated official README and configs

  • More detailed instructions (PRs #55, #56)
  • Restructured official configs (PR #55)
  • Updated FT config for ImageNet (PR #55)

Support detailed training configurations

  • Step-wise parameter updates in addition to epoch-wise updates (PR #58)
  • Gradient accumulation (PR #58)
  • Max gradient norm (PR #58)
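Conceptually, the gradient accumulation and max-gradient-norm options correspond to a training loop like the following minimal plain-PyTorch sketch (names such as accumulation_steps and max_grad_norm are illustrative, not the framework's config keys):

```python
import torch

def train_one_epoch(model, loader, optimizer, criterion, accumulation_steps=4, max_grad_norm=1.0):
    model.train()
    optimizer.zero_grad()
    for step, (inputs, targets) in enumerate(loader):
        loss = criterion(model(inputs), targets)
        # Scale the loss so that accumulated gradients match one larger batch.
        (loss / accumulation_steps).backward()
        if (step + 1) % accumulation_steps == 0:
            # Clip the global gradient norm before the parameter update.
            torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
            optimizer.step()
            optimizer.zero_grad()
```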

Bug/Typo fixes

  • Bug fixes (PRs #54, #57)
  • Typo fixes (PRs #53, #58)

torchdistill - Google Colab Examples and bug fixes

Published by yoshitomo-matsubara almost 4 years ago

New examples

  • Added sample configs for CIFAR-10 and CIFAR-100 datasets
    1. Training without teacher (i.e., using TrainingBox) for CIFAR-10 and CIFAR-100 (PR #48)
    2. Knowledge distillation for CIFAR-10 and CIFAR-100 (PR #50)
  • Added Google Colab examples (PR #51)
    1. Training without teacher for CIFAR-10 and CIFAR-100
    2. Knowledge distillation for CIFAR-10 and CIFAR-100
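"Training without teacher" here is ordinary supervised training. A minimal plain-PyTorch/torchvision sketch for CIFAR-10 (not the config-driven TrainingBox workflow itself; the model choice and hyperparameters are arbitrary) looks like this:

```python
import torch
import torchvision
from torch.utils.data import DataLoader
from torchvision import transforms

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)),
])
train_set = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
train_loader = DataLoader(train_set, batch_size=128, shuffle=True, num_workers=2)

model = torchvision.models.resnet18(num_classes=10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
criterion = torch.nn.CrossEntropyLoss()

model.train()
for inputs, targets in train_loader:
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    optimizer.step()
```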

Bug fixes

  • Fixed a bug in init of DenseNet-BC (PR #48)
  • Resolved checkpoint name conflicts (PR #49)

torchdistill - TrainingBox, PyTorch Hub, random split, pretrained models for CIFAR-10 and CIFAR-100 datasets

Published by yoshitomo-matsubara almost 4 years ago

New features

  • Added TrainingBox to train models without teachers (PR #39)
  • Supported PyTorch Hub in registry (PR #40)
  • Supported random splits, e.g., splitting a training dataset into training and validation sets (PR #41)
  • Added reimplemented models for CIFAR-10 and CIFAR-100 datasets (PR #41)
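The PyTorch Hub and random-split features map onto standard PyTorch calls; a minimal sketch (the torchvision hub entry, CIFAR-100 dataset, and 90/10 split ratio are just examples):

```python
import torch
import torchvision
from torch.utils.data import random_split

# Load a model through PyTorch Hub (any repo/entry point published via hubconf.py works).
# `weights=None` requires a recent torchvision; older versions use `pretrained=False`.
model = torch.hub.load('pytorch/vision', 'resnet18', weights=None)

# Randomly split a training dataset into training and validation subsets.
dataset = torchvision.datasets.CIFAR100(root='./data', train=True, download=True,
                                        transform=torchvision.transforms.ToTensor())
num_val = len(dataset) // 10
train_subset, val_subset = random_split(
    dataset, [len(dataset) - num_val, num_val],
    generator=torch.Generator().manual_seed(42))
```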

Pretrained models

Reference repositories were consulted for the training methods.

Note that there are some accuracy gaps between these models and those reported in their original studies.

Model                          CIFAR-10 (%)  CIFAR-100 (%)
ResNet-20                      91.92         N/A
ResNet-32                      93.03         N/A
ResNet-44                      93.20         N/A
ResNet-56                      93.57         N/A
ResNet-110                     93.50         N/A
WRN-40-4                       95.24         79.44
WRN-28-10                      95.53         81.27
WRN-16-8                       94.76         79.26
DenseNet-BC (k=12, depth=100)  95.53         77.14

torchdistill - Extended ForwardHookManager and bug fix

Published by yoshitomo-matsubara almost 4 years ago

  • Extended ForwardHookManager (Issue #32, PR #33)
  • Fixed bugs in the post_forward function caused by the gathering paradigm introduced for the I/O dict (Issue #34, PR #35)
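For context, ForwardHookManager gathers the inputs/outputs of selected modules into an I/O dict during a forward pass. Below is a minimal sketch along the lines of the documented usage (the torchvision model and the 'layer2' module path are just examples; the exact dict layout may differ across versions):

```python
import torch
from torchvision import models
from torchdistill.core.forward_hook import ForwardHookManager

device = torch.device('cpu')
model = models.resnet18()

# Register a hook that stores the input tensor of model.layer2.
forward_hook_manager = ForwardHookManager(device)
forward_hook_manager.add_hook(model, 'layer2', requires_input=True, requires_output=False)

x = torch.rand(4, 3, 224, 224)
model.eval()
with torch.no_grad():
    model(x)

# Pop the gathered I/O dict and read the stored intermediate representation.
io_dict = forward_hook_manager.pop_io_dict()
layer2_input = io_dict['layer2']['input']
print(layer2_input.shape)
```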

torchdistill - The first release of torchdistill

Published by yoshitomo-matsubara almost 4 years ago

torchdistill

The first release of torchdistill, with code and assets for "torchdistill: A Modular, Configuration-Driven Framework for Knowledge Distillation".