torchdistill

A coding-free framework built on PyTorch for reproducible deep learning studies. 🏆 25 knowledge distillation methods presented at CVPR, ICLR, ECCV, NeurIPS, ICCV, etc. are implemented so far. 🎁 Trained models, training logs, and configurations are available to ensure reproducibility and support benchmarking.

MIT License

Downloads: 1.6K | Stars: 1.4K | Committers: 3


torchdistill - A new KD method, new benchmark results, and updated YAML constructors (Latest Release)

Published by yoshitomo-matsubara 2 months ago

New method

  • Add KD with logits standardization (PR #460)
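
For context, logit standardization z-normalizes each logit vector (zero mean, unit variance across classes) before the usual temperature-scaled KL term. A minimal sketch of the idea, not torchdistill's registered module (whose names and options differ):

```python
import torch.nn.functional as F

def zscore(logits, eps=1e-7):
    # Standardize along the class dimension: zero mean, unit variance
    mean = logits.mean(dim=-1, keepdim=True)
    std = logits.std(dim=-1, keepdim=True)
    return (logits - mean) / (std + eps)

def kd_loss_with_logit_standardization(student_logits, teacher_logits, temperature=2.0):
    # Temperature-scaled KL divergence computed on standardized logits
    log_p_student = F.log_softmax(zscore(student_logits) / temperature, dim=-1)
    p_teacher = F.softmax(zscore(teacher_logits) / temperature, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction='batchmean') * temperature ** 2
```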

YAML configs

  • Fix the official config for SRD (Issue #471, PR #473)
  • Fix SRD config (Issue #471, PR #472)
  • Add os.path YAML constructors (PR #454)
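
A custom PyYAML constructor along these lines can expose os.path helpers inside configs; the `!join_path` tag below is hypothetical, and torchdistill's actual tag names may differ:

```python
import os
import yaml

def join_path_constructor(loader, node):
    # Build a filesystem path from a YAML sequence node via os.path.join
    return os.path.join(*loader.construct_sequence(node))

yaml.add_constructor('!join_path', join_path_constructor, Loader=yaml.FullLoader)

config = yaml.load('ckpt: !join_path [resource, ckpt, model.pt]', Loader=yaml.FullLoader)
print(config['ckpt'])  # resource/ckpt/model.pt on POSIX systems
```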

Logs

  • Disable an auto-configuration for def_logger (Issue #465, PR #469)
  • Use warning (PR #468)

Documentation

  • Add a new benchmark (PR #464)
  • Update Projects page (PRs #456, #475)

Misc

  • Update README (PRs #461, #470)
  • Update a URL (PR #459)
  • Update GH Action versions (PRs #457, #458)
  • Update CITATION (PR #455)
  • Add a new DOI badge (PR #453)
  • Update version (PRs #452, #479)
torchdistill - New KD methods, updated YAML constructors, and low-level loss support

Published by yoshitomo-matsubara 7 months ago

New methods

  • Add SRD method (PRs #436, #444, #446)
  • Add Knowledge Distillation from A Stronger Teacher (DIST) method (PR #433; see the sketch after this list)
  • Add Inter-Channel Correlation for Knowledge Distillation method (PR #432)
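
The DIST method referenced above replaces pointwise KL with Pearson correlation, matched per sample across classes (inter-class relation) and per class across the batch (intra-class relation). A rough sketch of that objective with illustrative names, not torchdistill's API:

```python
import torch
import torch.nn.functional as F

def pearson_corr(a, b, eps=1e-8):
    # Row-wise Pearson correlation between two equally shaped matrices
    a = a - a.mean(dim=-1, keepdim=True)
    b = b - b.mean(dim=-1, keepdim=True)
    return (a * b).sum(dim=-1) / (a.norm(dim=-1) * b.norm(dim=-1) + eps)

def dist_like_loss(student_logits, teacher_logits, temperature=1.0):
    p_s = F.softmax(student_logits / temperature, dim=-1)
    p_t = F.softmax(teacher_logits / temperature, dim=-1)
    inter = 1.0 - pearson_corr(p_s, p_t).mean()          # per sample, across classes
    intra = 1.0 - pearson_corr(p_s.t(), p_t.t()).mean()  # per class, across the batch
    return inter + intra
```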

YAML constructor

  • Update functions in yaml_util (PR #447)
  • Fix docstrings and add import_call_method & yaml constructor (PR #442)

Distillation/Training boxes

  • Enable auxiliary model wrapper builder to redesign input model (PR #437)

Registries

  • Add low-level registry and get functions (PR #426)
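
At its core, such a registry is a dict plus a decorator; a minimal sketch with hypothetical names (torchdistill's actual register/get functions differ in detail):

```python
# Hypothetical names for illustration only
LOW_LEVEL_DICT = {}

def register_low_level(key=None):
    def decorator(obj):
        LOW_LEVEL_DICT[key or obj.__name__] = obj  # explicit key, or fall back to __name__
        return obj
    return decorator

def get_low_level(key, *args, **kwargs):
    if key not in LOW_LEVEL_DICT:
        raise KeyError(f'`{key}` is not registered')  # fail loudly on unknown keys
    return LOW_LEVEL_DICT[key](*args, **kwargs)

@register_low_level('l1')
def l1_norm(values):
    return sum(abs(v) for v in values)

print(get_low_level('l1', [1, -2, 3]))  # 6
```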

Documentation

  • Update benchmarks (PR #435)
  • Fix a typo (PR #424)

Tests

  • Add a test case for import_call_method (PR #443)
  • Add import test (PR #441)

Misc

  • Update citation info (PRs #438, #439, #440)
  • Update publication links (PR #430)
  • Update version (PRs #425, #449, #451)
  • Update README (PRs #423, #434, #450)
  • Update image url (PR #421)
torchdistill - New generation with new features and documentation

Published by yoshitomo-matsubara 12 months ago

torchdistill v1.0.0 Release Notes

This major release supports PyTorch 2.0 and introduces many new features, documentation support, and breaking changes.

PyYAML configurations and executable scripts with torchdistill <= v0.3.3 should be considered "legacy" and are no longer supported by torchdistill >= v1.0.0. New PyYAML configurations and executable scripts are provided for the major release.

This release adds support for Python 3.10 and 3.11, and Python 3.7 is no longer supported.

Documentation

  • Update documents (PRs #400, #408)
  • Add docstrings (PRs #392, #393, #394, #395, #396, #397)
  • Add torchdistill logos (PRs #401, #402, #403)

Dependencies & Instantiation

  • Add getattr constructor (PR #325)
  • Make package arg optional (PR #322)
  • Enable dynamic module import/get/call (PR #319)
  • Add a function to import dependencies e.g., to register modules (PR #265)

Module registry

  • Add *args (PR #345)
  • Fix default value-related issues (PR #327)
  • No longer use lowered keys (PRs #326, #332)
  • Disable lowering by default (PR #323)
  • Rename type/name key (PR #312)
  • Rename registry dicts and arguments for registry key (PR #269)
  • Raise errors when requested module keys are not registered (PR #263)
  • Enable naming modules to be registered (PR #262)

Distillation/Training boxes

  • Remove default forward_proc for transparency (PR #417)
  • Rename a forward_proc function (PR #414)
  • Simplify (D)DP wrapper init (PR #410)
  • Change the timing to print model setup info (PR #335)
  • Add an option to specify find_unused_parameters for DDP (PR #334; see the sketch after this list)
  • Do not touch teacher model by default (PR #333)
  • Training box does not have to inherit nn.Module class (PR #317)
  • Add interfaces package to core (PR #310)
  • Update forward interfaces (PRs #307, #308)
  • Rename post_process to post_epoch_process for consistency (PR #306)
  • Consider CosineAnnealingWarmRestarts in default post-epoch process functions (PR #305)
  • Make some common procedures in training box registrable/replaceable (PR #304)
  • Introduce {pre,post}-{epoch,forward} processes and registries (PR #274)
  • Rename post_forward functions (PR #272)
  • Make loss a kwarg (PR #273)
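
As referenced above, a minimal sketch of the DDP option: find_unused_parameters=True tolerates submodules whose outputs never reach the loss (common when hooking intermediate layers for distillation), at the cost of an extra graph traversal per iteration:

```python
import torch
from torch.nn.parallel import DistributedDataParallel as DDP

def wrap_ddp(model: torch.nn.Module, local_rank: int, find_unused: bool = False):
    # Assumes torch.distributed has already been initialized by the launcher
    return DDP(model.to(local_rank), device_ids=[local_rank],
               find_unused_parameters=find_unused)
```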

Forward hooks

  • Fix initialization issues in IO dict for SELF_MODULE_PATH (PR #328)

Dataset modules

  • Redesign split_dataset and remove unused functions (PR #360)
  • Update CRD dataset wrapper (PR #352)
  • Fix a bug (PR #351)
  • Add default args and kwargs (PR #347)
  • Add get_dataset (PR #324)

Loss modules

  • Fix a typo (PRs #413, #415)
  • Add doc artifacts and an option to pass pre-instantiated loss module (PR #399)
  • Add DictLossWrapper (PR #337)
  • Rename an old function name (PR #309)
  • Rename single loss to middle-level loss (PR #300)
  • Explicitly define criterion wrapper (PR #298)
  • Change concepts of OrgLoss and org_term (PR #296)
  • Rename loss-related classes and functions (PR #294)
  • Add default forward process function and KDLoss back as a single loss (PR #275)
  • Remove org loss module and introduce self-module path (PR #271)

Model modules

  • Support parameter operations (Discussion #387, PR #388)
  • Replace pretrained with weights (PR #354)
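
The pretrained-to-weights change follows torchvision's own API migration; for example, with a recent torchvision:

```python
from torchvision import models
from torchvision.models import ResNet50_Weights

# Legacy (deprecated upstream): models.resnet50(pretrained=True)
model = models.resnet50(weights=ResNet50_Weights.IMAGENET1K_V1)
# weights=None yields a randomly initialized model, like pretrained=False
```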

Auxiliary model wrapper modules

  • Add find_unused_parameters arg (PR #340)
  • Rename special in configs to auxiliary_model_wrapper (PR #291)
  • Rename special module for clarity (PR #276)

Optimizer/Scheduler modules

  • Fix bugs around optimizer/scheduler (PR #358)
  • epoch arg is deprecated for some LR schedulers (PR #338)
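
On the deprecated epoch argument: recent PyTorch schedulers should be stepped without it and left to track their own progress. A short example with a stand-in model and optimizer:

```python
import torch
from torch.optim.lr_scheduler import CosineAnnealingWarmRestarts

model = torch.nn.Linear(8, 2)  # stand-in model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = CosineAnnealingWarmRestarts(optimizer, T_0=10)

for epoch in range(20):
    # ... train one epoch ...
    scheduler.step()  # not scheduler.step(epoch), which is deprecated
```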

Examples

  • Revert legacy file paths to non-legacy ones (PR #419)
  • Update kwargs and scripts (PR #382)
  • Update yaml util and sample configs (CIFAR-10, CIFAR-100) for the next major release (PR #361)
  • Update sample script and configs (GLUE) for the next major release (PR #259)
  • --log was replaced with --run_log (PR #350)
  • dst_ckpt should be used when using -test_only (PR #349)
  • Simplify the semantic segmentation script (PR #339)
  • Move hardcoded-torchvision-specific code to local custom package (PR #331)
  • Update world_size, cudnn configs, and checkpoint message (PR #330)
  • Rename log argument due to the (abstract) conflict with torchrun (PR #329)
  • Restructure examples and export some example-specific packages (PR #320)
  • Add an option to disable torch.backends.cudnn.benchmark (PR #316)
  • Support stage-wise loading/saving checkpoints (PR #315)
  • Support src_ckpt and dst_ckpt for initialization and saving checkpoints respectively (PR #314)
  • Use legacy configs and scripts tentatively (PRs #292, #295)
  • Add legacy examples and configs (PR #289)

Configs

  • Declare forward_proc explicitly (PR #416)
  • Add configs used in NLP-OSS 2023 paper (PR #407)
  • Fix value based on log (PR #284)
  • Update sample configs (ILSVRC 2012, COCO 2017, and PASCAL VOC 2012) for the next major release (PR #357)
  • Update official configs for the next major release (PR #355)
  • Merge single_/multi_stage directories (PR #346)
  • Rename variables (PR #344)
  • Rename "factor" "weight" (PR #302)
  • Restructure criterion (PR #301)
  • Consistently use "params" to indicate learnable parameters, not hyperparameters (PR #297)

Misc.

  • Add Google Analytics ID (PR #406)
  • Add sitemap.xml (PR #405)
  • Update timm repo (PR #375)
  • Add acknowledgments (PR #369)
  • Update file paths (PR #356)
  • Fix a typo and replace pretrained with weights (PR #353)
  • Remove the dict option as it is not intuitive for building transform(s) (PR #303)
  • Temporarily remove registry test (PR #293)
  • Add an important notice (PR #286)
  • Add read permission for content, following the new template (PR #284)
  • Refactor (PRs #268, #270, #283, #343)
  • Update README (PRs #252, #290, #299, #341, #342, #348, #364, #400, #409, #418)
  • Update versions (PRs #251, #391, #420)

Workflows

  • Add a GitHub Action for deploying Sphinx documentation (PR #404)
torchdistill - Updates, bug fixes, and end of apex support

Published by yoshitomo-matsubara almost 2 years ago

Updates in APIs/scripts

  • Add square-sized random crop option (PR #224)
  • Replace torch.no_grad() with torch.inference_mode() (PR #245; see the sketch after this list)
  • Terminate apex support due to its maintenance mode (PRs #248, #249)
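
As referenced above, torch.inference_mode() is a stricter, faster variant of torch.no_grad(), and PyTorch's native AMP covers the mixed-precision role apex used to play. A sketch assuming a CUDA device:

```python
import torch

model = torch.nn.Linear(8, 2).cuda()
images = torch.randn(4, 8, device='cuda')

# Stricter/faster replacement for torch.no_grad() at evaluation time
with torch.inference_mode():
    preds = model(images)

# Native AMP replaces apex.amp for mixed-precision training
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler()
with torch.cuda.amp.autocast():
    loss = model(images).sum()
scaler.scale(loss).backward()
scaler.step(optimizer)
scaler.update()
```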

Bug fixes

  • Add a default value (Discussion #229, PR #230)
  • Fix a bug raised in torchvision (PR #231)
  • Fix a default parameter (PR #235)

Misc.

  • Fix a typo (PR #232)
  • Update Travis (PR #236)
  • Update README (PRs #228, #238, #240)
  • Update versions (PRs #223, #250)
torchdistill - Minor bug fix and updates

Published by yoshitomo-matsubara over 2 years ago

Bug fix

  • Fix a potential bug in split_dataset (Issue #209, PR #210)

Misc.

  • Update GitHub workflow (PR #217)
  • Add local epoch for LambdaLR (PR #219)
  • Update versions (PRs #208, #220)
torchdistill - Minor updates

Published by yoshitomo-matsubara almost 3 years ago

Minor updates

  • Freeze module before rebuild if applicable (PR #205)
  • Refactor and improve result summary message (PR #206)
  • Update version (PRs #204, #207)
torchdistill - Bug fix

Published by yoshitomo-matsubara almost 3 years ago

Bug fix

  • strict should not be used here (PR #202)

Minor update

  • Update version (PRs #201, #203)
torchdistill - Example and minor updates

Published by yoshitomo-matsubara almost 3 years ago

Example updates

  • Restructure and make download=True (PR #190)
  • Make log_freq configurable for test (PR #191)
  • Refactor (PR #192)
  • Probably torch.cuda.synchronize() is no longer needed (PR #194)
  • Add an option to use teacher output (PR #195)
  • Replace no_grad with inference_mode (PR #199)

Minor updates

  • Add strict arg (PR #193)
  • Add assert error message (PR #196)
  • Check if ckpt file path is string (PR #197)
  • Check if batch images are instance of Tensor (PR #198)
  • Update version (PRs #189, #200)
torchdistill - Add new features, PASCAL examples and pretrained models

Published by yoshitomo-matsubara almost 3 years ago

New features

  • Add wrapped resize to enable specifying interpolation for resize (PR #182)
  • Add wrapped random crop resize to enable specifying interpolation for random crop resize (PR #183)
  • Enable loading a ckpt that contains only a specific module, as well as loading via URL (PR #187)

New examples and trained models

  • Add examples for PASCAL VOC 2012 (PRs #184, #186)
  • Update README (PR #185)
  • Add model weights of DeepLabv3 with ResNet-50/101 fine-tuned on PASCAL VOC 2012 (Segmentation):

    Model                      mean IoU   global pixelwise acc
    DeepLabv3 w/ ResNet-50     80.6       95.7
    DeepLabv3 w/ ResNet-101    82.4       96.2

Model implementations are available in torchvision. These model weights are originally pretrained on COCO 2017 dataset (available in torchvision) and then fine-tuned on PASCAL VOC 2012 (Segmentation) dataset.
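
For example, the COCO-pretrained starting point can be pulled straight from torchvision (a recent torchvision with the weights enum is assumed); the VOC-fine-tuned weights from this release are loaded from the published checkpoint instead:

```python
from torchvision.models.segmentation import (DeepLabV3_ResNet50_Weights,
                                             deeplabv3_resnet50)

model = deeplabv3_resnet50(weights=DeepLabV3_ResNet50_Weights.COCO_WITH_VOC_LABELS_V1)
model.eval()
```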

Minor updates

  • Add a version constant (PR #175)
  • Rename and add functions for ResNet-50 and ResNet-101 (PR #176)
  • Add CITATION file (PR #178)
  • Update version (PRs #174, #188)
  • Update README (PRs #179, #180)
torchdistill - Minor updates and bug fix to support PyTorch v1.10

Published by yoshitomo-matsubara almost 3 years ago

Minor updates

  • Update version (PRs #161, #162, #173)
  • Update README (PRs #163, #164)
  • Add an option to log config (PR #169)

Bug fix

  • In PyTorch v1.10, load_state_dict_from_url is no longer available in torchvision.models.utils (PR #172)
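
A version-tolerant import along these lines keeps code working across the move:

```python
try:
    # torchvision bundled with PyTorch < 1.10
    from torchvision.models.utils import load_state_dict_from_url
except ImportError:
    # PyTorch >= 1.10: the helper lives in torch.hub
    from torch.hub import load_state_dict_from_url
```
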
torchdistill - Add KTAAD method and improve examples

Published by yoshitomo-matsubara about 3 years ago

New method

  • Add knowledge translation and adaptation + affinity distillation for semantic segmentation (PR #158)

Minor updates

  • Update version (PRs #151, #160)
  • Update README (PRs #153, #159)
  • Stop training when facing NaN or Infinity (PR #157)
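
On the NaN/Infinity guard: a check like the following sketch (names illustrative) stops training before a diverged loss can poison the checkpoint:

```python
import math

def backward_if_finite(loss):
    # Abort instead of silently saving NaN/Inf-corrupted weights
    if not math.isfinite(loss.item()):
        raise RuntimeError(f'Loss diverged to {loss.item()}; stopping training')
    loss.backward()
```
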
torchdistill - Add knowledge review method and new features

Published by yoshitomo-matsubara about 3 years ago

New method

  • Add knowledge review method (PRs #141, #145, #146)

The experimental result shown in README.md can be reproduced with this yaml file.
The log and checkpoint file (including student model weights) are provided as part of Assets below.

New features

  • Make nn.ModuleList hookable (PR #139)
  • Support negative index in module path (PR #144)
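
Both features build on PyTorch forward hooks; a minimal sketch with a negative index into an nn.ModuleList (illustrative names, not torchdistill's io_dict machinery):

```python
import torch
from torch import nn

captured = {}

def keep_output(name):
    def hook(module, inputs, outputs):
        captured[name] = outputs  # stash the output for a distillation loss
    return hook

blocks = nn.ModuleList([nn.Linear(8, 8), nn.Linear(8, 4)])
blocks[-1].register_forward_hook(keep_output('last_block'))  # negative index from the end
_ = blocks[-1](blocks[0](torch.randn(2, 8)))
print(captured['last_block'].shape)  # torch.Size([2, 4])
```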

Minor updates

  • Update version (PRs #137, #148)
  • Update README (PRs #138, #147)
  • Fix a typo (PR #142)
torchdistill - Minor updates and potential bug fix in package

Published by yoshitomo-matsubara over 3 years ago

Minor updates

  • Update version (PRs #128, #136)
  • Fix a typo (PR #130)
  • Make pin_memory configurable (PR #134)

Bug fix

  • Clear io_dict in pre-process (Issue #132, PR #135)
torchdistill - Bug fixes in package

Published by yoshitomo-matsubara over 3 years ago

Bug fixes

  • DistributedDataParallel is no longer allowed for wrapping models with no updatable parameters (Issue #122, PRs #124, #125)
  • Fix a bug in detecting collate function type (Issue #123, PR #126)

Misc

  • Update version (PR #127)
torchdistill - Update examples and support PyTorch v1.9.0

Published by yoshitomo-matsubara over 3 years ago

Examples

  • Improve log format (PR #111)
  • Tune hyperparameters for GLUE tasks (PRs #112, #113)
  • Add sample KD configs for GLUE tasks (PR #114)

Misc

  • Update notebooks (PR #115)
  • Update README (PR #116)
  • Update version (PRs #118, #120)
  • Support PyTorch v1.9.0 (PR #119)
torchdistill - Update HF support, examples and notebooks

Published by yoshitomo-matsubara over 3 years ago

Examples

  • Update GLUE example (PRs #97, #98, #99, #104, #106, #108)
  • Enable test prediction to make a submission for GLUE leaderboard (PR #102)
  • Add notebook (PRs #105, #109)

Bug fixes

  • Provide kwargs (PR #94)
  • Enable teacher to run in fp16 mode (PR #110)

Minor updates

  • Update README (PRs #93, #101, #102, #103, #107, #108)
  • Refactor / Fix typos (PRs #95, #96, #100, #101, #104)
torchdistill - Support Hugging Face Transformers and Accelerate

Published by yoshitomo-matsubara over 3 years ago

New features and example

  • Introduce Hugging Face's Accelerate to better collaborate with their Transformers package (PR #91)
  • Introduce example of text classification (GLUE tasks) with Hugging Face's Transformers and datasets (PR #92)
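
A minimal Accelerate training loop looks like the following sketch, with stand-in model, optimizer, and data:

```python
import torch
from accelerate import Accelerator

model = torch.nn.Linear(8, 2)  # stand-in model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
dataset = torch.utils.data.TensorDataset(torch.randn(32, 8), torch.randint(0, 2, (32,)))
train_loader = torch.utils.data.DataLoader(dataset, batch_size=8)

accelerator = Accelerator()  # handles device placement, DDP, and mixed precision
model, optimizer, train_loader = accelerator.prepare(model, optimizer, train_loader)

for inputs, targets in train_loader:
    loss = torch.nn.functional.cross_entropy(model(inputs), targets)
    accelerator.backward(loss)  # replaces loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```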

Minor updates

  • Update README (PR #93)
  • Allow non-function collator and make filtering optimizer's params optional (PR #89)
torchdistill - Example updates

Published by yoshitomo-matsubara over 3 years ago

Example updates

  • Add an example to show how to import models via PyTorch Hub (PR #83; see the sketch after this list)
  • Add an option to set random seed for reproducibility (PR #85)
  • Add an example of segmentation model training (PR #86)
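
As referenced above, a sketch of seeding for reproducibility plus a PyTorch Hub model import (a recent torchvision hubconf is assumed):

```python
import random
import numpy as np
import torch

def set_seed(seed: int):
    # Best-effort reproducibility across Python, NumPy, and PyTorch RNGs
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)

set_seed(42)
# Import a model through PyTorch Hub instead of a hardcoded torchvision import
model = torch.hub.load('pytorch/vision', 'resnet18', weights=None)
```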

Restructuring

  • Refactor function util (PR #84)

Typo fixes

  • Fix typos in dataset util and examples (PR #88)
torchdistill - Minor updates and bug fixes

Published by yoshitomo-matsubara over 3 years ago

Minor updates

  • Make IoU type selection model-free (PR #74)
  • Update loss string (PR #74)
  • Disable DDP when no params are updatable (PR #77)
  • Update README (PR #78)

Bug/Typo fixes

  • Fix typos in example commands (PR #76)
  • Fix typos in sample configs (PR #79)
  • Fix bugs for clip grad norm (PR #80)
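
On the gradient-clipping fix: clipping belongs after backward() and before optimizer.step(); applying it elsewhere is a common source of bugs. A minimal sketch:

```python
import torch

model = torch.nn.Linear(8, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

loss = model(torch.randn(4, 8)).sum()
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # after backward, before step
optimizer.step()
```
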
torchdistill - Minor updates and bug fixes

Published by yoshitomo-matsubara over 3 years ago

Minor updates

  • Update functions for object detection models (PR #59)
  • Update README (PRs #61, #62)

Minor bug fixes

  • Rename (PR #60)
  • Bug fixes (PR #73)