Sublinear memory optimization for deep learning. https://arxiv.org/abs/1604.06174
MIT License
Make huge neural nets fit in memory
Explore training for quantized models
DropNeuron: Simplifying the Structure of Deep Neural Networks
Implementation of Bottleneck Transformer in PyTorch
A simple but robust PyTorch implementation of RetNet from "Retentive Network: A Successor to Tran...
The Incredible PyTorch: a curated list of tutorials, papers, projects, communities and more relat...
Code snippets created for the PyTorch discussion board
Reproduction of MobileNetV2 using MXNet
Model analyzer in PyTorch
(Unofficial) Implementation of dilated attention from "LongNet: Scaling Transformers to 1,000,000...
View model summaries in PyTorch!
Hybrid Discriminative-Generative Training via Contrastive Learning
My best practices for training on large datasets using PyTorch.
Minimalistic large language model 3D-parallelism training
Implementation of Memformer, a Memory-augmented Transformer, in PyTorch