fairscale

PyTorch extensions for high performance and large scale training.

License: Other
Stars: 3.2K
Committers: 70


fairscale -

Published by min-xu-ai over 3 years ago

fairscale - v0.3.0

Published by blefaudeux over 3 years ago

[0.3.0] - 2021-02-22

Added

  • FullyShardedDataParallel (FSDP) (#413)
  • ShardedDDP fp16 grad reduction option (#402)
  • Expose experimental algorithms within the pip package (#410)
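The fp16 grad reduction option listed above trades a little precision for halved communication volume: gradients are cast to fp16 for the reduction and back to fp32 afterwards. A toy simulation of that round trip, using the `struct` module's half-precision format (`fp16_compress` is a hypothetical helper for illustration, not the fairscale API):

```python
import struct

def fp16_compress(grads):
    """Simulate an fp16 grad-reduction wire format: round each fp32
    gradient through half precision, as if it had been cast to fp16
    for the all-reduce and cast back afterwards."""
    return [struct.unpack('e', struct.pack('e', g))[0] for g in grads]
```

Values exactly representable in fp16 (e.g. 0.5) survive unchanged; others pick up a small rounding error, which is the precision cost of the option.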

Fixed

  • Handle the corner case where the model is too small relative to the world size and some shards are empty (#406)
  • Memory leak in checkpoint_wrapper (#412)
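The empty-shard corner case above comes from even, padded sharding: when there are more ranks than parameters, some ranks receive padding only. A minimal sketch of the arithmetic (`shard_flat_params` is a hypothetical helper, not the fairscale API):

```python
def shard_flat_params(params, world_size):
    """Evenly shard a flat parameter list across ranks, padding the
    tail so every rank gets an equal-sized shard. With a tiny model
    and a large world size, high ranks end up with padding only."""
    shard_size = -(-len(params) // world_size)  # ceiling division
    padded = params + [0.0] * (shard_size * world_size - len(params))
    return [padded[r * shard_size:(r + 1) * shard_size]
            for r in range(world_size)]
```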
fairscale - v0.1.7

Published by blefaudeux over 3 years ago

Fixed

  • ShardedDDP and OSS handle model trainability changes during training (#369)
  • ShardedDDP state dict load/save bug (#386)
  • ShardedDDP handle train/eval modes (#393)
  • AdaScale handling custom scaling factors (#401)

Added

  • ShardedDDP manual reduce option for checkpointing (#389)
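The trainability fix above (#369) boils down to not caching which parameters take part in gradient reduction: if layers are frozen or unfrozen mid-training, the reduction list must be rebuilt. A sketch over `(name, requires_grad)` pairs (hypothetical helper, not the fairscale API):

```python
def build_reduce_list(named_params):
    """Recompute the set of parameters participating in gradient
    reduction from the current requires_grad flags, rather than
    reusing a list captured once at wrap time."""
    return sorted(name for name, requires_grad in named_params
                  if requires_grad)
```

Calling this again after freezing a layer yields the updated list, which is the behavior the fix restores.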
fairscale - v0.1.6

Published by blefaudeux over 3 years ago

Added

  • Checkpointing model wrapper (#376)
  • Faster OSS, flatbuffers (#371)
  • Small speedup in OSS clip_grad_norm (#363)

Fixed

  • Bug in ShardedDDP with 0.1.5, depending on initialization order (KeyError / OSS)
  • Major refactoring of Pipe (#357, #358, #360, #362, #370, #373)
  • Better pip integration / use the resident (already-installed) PyTorch (#375)
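The checkpointing model wrapper added above (#376) implements activation checkpointing: keep only a block's input during the forward pass and recompute the intermediate activations during the backward pass, trading compute for memory. A toy, framework-free illustration of the idea (not the fairscale `checkpoint_wrapper` implementation):

```python
class CheckpointedBlock:
    """Toy activation checkpointing: the forward pass saves only the
    block input (activations are dropped), and the backward pass
    reruns the function to recover them on demand."""

    def __init__(self, fn):
        self.fn = fn
        self.saved_input = None

    def forward(self, x):
        self.saved_input = x   # keep the input, drop activations
        return self.fn(x)

    def recompute(self):
        # Rerun the forward computation when the backward pass needs it.
        return self.fn(self.saved_input)
```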
fairscale - v0.1.5

Published by blefaudeux over 3 years ago

Added

  • PyTorch compatibility for OSS checkpoints (#310)
  • Elastic checkpoints for OSS; the world size can vary between save and load (#310)
  • Tensor views for OSS bucketing, reduced CPU use (#300)
  • Bucket calls in ShardedDDP, for faster inter-node communication (#327)
  • FlattenParamWrapper, which flattens module parameters into a single tensor seamlessly (#317)
  • AMPnet experimental support (#304)

Fixed

  • ShardedDDP properly handles device changes via .to() (#353)
  • Add a new interface for AdaScale, AdaScaleWrapper, which makes it compatible with OSS (#347)
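The elastic checkpoints above (#310) rest on a simple idea: merge the per-rank optimizer state shards into one consolidated, PyTorch-style dict at save time, so it can be re-partitioned onto any number of ranks at load time. A sketch with state dicts keyed by global parameter index (hypothetical helpers, not the fairscale API):

```python
def consolidate(shard_states):
    """Merge per-rank optimizer state shards into one dict keyed by
    global parameter index, the save-time half of an elastic checkpoint."""
    merged = {}
    for shard in shard_states:
        merged.update(shard)
    return merged

def reshard(merged, world_size):
    """Round-robin the consolidated state back onto a (possibly
    different) number of ranks, the load-time half."""
    shards = [{} for _ in range(world_size)]
    for idx in sorted(merged):
        shards[idx % world_size][idx] = merged[idx]
    return shards
```

Saving from 2 ranks and loading onto 3 (or vice versa) works because the consolidated dict is rank-agnostic.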
fairscale - v0.1.4

Published by blefaudeux almost 4 years ago

Fixed

  • Missing cu files in the pip package
fairscale - v0.1.3

Published by blefaudeux almost 4 years ago

Same as 0.1.2, but with the correct version number in the source code (see __init__.py)

fairscale - v0.1.2

Published by blefaudeux almost 4 years ago

Added

  • AdaScale: Added gradient accumulation feature (#202)
  • AdaScale: Added support for torch.optim.lr_scheduler (#229)

Fixed

  • AdaScale: smoothing factor value fixed when using gradient accumulation (#235)
  • Pipe: documentation on balancing functions (#243)
  • ShardedDDP: handle typical NLP models
  • ShardedDDP: better partitioning when fine-tuning
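The gradient accumulation feature above (#202) lets AdaScale treat several micro-batches as one large batch: gradients are summed across micro-batches and the optimizer steps only once per accumulation window. A minimal scalar sketch (`accumulate_gradients` is a hypothetical helper, not the AdaScale API):

```python
def accumulate_gradients(micro_grads, accumulation_steps, step_fn):
    """Sum gradients over `accumulation_steps` micro-batches, then
    call the optimizer step with their average. Returns the number
    of optimizer steps taken."""
    acc, steps = 0.0, 0
    for i, grad in enumerate(micro_grads, 1):
        acc += grad
        if i % accumulation_steps == 0:
            step_fn(acc / accumulation_steps)  # step on the averaged grad
            acc, steps = 0.0, steps + 1
    return steps
```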
fairscale - v0.1.1

Published by msbaines almost 4 years ago

Fixed

  • make sure pip package includes header files (#221)
fairscale - v0.1.0

Published by msbaines almost 4 years ago

Added

  • ShardedDataParallel with autoreduce (#157)
  • CPU support for Pipe (#188)
  • ShardedOptim: Distributed Grad Scaler (for torch AMP) (#182)
  • OSS-aware clip grads, bridge sharded states (#167)
  • oss: add rank_local_state_dict staticmethod (#174)
  • support for PyTorch 1.7.0 (#171)
  • Add implementation of AdaScale (#139)

Fixed

  • pip package install (#196, #200)
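Optimizer state sharding of the kind OSS introduces needs a size-balanced assignment of parameters to ranks, since each rank keeps optimizer state only for its own partition. A common approach is greedy bin-packing: sort parameters by size and repeatedly give the next one to the least-loaded rank. A sketch of that strategy (illustrative, not the fairscale partitioning code):

```python
import heapq

def partition_params(param_sizes, world_size):
    """Greedily assign parameter indices to ranks so that the total
    size per rank stays balanced: largest parameters first, each one
    going to the currently least-loaded rank."""
    heap = [(0, rank, []) for rank in range(world_size)]
    heapq.heapify(heap)
    for idx, size in sorted(enumerate(param_sizes), key=lambda p: -p[1]):
        load, rank, members = heapq.heappop(heap)  # least-loaded rank
        members.append(idx)
        heapq.heappush(heap, (load + size, rank, members))
    return {rank: sorted(members) for _, rank, members in heap}
```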
fairscale - v0.0.3

Published by msbaines almost 4 years ago

Added

  • multi-process pipe (#90)

Fixed

  • OSS+apex fix (#136)
  • MegaTron+OSS DDP fix (#121)
fairscale - v0.0.2

Published by msbaines almost 4 years ago

Added

  • Add a DDP variant that works with OSS, using reduce() instead of all_reduce() (#19)
  • support for PyTorch v1.6
  • add mixed precision Adam (#40)
  • Adam optimizer state scaling (#44)

Fixed

  • properly restore a sharded optim state (#39)
  • OSS restore state to proper device (#46)
  • optim/oss: support optimizers with additional step kwargs (#53)
  • optim/oss: fix state cast (#56)
  • fix eval for oss_ddp (#55)
  • optim/oss: work correctly with LRScheduler (#58)
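The reduce()-not-all_reduce() design in #19 above works because, with sharded optimizer state, only the rank that owns a parameter needs its summed gradient; broadcasting the sum to every rank would be wasted bandwidth. A toy single-process illustration (hypothetical helper, not a real collective):

```python
def reduce_to_owner(per_rank_grads, owner):
    """Model a reduce() collective for one parameter: sum the
    per-rank gradient contributions and deliver the result only to
    the owning rank; other ranks receive nothing."""
    total = sum(per_rank_grads)
    return {rank: (total if rank == owner else None)
            for rank in range(len(per_rank_grads))}
```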
fairscale - v0.0.1

Published by msbaines about 4 years ago

  • Initial release.