grokfast-pytorch

Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients"

MIT License

Downloads

180

Ecosystems: Python

Commit Statistics

Avg. Commits Per Committer

20.0

Issue Statistics

Time to Close Issues

N/A

Package Rankings

Top 35.74% on PyPI.org

Related Projects

meshgpt-pytorch

Implementation of MeshGPT, SOTA Mesh generation using Attention, in Pytorch

29 Nov 2023 642

byol-pytorch

Usable Implementation of "Bootstrap Your Own Latent" self-supervised learning, from Deepmind, in ...

16 Jun 2020 1,687

PaLM-rlhf-pytorch

Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architectu...

09 Dec 2022 7,595

progen

Implementation and replication of ProGen, Language Modeling for Protein Generation, in Jax

09 Jun 2021 109

accelerate

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed conf...

30 Oct 2020 7,759

phasic-policy-gradient

An implementation of Phasic Policy Gradient, a proposed improvement of Proximal Policy Gradients,...

27 Sep 2020 42

lion-pytorch

🦁 Lion, new optimizer discovered by Google Brain using genetic algorithms that is purportedly bet...

15 Feb 2023 2,018

iTransformer

Unofficial implementation of iTransformer - SOTA Time Series Forecasting using Attention networks...

11 Oct 2023 429

adam-atan2-pytorch

Implementation of the proposed Adam-atan2 from Google Deepmind in Pytorch

30 Jul 2024 91

textgrad

TextGrad: Automatic ''Differentiation'' via Text -- using large language models to backpropagate ...

11 Jun 2024 1,603

q-transformer

Implementation of Q-Transformer, Scalable Offline Reinforcement Learning via Autoregressive Q-Fun...

20 Sep 2023 338

GaLore

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

07 Mar 2024 1,179

CLsurvey

Continual Hyperparameter Selection Framework. Compares 11 state-of-the-art Lifelong Learning meth...

06 Apr 2020 192

gigagan-pytorch

Implementation of GigaGAN, new SOTA GAN out of Adobe. Culmination of nearly a decade of research ...

10 Mar 2023 1,813

GLM

GLM (General Language Model)

18 Mar 2021 3,170