Implementation of the proposed Adam-atan2 from Google DeepMind in PyTorch
Explorations into the recently proposed Taylor Series Linear Attention
Adversarially Learned Inference in PyTorch
AdamW optimizer for bfloat16 models in PyTorch 🔥.
Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow ...
Implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the PaLM architectu...
Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling wit...
Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in Py...
Unofficial implementation of iTransformer - SOTA Time Series Forecasting using Attention networks...
Implementation of Soft Actor-Critic and some of its improvements in PyTorch
An implementation of Phasic Policy Gradient, a proposed improvement of Proximal Policy Optimization,...
DALL·E Mini - Generate images from a text prompt
Implementation of the Adan (ADAptive Nesterov momentum algorithm) Optimizer in PyTorch
🦁 Lion, new optimizer discovered by Google Brain using genetic algorithms that is purportedly bet...
Fast and flexible AutoML with learning guarantees.