Phil Wang

Working with Attention. It's all we need

Ecosystems: Python, PyTorch, Crystal, CUDA

Projects

med-seg-diff-pytorch

Implementation of MedSegDiff in PyTorch - SOTA medical segmentation using DDPM and filtering of features in Fourier space

Python - Released: 23 Nov 2022 - 211

routing-transformer

Fully featured implementation of Routing Transformer

Python - Released: 22 May 2020 - 283

perfusion-pytorch

Implementation of Key-Locked Rank One Editing, from Nvidia AI

Python - Released: 06 Aug 2023 - 228

grokfast-pytorch

Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients"

Python - Released: 15 Jun 2024 - 83

flash-attention-jax

Implementation of Flash Attention in JAX

Python - Released: 12 Jul 2022 - 191

Adan-pytorch

Implementation of the Adan (ADAptive Nesterov momentum algorithm) Optimizer in PyTorch

Python - Released: 25 Aug 2022 - 247

rvq-vae-gpt

My attempts at applying the SoundStream design to learned tokenization of text, and then applying hierarchical attention to text generation

Python - Released: 30 Jan 2023 - 79

transformer-in-transformer

Implementation of Transformer in Transformer, pixel level attention paired with patch level attention for image classification, in PyTorch

Python - Released: 02 Mar 2021 - 300

frame-averaging-pytorch

PyTorch implementation of a simple way to enable (Stochastic) Frame Averaging for any network

Python - Released: 03 Jun 2024 - 46

triton-transformer

Implementation of a Transformer, but completely in Triton

Python - Released: 08 Sep 2021 - 243

electra-pytorch

A simple and working implementation of Electra, the fastest way to pretrain language models from scratch, in PyTorch

Python - Released: 04 Aug 2020 - 221

jax2torch

Use JAX functions in PyTorch

Python - Released: 26 Oct 2021 - 224

pixel-level-contrastive-learning

Implementation of Pixel-level Contrastive Learning, proposed in the paper "Propagate Yourself", in PyTorch

Python - Released: 20 Nov 2020 - 252

sinkhorn-transformer

Sinkhorn Transformer - Practical implementation of Sparse Sinkhorn Attention

Python - Released: 03 Apr 2020 - 253

metnet3-pytorch

Implementation of MetNet-3, SOTA neural weather model out of Google DeepMind, in PyTorch

Python - Released: 04 Nov 2023 - 198

speculative-decoding

Explorations into some recent techniques surrounding speculative decoding

Python - Released: 27 Aug 2023 - 199

Mega-pytorch

Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena

Python - Released: 23 Sep 2022 - 203

medical-chatgpt

Implementation of ChatGPT, but tailored towards primary care medicine, with the reward being the ability to collect patient histories in a thorough and efficient manner and come up with a reasonable differential diagnosis

Python - Released: 10 Dec 2022 - 313

nystrom-attention

Implementation of Nyström Self-attention, from the paper Nyströmformer

Python - Released: 11 Feb 2021 - 121

h-transformer-1d

Implementation of H-Transformer-1D, Hierarchical Attention for Sequence Learning

Python - Released: 28 Jul 2021 - 153

ETSformer-pytorch

Implementation of ETSformer, state of the art time-series Transformer, in PyTorch

Python - Released: 05 Feb 2022 - 147

diffusion-policy

Implementation of Diffusion Policy, Toyota Research's supposed breakthrough in leveraging DDPMs for learning policies for real-world Robotics

Python - Released: 20 Sep 2023 - 92

light-recurrent-unit-pytorch

Implementation of a Light Recurrent Unit in PyTorch

Python - Released: 29 Aug 2024 - 45

PaLM-jax

Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling with Pathways - in JAX (Equinox framework)

Python - Released: 08 Apr 2022 - 184

graph-transformer-pytorch

Implementation of Graph Transformer in PyTorch, for potential use in replicating AlphaFold2

Python - Released: 18 Jun 2021 - 197