Implementation of STAM (Space Time Attention Model), a pure and simple attention model that reaches SOTA for video classification
Implementation of Feedback Transformer in Pytorch
Implementation of the Point Transformer layer, in Pytorch
Implementation of various self-attention mechanisms focused on computer vision. Ongoing repository.
An attempt at the implementation of Glom, Geoffrey Hinton's new idea that integrates concepts from ...
Implementation of Agent Attention in Pytorch
Implementation of Bottleneck Transformer in Pytorch
Visual Attention based OCR
Implementation of Make-A-Video, new SOTA text to video generator from Meta AI, in Pytorch
Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in Pytorch
Implementation of Transformer in Transformer, pixel level attention paired with patch level attention ...
Implementation of Video Diffusion Models, Jonathan Ho's new paper extending DDPMs to Video Generation ...
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with ...
Implementation of MagViT2 Tokenizer in Pytorch
Implementation of TimeSformer from Facebook AI, a pure attention-based solution for video classification ...
Implementation of Hourglass Transformer, in Pytorch, from Google and OpenAI
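The common primitive behind every repository above is scaled dot-product attention. A minimal NumPy sketch of that core operation (illustrative only, not code taken from any of the listed repos):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(q, k, v):
    # q, k, v: (seq_len, dim). Each output row is a weighted
    # average of the value rows, weighted by query-key similarity.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

rng = np.random.default_rng(0)
q = rng.standard_normal((4, 8))
k = rng.standard_normal((4, 8))
v = rng.standard_normal((4, 8))
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # (4, 8)
```

The individual repositories differ in how queries, keys, and values are produced (image patches, video frames, point clouds, bytes) and in how the attention pattern is restricted or factorized, but all reduce to this computation.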