transframer-pytorch

Implementation of Transframer, DeepMind's U-net + Transformer architecture for generating videos of up to 30 seconds, in PyTorch

MIT License

Transframer - Pytorch (wip)

The gist of the paper is the use of a U-net as a multi-frame encoder, together with a regular transformer decoder that cross-attends to the encoded frames and predicts the rest of the frames. The lead author builds on his prior work, in which images are encoded as sparse discrete cosine transform (DCT) sequences.
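To make the "sparse DCT sequence" idea concrete, here is a minimal, dependency-free sketch. It is not the code in this repository, nor the exact encoding used in the paper. The 8x8 block size, the quantization step of 10, and the `(row, col, value)` triple layout are all assumptions chosen for illustration. It transforms one image block with an orthonormal 2D DCT, quantizes the coefficients, and keeps only the nonzero ones as a short sequence.

```python
import math

BLOCK = 8  # assumed block size for illustration; not taken from the paper


def dct_matrix(n=BLOCK):
    """Orthonormal DCT-II basis: row k holds the k-th cosine basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def dct2(block):
    """2D DCT of a square block, computed as D @ block @ D^T."""
    d = dct_matrix(len(block))
    return matmul(matmul(d, block), transpose(d))


def idct2(coeffs):
    """Inverse 2D DCT, computed as D^T @ coeffs @ D."""
    d = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(d), coeffs), d)


def to_sparse_sequence(coeffs, step=10.0):
    """Quantize and keep only nonzero coefficients as (row, col, value).

    The triple layout and the step size are hypothetical choices.
    """
    seq = []
    for r, row in enumerate(coeffs):
        for c, v in enumerate(row):
            q = round(v / step)
            if q != 0:
                seq.append((r, c, q))
    return seq


def from_sparse_sequence(seq, n=BLOCK, step=10.0):
    """Rebuild a dense coefficient block from the sparse triples."""
    coeffs = [[0.0] * n for _ in range(n)]
    for r, c, q in seq:
        coeffs[r][c] = q * step
    return coeffs


# A smooth 8x8 gradient block: most of its DCT coefficients are near zero.
block = [[float(8 * r + 4 * c) for c in range(BLOCK)] for r in range(BLOCK)]
sequence = to_sparse_sequence(dct2(block))
recon = idct2(from_sparse_sequence(sequence))
max_err = max(abs(a - b) for ra, rb in zip(block, recon) for a, b in zip(ra, rb))
print(len(sequence), "of", BLOCK * BLOCK, "coefficients kept, max abs error", round(max_err, 2))
```

Smooth blocks collapse to a handful of triples, which is what makes the representation attractive as a sequence for a transformer. A real implementation would work on batched tensors in PyTorch rather than nested lists.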

I will deviate from the paper's implementation by using a hierarchical autoregressive transformer, and just a regular ResNet block in place of the NF-net block (the paper's choice of NF-net is just DeepMind reusing their own code, as NF-net was developed at DeepMind by Brock et al.).

Update: On further reflection, there is nothing new in this paper except for generative modeling on DCT representations.

Appreciation

This work would not be possible without the generous sponsorship of Stability AI, as well as my other sponsors.

Todo

figure out whether DCT coefficients can be extracted directly from images in JPEG format

Citations

@article{Nash2022TransframerAF,
    title   = {Transframer: Arbitrary Frame Prediction with Generative Models},
    author  = {Charlie Nash and Jo{\~a}o Carreira and Jacob Walker and Iain Barr and Andrew Jaegle and Mateusz Malinowski and Peter W. Battaglia},
    journal = {ArXiv},
    year    = {2022},
    volume  = {abs/2203.09494}
}
