Axial Positional Embedding for Pytorch
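A minimal sketch of the idea in plain PyTorch (an illustration, not the library's actual API; the class name and shapes here are hypothetical): the maximum sequence length is factored into two axes, so the parameter count drops from O(max_len · dim) to O((h + w) · dim).

```python
import torch
import torch.nn as nn

class AxialPositionalEmbedding(nn.Module):
    # illustrative sketch, not the library's real interface
    def __init__(self, dim, axial_shape=(64, 64)):  # 64 * 64 = 4096 max positions
        super().__init__()
        h, w = axial_shape
        self.axial_shape = axial_shape
        # two small parameter tables instead of one table of size h * w
        self.row_emb = nn.Parameter(torch.randn(h, 1, dim) * 0.02)
        self.col_emb = nn.Parameter(torch.randn(1, w, dim) * 0.02)

    def forward(self, x):                # x: (batch, seq_len, dim)
        h, w = self.axial_shape
        # broadcast-sum the two axes, then flatten back into a sequence
        pos = (self.row_emb + self.col_emb).reshape(h * w, -1)
        return x + pos[: x.shape[1]]

x = torch.randn(2, 1024, 512)
emb = AxialPositionalEmbedding(512, (64, 64))
print(emb(x).shape)  # torch.Size([2, 1024, 512])
```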
Sinkhorn Transformer - Practical implementation of Sparse Sinkhorn Attention
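A hedged sketch of the core mechanism, under the simplifying assumptions of a single head and non-causal attention (function names are illustrative, not the repo's API): key/value blocks are softly re-sorted by a Sinkhorn-normalized matching matrix, and each query block attends only to its matched block.

```python
import torch

def sinkhorn(logits, iters=8):
    # alternating row/column normalization in log space -> doubly stochastic
    for _ in range(iters):
        logits = logits - logits.logsumexp(dim=-1, keepdim=True)
        logits = logits - logits.logsumexp(dim=-2, keepdim=True)
    return logits.exp()

def sinkhorn_attention(q, k, v, block_size=64, iters=8):
    b, n, d = q.shape
    nb = n // block_size                      # assume n divisible by block_size
    blk = lambda t: t.reshape(b, nb, block_size, d)
    q, k, v = map(blk, (q, k, v))

    # block summaries decide which key block each query block should see
    summaries = k.mean(dim=2)                 # (b, nb, d)
    match = torch.einsum('bid,bjd->bij', summaries, summaries)
    perm = sinkhorn(match, iters)             # soft permutation over blocks

    # softly re-sort key/value blocks, then attend block-locally
    k_sorted = torch.einsum('bij,bjsd->bisd', perm, k)
    v_sorted = torch.einsum('bij,bjsd->bisd', perm, v)
    attn = torch.einsum('bisd,bitd->bist', q, k_sorted) / d ** 0.5
    out = torch.einsum('bist,bitd->bisd', attn.softmax(dim=-1), v_sorted)
    return out.reshape(b, n, d)

q = k = v = torch.randn(2, 256, 64)
print(sinkhorn_attention(q, k, v).shape)  # torch.Size([2, 256, 64])
```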
Implementation of Axial attention - attending to multi-dimensional data efficiently
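A minimal sketch of axial attention over a 2D feature map (an illustrative re-implementation, not this repo's API): full attention over H·W positions costs O((H·W)²), while attending along each axis in turn costs O(H·W·(H + W)).

```python
import torch

def attend(x):                         # x: (batch, len, dim), plain softmax self-attention
    scores = x @ x.transpose(-2, -1) / x.shape[-1] ** 0.5
    return scores.softmax(dim=-1) @ x

def axial_attention(x):                # x: (batch, h, w, dim)
    b, h, w, d = x.shape
    # attend along rows: fold height into the batch dimension
    x = attend(x.reshape(b * h, w, d)).reshape(b, h, w, d)
    # attend along columns: fold width into the batch dimension
    x = x.transpose(1, 2).reshape(b * w, h, d)
    x = attend(x).reshape(b, w, h, d).transpose(1, 2)
    return x

x = torch.randn(2, 16, 16, 64)
print(axial_attention(x).shape)  # torch.Size([2, 16, 16, 64])
```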
Standalone Product Key Memory module in Pytorch - for augmenting Transformer models
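A hedged, single-head sketch of a product-key memory (class and parameter names are assumptions, not the module's real interface): n² values are addressed by two sets of n half-keys, so a top-k lookup scores O(n) sub-keys instead of all n² full keys.

```python
import torch
import torch.nn as nn

class ProductKeyMemory(nn.Module):
    # illustrative sketch of the lookup, not the standalone module's API
    def __init__(self, dim, n_sub=128, topk=8):
        super().__init__()
        self.topk = topk
        self.n_sub = n_sub
        half = dim // 2
        self.sub_keys = nn.Parameter(torch.randn(2, n_sub, half) * 0.02)
        self.values = nn.Embedding(n_sub * n_sub, dim)

    def forward(self, q):                         # q: (batch, dim)
        q1, q2 = q.chunk(2, dim=-1)
        # top-k against each half-key set: O(n_sub), not O(n_sub**2)
        s1, i1 = (q1 @ self.sub_keys[0].t()).topk(self.topk, dim=-1)
        s2, i2 = (q2 @ self.sub_keys[1].t()).topk(self.topk, dim=-1)
        # combine into topk**2 candidate full keys, re-select the best topk
        scores = (s1.unsqueeze(-1) + s2.unsqueeze(-2)).flatten(1)  # (batch, topk**2)
        idx = (i1.unsqueeze(-1) * self.n_sub + i2.unsqueeze(-2)).flatten(1)
        best, pos = scores.topk(self.topk, dim=-1)
        w = best.softmax(dim=-1)
        vals = self.values(idx.gather(1, pos))                     # (batch, topk, dim)
        return (w.unsqueeze(-1) * vals).sum(dim=1)

pkm = ProductKeyMemory(dim=64)
print(pkm(torch.randn(4, 64)).shape)  # torch.Size([4, 64])
```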
Transformer based on a variant of attention with linear complexity with respect to sequence length
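A minimal sketch of kernelized linear attention in the style of Katharopoulos et al. (one common variant, not necessarily the exact one this repo implements): with a positive feature map phi, softmax attention is replaced by phi(Q) (phi(K)^T V), which is linear in sequence length.

```python
import torch
import torch.nn.functional as F

def linear_attention(q, k, v, eps=1e-6):
    # elu(x) + 1 is one common positive feature map
    phi = lambda t: F.elu(t) + 1
    q, k = phi(q), phi(k)
    # contract keys with values first; the n x n attention matrix never exists
    kv = torch.einsum('bnd,bne->bde', k, v)                 # O(n * d^2)
    z = 1 / (torch.einsum('bnd,bd->bn', q, k.sum(dim=1)) + eps)
    return torch.einsum('bnd,bde,bn->bne', q, kv, z)

q = k = v = torch.randn(2, 1024, 64)
print(linear_attention(q, k, v).shape)  # torch.Size([2, 1024, 64])
```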
Reformer, the efficient Transformer, in Pytorch
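A hedged sketch of the LSH bucketing at the heart of Reformer's attention (illustrative only; the function name is hypothetical): random rotations hash similar query/key vectors into the same bucket, so attention can be restricted to within-bucket pairs instead of all n² pairs.

```python
import torch

def lsh_buckets(x, n_buckets=16, seed=0):
    # x: (batch, seq_len, dim); shared queries/keys, as in Reformer
    g = torch.Generator().manual_seed(seed)
    # angular LSH: project onto n_buckets/2 random directions, use +/- sign
    proj = torch.randn(x.shape[-1], n_buckets // 2, generator=g)
    h = x @ proj
    return torch.cat([h, -h], dim=-1).argmax(dim=-1)   # (batch, seq_len)

x = torch.randn(2, 128, 64)
buckets = lsh_buckets(x)
print(buckets.shape, buckets.max().item() < 16)  # torch.Size([2, 128]) True
```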