Transformer with Untied Positional Encoding (TUPE). Code for the paper "Rethinking Positional Encoding in Language Pre-training". Improves existing models such as BERT.
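As a rough illustration of the "untied" idea from that paper: instead of adding positional embeddings to token embeddings before a shared attention projection, content-to-content and position-to-position attention logits are computed with separate projections and summed. A minimal NumPy sketch under that assumption (all matrix names here are illustrative, not taken from the repo):

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d = 4, 8

x = rng.normal(size=(seq_len, d))  # token (content) embeddings
p = rng.normal(size=(seq_len, d))  # absolute positional embeddings

# Separate query/key projections for content and position (untied)
W_q, W_k = rng.normal(size=(d, d)), rng.normal(size=(d, d))
U_q, U_k = rng.normal(size=(d, d)), rng.normal(size=(d, d))

# Each term is scaled by sqrt(2d) so the summed logits keep comparable variance
content = (x @ W_q) @ (x @ W_k).T / np.sqrt(2 * d)
position = (p @ U_q) @ (p @ U_k).T / np.sqrt(2 * d)
logits = content + position

# Row-wise softmax to get attention weights
attn = np.exp(logits - logits.max(axis=-1, keepdims=True))
attn /= attn.sum(axis=-1, keepdims=True)
print(attn.shape)  # (4, 4)
```

The position term depends only on positions, so it can be computed once per layer and shared across all heads' token content, which is part of the efficiency argument in the paper.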
Implementation of Parti, Google's pure attention-based text-to-image neural network, in PyTorch
Unsupervised Language Modeling at scale for robust sentiment classification
Unofficial implementation of iTransformer - SOTA Time Series Forecasting using Attention networks...
Code for NAACL 2024 main conference paper "An Empirical Study of Consistency Regularization for E...
Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in Py...
Code for the ICLR'22 paper "On Robust Prefix-Tuning for Text Classification"
Code for the paper "Fine-tune BERT for Extractive Summarization"
Train vision models using JAX and 🤗 transformers
GLM (General Language Model)
Implementation of Recurrent Memory Transformer (NeurIPS 2022 paper) in PyTorch
Chain-of-Hindsight, a scalable RLHF method
A concise but complete implementation of CLIP with various experimental improvements from recent ...
Implementation and replication of ProGen, Language Modeling for Protein Generation, in JAX
Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Re...