[CADL'22, ECCVW] Official repository of the paper "EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications".
Train vision models using JAX and 🤗 transformers
[ECCV 2024] Official repository of "GenView: Enhancing View Quality with Pretrained Generative Mo...
MambaOut: Do We Really Need Mamba for Vision?
This is a collection of our NAS and Vision Transformer work.
Generic image compressor for machine learning. PyTorch code for our paper "Lossy compression for ...
Dynamic Token Expansion with Continual Transformers, accepted at CVPR 2022
Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities...
A PyTorch implementation of "Real-time Scene Text Detection with Differentiable Binarization".
EfficientViT is a new family of vision models for efficient high-resolution vision.
[NeurIPS 2022] Implementation of "AdaptFormer: Adapting Vision Transformers for Scalable Visual R...
Code release for ConvNeXt V2 model
[CVPR 2021] Code for "Augmentation Strategies for Learning with Noisy Labels".
[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregress...
PyTorch implementation of "Large-Scale Long-Tailed Recognition in an Open World" (CVPR 2019 Oral)