PyTorch reimplementation of FlexiViT: One Model for All Patch Sizes
MIT License
Train high-quality text-to-image diffusion models in a data- and compute-efficient manner
FLOPs counter for convolutional networks in the PyTorch framework
Toolbox for vision tasks. Pre-trained vision backbones on ImageNet with PyTorch Lightning 🚀
solo-learn: a library of self-supervised methods for visual representation learning powered by Py...
TensorFlow 2.X reimplementation of CvT: Introducing Convolutions to Vision Transformers, Haiping ...
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
🍀 PyTorch implementation of various Attention Mechanisms, MLP, Re-parameter, Convolution, which i...
PyTorch and TensorFlow/Keras image models with automatic weight conversions and equal API/impleme...
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, e...
Explainability for Vision Transformers
The official repo for [NeurIPS'22] "ViTPose: Simple Vision Transformer Baselines for Human Pose E...
A flexible package for multimodal-deep-learning to combine tabular data with text and images usin...
PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (...