4M: Massively Multimodal Masked Modeling
Apache-2.0 License
Train high-quality text-to-image diffusion models in a data- and compute-efficient manner
This is the official implementation of "AutoFocusFormer: Image Segmentation off the Grid".