Code for robust monocular depth estimation described in "Ranftl et al., Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer, TPAMI 2022"
MIT License
Dense Prediction Transformers
VRT: A Video Restoration Transformer (official repository)
High-Resolution Image Synthesis with Latent Diffusion Models
A flexible package for multimodal deep learning to combine tabular data with text and images usin...
Diffusers training with mmengine
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
Official repo for VGen: a holistic video generation ecosystem for video generation building on di...
Zero-1-to-3: Zero-shot One Image to 3D Object (ICCV 2023)
Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
[ICLR 2023 Spotlight] Vision Transformer Adapter for Dense Predictions
High Resolution Depth Maps for Stable Diffusion WebUI
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o
Official code base of the BEVDet series.
TRI-ML Monocular Depth Estimation Repository