Bamboo: 4 times larger than ImageNet; 2 times larger than Objects365; built by active learning.
Benchmarking Generalized Out-of-Distribution Detection
[ICCV2023 oral] Zolly: Zoom Focal Length Correctly for Perspective-Distorted Human Mesh Reconstru...
PyTorch implementation for "Large-Scale Long-Tailed Recognition in an Open World" (CVPR 2019 ORAL)
[CVPR 2024] Official Code for "AiOS: All-in-One-Stage Expressive Human Pose and Shape Estimation"
Code Release of F-LMM: Grounding Frozen Large Multimodal Models
[CVPR 2024 Highlight] Official PyTorch implementation of CoDeF: Content Deformation Fields for Te...
[CAAI AIR'24] Bilateral Reference for High-Resolution Dichotomous Image Segmentation
VILA - a multi-image visual language model with training, inference and evaluation recipe, deploy...
[CVPR 2024 Highlight] Monkey (LMM): Image Resolution and Text Label Are Important Things for Large...
[ICCV2019] Robust Multi-Modality Multi-Object Tracking
GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest
Semantic Segmentation for Aerial / Satellite Images with Convolutional Neural Networks including ...
Geom3D: Geometric Modeling on 3D Structures, NeurIPS 2023