APACHE-2.0 License
This repository contains the official implementation for reproducing the object detection results of ViP. It is built on mmdetection.
Backbone | Pretrain | Lr Schd | box mAP | mask mAP | #params | FLOPs | config | log | model |
---|---|---|---|---|---|---|---|---|---|
ViP-Ti | ImageNet-1K | 1x | 45.3 | 39.8 | 69.2M | 678G | config | Google Drive | Google Drive |
ViP-S | ImageNet-1K | 1x | 48.0 | 42.0 | 87.1M | 725G | config | Google Drive | Google Drive |
ViP-M | ImageNet-1K | 1x | 49.9 | 43.5 | 107.0M | 785G | - | - | Coming Soon |
Backbone | Pretrain | Lr Schd | box mAP | #params | FLOPs | config | log | model |
---|---|---|---|---|---|---|---|---|
ViP-Ti | ImageNet-1K | 1x | 39.9 | 21.4M | 181G | config | Google Drive | Google Drive |
ViP-S | ImageNet-1K | 1x | 42.7 | 39.9M | 227G | config | Google Drive | Google Drive |
ViP-S | ImageNet-1K | 3x | 43.9 | 39.9M | 227G | config | Google Drive | Google Drive |
ViP-M | ImageNet-1K | 1x | 44.3 | 59.8M | 287G | - | - | Coming Soon |
Notes:
Please refer to get_started.md for installation and dataset preparation.
# single-gpu testing
python tools/test.py <CONFIG_FILE> <DET_CHECKPOINT_FILE> --eval bbox segm
# multi-gpu testing
tools/dist_test.sh <CONFIG_FILE> <DET_CHECKPOINT_FILE> <GPU_NUM> --eval bbox segm
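The test commands above request both the `bbox` and `segm` metrics. The second results table reports only box mAP, so for those detectors it may be appropriate to request `bbox` alone; this is an inference from the table, not something the repository states. Below is a minimal sketch of a wrapper that prints the single-GPU test command with a chosen metric list. `vip_test_cmd` is a hypothetical helper name, not part of this repository, and it only echoes the command rather than running it.

```shell
# Hypothetical helper (not part of the repository): prints, but does not run,
# the single-GPU test command with the requested evaluation metrics.
# Usage: vip_test_cmd <CONFIG_FILE> <DET_CHECKPOINT_FILE> [metric ...]
vip_test_cmd() {
    if [ "$#" -lt 2 ]; then
        echo "usage: vip_test_cmd <CONFIG_FILE> <DET_CHECKPOINT_FILE> [metric ...]" >&2
        return 1
    fi
    config=$1
    checkpoint=$2
    shift 2
    # With no metrics given, fall back to the pair used in the commands above.
    if [ "$#" -eq 0 ]; then
        set -- bbox segm
    fi
    echo "python tools/test.py $config $checkpoint --eval $*"
}
```

Printing the command first makes it easy to check the metric list before starting a long evaluation run.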
To train a detector with pre-trained models, run:
# single-gpu training
python tools/train.py <CONFIG_FILE>
# multi-gpu training
tools/dist_train.sh <CONFIG_FILE> <GPU_NUM>
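Which of the two training launchers above to use depends only on the GPU count. Below is a minimal sketch of that dispatch. `vip_train_cmd` is a hypothetical helper name, not part of this repository, and it prints the command instead of executing it.

```shell
# Hypothetical helper (not part of the repository): prints the training
# command that matches the requested GPU count.
# Usage: vip_train_cmd <CONFIG_FILE> [GPU_NUM]
vip_train_cmd() {
    if [ "$#" -lt 1 ]; then
        echo "usage: vip_train_cmd <CONFIG_FILE> [GPU_NUM]" >&2
        return 1
    fi
    config=$1
    gpus=${2:-1}   # default to a single GPU
    # Reject anything that is not a plain positive integer.
    case $gpus in
        ''|*[!0-9]*)
            echo "GPU_NUM must be a positive integer" >&2
            return 1
            ;;
    esac
    if [ "$gpus" -lt 1 ]; then
        echo "GPU_NUM must be a positive integer" >&2
        return 1
    fi
    if [ "$gpus" -eq 1 ]; then
        echo "python tools/train.py $config"
    else
        echo "tools/dist_train.sh $config $gpus"
    fi
}
```

For example, `vip_train_cmd <CONFIG_FILE> 8` prints the multi-GPU form, while omitting the count prints the single-GPU form.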
@article{sun2021visual,
title={Visual Parser: Representing Part-whole Hierarchies with Transformers},
author={Sun, Shuyang and Yue, Xiaoyu and Bai, Song and Torr, Philip},
journal={arXiv preprint arXiv:2107.05790},
year={2021}
}