By Yue Cao, Jiarui Xu, Stephen Lin, Fangyun Wei, Han Hu.
This repo is a official implementation of "GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond" on COCO object detection based on open-mmlab's mmdetection. The core operator GC block could be find here. Many thanks to mmdetection for their simple and clean framework.
Update on 2020/12/07
The extension of GCNet got accepted by TPAMI (PDF).
Update on 2019/10/28
GCNet won the Best Paper Award at ICCV 2019 Neural Architects Workshop!
Update on 2019/07/01
The code is refactored.
More results are provided and all configs could be found in configs/gcnet
.
Notes: Both PyTorch official SyncBN and Apex SyncBN have some stability issues. During training, mAP may drops to zero and back to normal during last few epochs.
Update on 2019/06/03
GCNet is supported by the official mmdetection repo here. Thanks again for open-mmlab's work on open source projects.
GCNet is initially described in arxiv. Via absorbing advantages of Non-Local Networks (NLNet) and Squeeze-Excitation Networks (SENet), GCNet provides a simple, fast and effective approach for global context modeling, which generally outperforms both NLNet and SENet on major benchmarks for various recognition tasks.
@article{cao2019GCNet,
title={GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond},
author={Cao, Yue and Xu, Jiarui and Lin, Stephen and Wei, Fangyun and Hu, Han},
journal={arXiv preprint arXiv:1904.11492},
year={2019}
}
Back-bone | Model | Back-bone Norm | Heads | Context | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | mask AP | Download |
---|---|---|---|---|---|---|---|---|---|---|---|
R50-FPN | Mask | fixBN | 2fc (w/o BN) | - | 1x | 3.9 | 0.453 | 10.6 | 37.3 | 34.2 | model |
R50-FPN | Mask | fixBN | 2fc (w/o BN) | GC(c3-c5, r16) | 1x | 4.5 | 0.533 | 10.1 | 38.5 | 35.1 | model |
R50-FPN | Mask | fixBN | 2fc (w/o BN) | GC(c3-c5, r4) | 1x | 4.6 | 0.533 | 9.9 | 38.9 | 35.5 | model |
R50-FPN | Mask | fixBN | 2fc (w/o BN) | - | 2x | - | - | - | 38.2 | 34.9 | model |
R50-FPN | Mask | fixBN | 2fc (w/o BN) | GC(c3-c5, r16) | 2x | - | - | - | 39.7 | 36.1 | model |
R50-FPN | Mask | fixBN | 2fc (w/o BN) | GC(c3-c5, r4) | 2x | - | - | - | 40.0 | 36.2 | model |
Back-bone | Model | Back-bone Norm | Heads | Context | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | mask AP | Download |
---|---|---|---|---|---|---|---|---|---|---|---|
R50-FPN | Mask | SyncBN | 2fc (w/o BN) | - | 1x | 3.9 | 0.543 | 10.2 | 37.2 | 33.8 | model |
R50-FPN | Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r16) | 1x | 4.5 | 0.547 | 9.9 | 39.4 | 35.7 | model |
R50-FPN | Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r4) | 1x | 4.6 | 0.603 | 9.4 | 39.9 | 36.2 | model |
R50-FPN | Mask | SyncBN | 2fc (w/o BN) | - | 2x | 3.9 | 0.543 | 10.2 | 37.7 | 34.3 | model |
R50-FPN | Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r16) | 2x | 4.5 | 0.547 | 9.9 | 39.7 | 36.0 | model |
R50-FPN | Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r4) | 2x | 4.6 | 0.603 | 9.4 | 40.2 | 36.3 | model |
R50-FPN | Mask | SyncBN | 4conv1fc (SyncBN) | - | 1x | - | - | - | 38.8 | 34.6 | model |
R50-FPN | Mask | SyncBN | 4conv1fc (SyncBN) | GC(c3-c5, r16) | 1x | - | - | - | 41.0 | 36.5 | model |
R50-FPN | Mask | SyncBN | 4conv1fc (SyncBN) | GC(c3-c5, r4) | 1x | - | - | - | 41.4 | 37.0 | model |
Back-bone | Model | Back-bone Norm | Heads | Context | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | mask AP | Download |
---|---|---|---|---|---|---|---|---|---|---|---|
R101-FPN | Mask | fixBN | 2fc (w/o BN) | - | 1x | 5.8 | 0.571 | 9.5 | 39.4 | 35.9 | model |
R101-FPN | Mask | fixBN | 2fc (w/o BN) | GC(c3-c5, r16) | 1x | 7.0 | 0.731 | 8.6 | 40.8 | 37.0 | model |
R101-FPN | Mask | fixBN | 2fc (w/o BN) | GC(c3-c5, r4) | 1x | 7.1 | 0.747 | 8.6 | 40.8 | 36.9 | model |
R101-FPN | Mask | SyncBN | 2fc (w/o BN) | - | 1x | 5.8 | 0.665 | 9.2 | 39.8 | 36.0 | model |
R101-FPN | Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r16) | 1x | 7.0 | 0.778 | 9.0 | 41.1 | 37.4 | model |
R101-FPN | Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r4) | 1x | 7.1 | 0.786 | 8.9 | 41.7 | 37.6 | model |
X101-FPN | Mask | SyncBN | 2fc (w/o BN) | - | 1x | 7.1 | 0.912 | 8.5 | 41.2 | 37.3 | model |
X101-FPN | Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r16) | 1x | 8.2 | 1.055 | 7.7 | 42.4 | 38.0 | model |
X101-FPN | Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r4) | 1x | 8.3 | 1.037 | 7.6 | 42.9 | 38.5 | model |
X101-FPN | Cascade Mask | SyncBN | 2fc (w/o BN) | - | 1x | - | - | - | 44.7 | 38.3 | model |
X101-FPN | Cascade Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r16) | 1x | - | - | - | 45.9 | 39.3 | model |
X101-FPN | Cascade Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r4) | 1x | - | - | - | 46.5 | 39.7 | model |
X101-FPN | DCN Cascade Mask | SyncBN | 2fc (w/o BN) | - | 1x | - | - | - | 47.1 | 40.4 | model |
X101-FPN | DCN Cascade Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r16) | 1x | - | - | - | 47.9 | 40.9 | model |
X101-FPN | DCN Cascade Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r4) | 1x | - | - | - | 47.9 | 40.8 | model |
Notes
GC
denotes Global Context (GC) block is inserted after 1x1 conv of backbone.DCN
denotes replace 3x3 conv with 3x3 Deformable Convolution in c3-c5
stages of backbone.r4
and r16
denote ratio 4 and ratio 16 in GC block respectively.a. Install PyTorch 1.1 and torchvision following the official instructions.
b. Install latest apex with CUDA and C++ extensions following this instructions. The Sync BN implemented by apex is required.
c. Clone the GCNet repository.
git clone https://github.com/xvjiarui/GCNet.git
d. Compile cuda extensions.
cd GCNet
pip install cython # or "conda install cython" if you prefer conda
./compile.sh # or "PYTHON=python3 ./compile.sh" if you use system python3 without virtual environments
e. Install GCNet version mmdetection (other dependencies will be installed automatically).
python(3) setup.py install # add --user if you want to install it locally
# or "pip install ."
Note: You need to run the last step each time you pull updates from github.
Or you can run python(3) setup.py develop
or pip install -e .
to install mmdetection if you want to make modifications to it frequently.
Please refer to mmdetection install instruction for more details.
As in original mmdetection, distributed training is recommended for either single machine or multiple machines.
./tools/dist_train.sh <CONFIG_FILE> <GPU_NUM> [optional arguments]
Supported arguments are:
To evaluate trained models, output file is required.
python tools/test.py <CONFIG_FILE> <MODEL_PATH> [optional arguments]
Supported arguments are:
.pkl
bbox segm
would evaluate both bounding box and mask AP.