Next-generation video instance recognition framework built on top of Detectron2, supporting InstMove (CVPR 2023), SeqFormer (ECCV 2022 Oral), and IDOL (ECCV 2022 Oral).
Apache-2.0 License
To date, VNext contains the official implementation of the following algorithms:
InstMove: Instance Motion for Object-centric Video Segmentation (CVPR 2023)
IDOL: In Defense of Online Models for Video Instance Segmentation (ECCV 2022 Oral)
SeqFormer: Sequential Transformer for Video Instance Segmentation (ECCV 2022 Oral)
In Defense of Online Models for Video Instance Segmentation
Junfeng Wu, Qihao Liu, Yi Jiang, Song Bai, Alan Yuille, Xiang Bai
In recent years, video instance segmentation (VIS) has been largely advanced by offline models, while online models are usually inferior to the contemporaneous offline models by over 10 AP, which is a huge drawback.
By dissecting current online models and offline models, we demonstrate that the main cause of the performance gap is the error-prone association and propose IDOL, which outperforms all online and offline methods on three benchmarks.
IDOL won first place in the video instance segmentation track of the 4th Large-scale Video Object Segmentation Challenge (CVPR 2022).
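To make the notion of frame-to-frame association concrete, here is a minimal sketch of how an online model might link detections in the current frame to existing tracks by embedding similarity. It is an illustration only and is not IDOL's actual association strategy: the function names (`cosine`, `associate`), the greedy matching, and the `threshold` value are all assumptions made for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two embedding vectors (plain lists)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def associate(tracks, detections, threshold=0.5):
    """Greedily match detections to tracks by embedding similarity.

    tracks:      dict mapping integer track id -> embedding (hypothetical layout)
    detections:  list of embeddings for the current frame
    Returns one track id per detection; unmatched detections get new ids.
    """
    # Score every (track, detection) pair and visit the best pairs first.
    pairs = sorted(
        ((cosine(emb, det), tid, j)
         for tid, emb in tracks.items()
         for j, det in enumerate(detections)),
        reverse=True,
    )
    assigned = {}      # detection index -> track id
    used = set()       # track ids already taken in this frame
    for score, tid, j in pairs:
        if score < threshold:
            break      # remaining pairs are too dissimilar to link
        if tid in used or j in assigned:
            continue
        assigned[j] = tid
        used.add(tid)

    # Detections left unmatched start new tracks.
    next_id = max(tracks, default=-1) + 1
    result = []
    for j in range(len(detections)):
        if j not in assigned:
            assigned[j] = next_id
            next_id += 1
        result.append(assigned[j])
    return result
```

A wrong link at this step propagates to every later frame, which is why the quality of the embeddings being compared matters so much for online models.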
SeqFormer: Sequential Transformer for Video Instance Segmentation
Junfeng Wu, Yi Jiang, Song Bai, Wenqing Zhang, Xiang Bai
SeqFormer locates an instance in each frame and aggregates temporal information to learn a powerful representation of a video-level instance, which is used to predict the mask sequences on each frame dynamically.
SeqFormer is a robust, accurate, and neat offline model, and instance tracking is achieved naturally without tracking branches or post-processing.
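The idea of aggregating per-frame information into a single video-level instance representation, and then using that representation to predict a mask on every frame, can be sketched in a few lines. This is a simplified toy version under stated assumptions, not the SeqFormer implementation: `aggregate`, `predict_masks`, the softmax weighting, and the dot-product mask rule are hypothetical stand-ins for the learned components.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(frame_embeddings, frame_scores):
    """Weighted sum of per-frame instance embeddings.

    frame_embeddings: one embedding (list of floats) per frame
    frame_scores:     one scalar per frame; assumed here to come from a
                      learned layer, passed in directly for illustration
    Returns a single video-level embedding.
    """
    weights = softmax(frame_scores)
    dim = len(frame_embeddings[0])
    return [
        sum(w * emb[d] for w, emb in zip(weights, frame_embeddings))
        for d in range(dim)
    ]


def predict_masks(video_embedding, frame_features):
    """Apply the shared video-level embedding to every frame.

    frame_features: for each frame, a list of per-pixel feature vectors.
    A pixel is marked foreground when its dot product with the
    video-level embedding is positive (a toy decision rule).
    """
    return [
        [sum(k * f for k, f in zip(video_embedding, pixel)) > 0
         for pixel in frame]
        for frame in frame_features
    ]
```

Because every frame's mask is produced from the same video-level embedding, the masks belong to the same instance by construction, which is one way to see why no separate tracking step is needed in this kind of design.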
@inproceedings{seqformer,
title={SeqFormer: Sequential Transformer for Video Instance Segmentation},
author={Wu, Junfeng and Jiang, Yi and Bai, Song and Zhang, Wenqing and Bai, Xiang},
booktitle={ECCV},
year={2022},
}
@inproceedings{IDOL,
title={In Defense of Online Models for Video Instance Segmentation},
author={Wu, Junfeng and Liu, Qihao and Jiang, Yi and Bai, Song and Yuille, Alan and Bai, Xiang},
booktitle={ECCV},
year={2022},
}
This repo is based on detectron2, Deformable DETR, VisTR, and IFC. Thanks for their wonderful work.