Abstract
One fundamental challenge in building an instance segmentation model for a large number of classes in complex scenes is the lack of training examples, especially for rare objects. In this paper, we explore the possibility of increasing the number of training examples without laborious data collection and annotation. We find that an abundance of instance segments can potentially be obtained freely from object-centric images, according to two insights: (i) an object-centric image usually contains one salient object in a simple background; (ii) objects from the same class often share similar appearances or similar contrasts to the background. Motivated by these insights, we propose a simple and scalable framework, FreeSeg, for extracting and leveraging these "free" object foreground segments to facilitate model training in long-tailed instance segmentation. Concretely, we employ off-the-shelf object foreground extraction techniques (e.g., image co-segmentation) to generate instance mask candidates, followed by segment refinement and ranking. The resulting high-quality object segments can be used to augment the existing long-tailed dataset, e.g., by copying and pasting the segments onto the original training images. On the LVIS benchmark, we show that FreeSeg yields substantial improvements on top of strong baselines and achieves state-of-the-art accuracy for segmenting rare object categories.
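To make the copy-and-paste step concrete, the following is a minimal NumPy sketch of pasting one extracted segment onto a training image. It is an illustration under stated assumptions, not the FreeSeg implementation: the function names `paste_segment` and `occlude_existing`, the fixed paste location, and the absence of rescaling or blending are all hypothetical simplifications.

```python
import numpy as np


def paste_segment(image, segment, mask, top, left):
    """Paste the masked pixels of `segment` onto a copy of `image`.

    image:   (H, W, 3) array, the training image.
    segment: (h, w, 3) array, a crop from an object-centric image.
    mask:    (h, w) boolean array, True on the object foreground.
    Returns the augmented image and the (H, W) mask of the pasted instance.
    Hypothetical helper: no rescaling, blending, or placement sampling.
    """
    out = image.copy()
    img_h, img_w = image.shape[:2]
    seg_h, seg_w = mask.shape
    pasted = np.zeros((img_h, img_w), dtype=bool)

    # Clip the paste window to the image bounds.
    y0, x0 = max(top, 0), max(left, 0)
    y1, x1 = min(top + seg_h, img_h), min(left + seg_w, img_w)
    if y0 >= y1 or x0 >= x1:
        return out, pasted  # segment falls entirely outside the image

    sub_mask = mask[y0 - top:y1 - top, x0 - left:x1 - left]
    sub_seg = segment[y0 - top:y1 - top, x0 - left:x1 - left]

    # Overwrite only the foreground pixels inside the window.
    region = out[y0:y1, x0:x1]
    region[sub_mask] = sub_seg[sub_mask]
    pasted[y0:y1, x0:x1] = sub_mask
    return out, pasted


def occlude_existing(instance_masks, pasted):
    """Remove the pasted area from existing instance masks (simple occlusion)."""
    return [m & ~pasted for m in instance_masks]


# Toy example: a 4x4 black image and a 2x2 white segment with an L-shaped mask.
image = np.zeros((4, 4, 3), dtype=np.uint8)
segment = np.full((2, 2, 3), 255, dtype=np.uint8)
mask = np.array([[True, False], [True, True]])
existing = [np.ones((4, 4), dtype=bool)]

augmented, pasted = paste_segment(image, segment, mask, top=1, left=1)
updated = occlude_existing(existing, pasted)
```

The pasted mask becomes the annotation of the new instance, and existing annotations are trimmed where the new object covers them, so that the augmented image keeps consistent instance labels.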
URL
https://arxiv.org/abs/2202.11124