Abstract
Rapid advances in continual segmentation have yet to bridge the gap to large, continually expanding vocabularies under compute-constrained scenarios. We find that traditional continual training leads to catastrophic forgetting under compute constraints and fails to outperform zero-shot segmentation methods. We introduce a novel strategy for semantic and panoptic segmentation with zero forgetting, capable of adapting to continually growing vocabularies without retraining or large memory costs. Our training-free approach, kNN-CLIP, leverages a database of instance embeddings to let open-vocabulary segmentation methods continually expand their vocabulary on any given domain in a single pass through the data, storing only embeddings and thereby minimizing both compute and memory costs. This method achieves state-of-the-art mIoU across large-vocabulary semantic and panoptic segmentation datasets. We hope kNN-CLIP represents a step toward more efficient and adaptable continual segmentation, paving the way for advances in real-world large-vocabulary continual segmentation methods.
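The core mechanism the abstract describes, a growing database of instance embeddings queried by nearest-neighbor search instead of retraining, can be illustrated with a minimal sketch. This is not the paper's implementation: the class name, the majority-vote rule, and the toy vectors below are all hypothetical stand-ins for CLIP-style image embeddings.

```python
import numpy as np

# Hypothetical sketch of the retrieval idea behind kNN-CLIP:
# a database of (embedding, label) pairs grows by appending entries,
# so the vocabulary expands with no retraining and no forgetting.

class KNNEmbeddingDB:
    def __init__(self, dim):
        self.embeddings = np.empty((0, dim), dtype=np.float32)
        self.labels = []

    def add(self, embedding, label):
        # Single pass over the data: store only L2-normalized embeddings.
        e = np.asarray(embedding, dtype=np.float32)
        e = e / np.linalg.norm(e)
        self.embeddings = np.vstack([self.embeddings, e[None, :]])
        self.labels.append(label)

    def classify(self, query, k=3):
        # Cosine similarity reduces to a dot product on normalized vectors.
        q = np.asarray(query, dtype=np.float32)
        q = q / np.linalg.norm(q)
        sims = self.embeddings @ q
        top = np.argsort(-sims)[:k]
        # Similarity-weighted vote among the k nearest stored instances.
        votes = {}
        for i in top:
            votes[self.labels[i]] = votes.get(self.labels[i], 0.0) + sims[i]
        return max(votes, key=votes.get)

db = KNNEmbeddingDB(dim=4)
db.add([1.0, 0.0, 0.0, 0.0], "cat")   # toy stand-ins for instance embeddings
db.add([0.0, 1.0, 0.0, 0.0], "dog")
print(db.classify([0.9, 0.1, 0.0, 0.0], k=1))  # → cat
```

Because new classes enter the system only as appended rows, older entries are never overwritten, which is the sense in which such a retrieval scheme avoids catastrophic forgetting.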
URL
https://arxiv.org/abs/2404.09447