Abstract
Instance perception tasks (object detection, instance segmentation, pose estimation, counting) play a key role in industrial applications of visual models. Because supervised learning methods suffer from high labeling costs, few-shot learning methods that learn effectively from a limited number of labeled examples are desirable. Existing few-shot learning methods focus primarily on a restricted set of tasks, presumably because it is challenging to design a generic model that can represent diverse tasks in a unified manner. In this paper, we propose UniFS, a universal few-shot instance perception model that unifies a wide range of instance perception tasks by reformulating them into a dynamic point representation learning framework. Additionally, we propose Structure-Aware Point Learning (SAPL) to exploit the higher-order structural relationships among points and further enhance representation learning. Our approach makes minimal assumptions about the tasks, yet it achieves competitive results compared to highly specialized, well-optimized specialist models. Code will be released soon.
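As a rough illustration of what casting these tasks as point sets could look like, the sketch below encodes one instance from each task as an (N, 2) array of (x, y) coordinates. The specific encodings (two box corners, a subsampled contour, a single center point) and all function names are assumptions chosen for illustration; the abstract does not specify how UniFS defines its points.

```python
import numpy as np

# Hypothetical helpers: each maps one task's annotation to an (N, 2) point array.
# These encodings are illustrative assumptions, not the paper's formulation.

def box_to_points(x_min, y_min, x_max, y_max):
    """Detection: describe a box by its top-left and bottom-right corners."""
    return np.array([[x_min, y_min], [x_max, y_max]], dtype=np.float32)

def contour_to_points(contour, num_points):
    """Segmentation: subsample a polygon contour to a fixed number of vertices."""
    contour = np.asarray(contour, dtype=np.float32).reshape(-1, 2)
    # Pick evenly spaced vertex indices along the contour.
    idx = np.linspace(0, len(contour) - 1, num_points).round().astype(int)
    return contour[idx]

def keypoints_to_points(keypoints):
    """Pose estimation: K body joints are already a set of K points."""
    return np.asarray(keypoints, dtype=np.float32).reshape(-1, 2)

def center_to_points(x, y):
    """Counting: mark each instance with a single center point."""
    return np.array([[x, y]], dtype=np.float32)

# Every task now yields the same kind of target: an (N, 2) array.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
targets = {
    "detection": box_to_points(0, 0, 4, 2),
    "segmentation": contour_to_points(square, num_points=4),
    "pose": keypoints_to_points([(1, 1), (2, 3), (3, 1)]),
    "counting": center_to_points(2, 1),
}
for task, points in targets.items():
    print(task, points.shape)
```

Because every target shares the same shape convention, a single point-regression head could in principle serve all four tasks, with only the number of points varying per task.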
URL
https://arxiv.org/abs/2404.19401