Abstract
As few-shot object detectors are often trained with abundant base samples and fine-tuned on few-shot novel examples, the learned models are usually biased toward base classes and sensitive to the variance of novel examples. To address this issue, we propose a meta-learning framework with two novel feature aggregation schemes. More precisely, we first present a Class-Agnostic Aggregation (CAA) method, where query and support features can be aggregated regardless of their categories. The interactions between different classes encourage class-agnostic representations and reduce confusion between base and novel classes. Building on CAA, we then propose a Variational Feature Aggregation (VFA) method, which encodes support examples into class-level support features for robust feature aggregation. We use a variational autoencoder to estimate class distributions and sample variational features from those distributions, which are more robust to the variance of support examples. In addition, we decouple the classification and regression tasks so that VFA is performed on the classification branch without affecting object localization. Extensive experiments on PASCAL VOC and COCO demonstrate that our method significantly outperforms a strong baseline (by up to 16\%) and previous state-of-the-art methods (by 4\% on average). Code will be available at: \url{this https URL}
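The core idea of VFA can be sketched in a few lines: estimate a per-channel Gaussian over a class's support features, draw a class-level feature via the reparameterization trick, and fuse it with the query feature. The sketch below is illustrative only, and makes assumptions the abstract does not state: the paper learns the distribution parameters with a VAE encoder (approximated here by empirical moments), and the element-wise-product aggregation is a common choice in meta-detection, not necessarily the paper's exact operator.

```python
import math
import random

def sample_class_feature(support_feats, rng):
    """Fit a per-channel Gaussian over one class's support features and
    draw a 'variational' class-level feature via reparameterization.
    Sketch only: VFA learns (mu, log_var) with a VAE encoder; here we
    substitute the empirical mean and variance."""
    n = len(support_feats)
    dim = len(support_feats[0])
    mu = [sum(f[d] for f in support_feats) / n for d in range(dim)]
    var = [sum((f[d] - mu[d]) ** 2 for f in support_feats) / max(n - 1, 1)
           for d in range(dim)]
    # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1).
    return [mu[d] + math.sqrt(var[d]) * rng.gauss(0.0, 1.0)
            for d in range(dim)]

def aggregate(query_feat, class_feat):
    """Hypothetical aggregation: element-wise product of the query
    feature with the sampled class-level support feature."""
    return [q * s for q, s in zip(query_feat, class_feat)]
```

Because the class-level feature is sampled from a distribution rather than taken from a single support example, the aggregated feature is less sensitive to which few-shot examples happen to be drawn.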
URL
https://arxiv.org/abs/2301.13411