Combination of Multiple Global Descriptors for Image Retrieval

Abstract

Recent studies on image retrieval have shown that ensembling different models and combining multiple global descriptors improve performance. However, training several models for an ensemble is not only difficult but also inefficient in terms of time and memory. In this paper, we propose a novel framework that exploits multiple global descriptors to obtain an ensemble-like effect while remaining trainable end to end. The framework is flexible and extensible with respect to the global descriptor, CNN backbone, loss, and dataset. Moreover, we investigate the effectiveness of combining multiple global descriptors through quantitative and qualitative analysis. Our extensive experiments show that the combined descriptor outperforms a single global descriptor because it can exploit different types of feature properties. In the benchmark evaluation, the proposed framework achieves state-of-the-art image retrieval performance on CARS196, CUB200-2011, In-shop Clothes, and Stanford Online Products, outperforming competing approaches by a large margin.
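
The core idea of combining several global descriptors computed from one shared feature map can be illustrated with a minimal sketch. The abstract does not name the descriptors or the combination rule, so everything specific here is an assumption made for illustration: the three pooling functions (average, max, and generalized-mean pooling with exponent p = 3), the per-branch L2 normalization, the concatenation, and all function names. The learned layers and the training loss of the actual framework are omitted.

```python
import math

def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / max(norm, eps) for v in vec]

# A feature map is modeled as a list of channels, each a flat list of
# spatial activations (hypothetical layout chosen for this sketch).

def avg_pool(feature_map):
    """Average pooling: one mean activation per channel."""
    return [sum(ch) / len(ch) for ch in feature_map]

def max_pool(feature_map):
    """Max pooling: one peak activation per channel."""
    return [max(ch) for ch in feature_map]

def gem_pool(feature_map, p=3.0):
    """Generalized-mean pooling; assumes non-negative activations."""
    return [(sum(v ** p for v in ch) / len(ch)) ** (1.0 / p)
            for ch in feature_map]

def combine_descriptors(feature_map, poolers):
    """Normalize each branch, concatenate, then normalize the result."""
    combined = []
    for pool in poolers:
        combined.extend(l2_normalize(pool(feature_map)))
    return l2_normalize(combined)

# Toy feature map: 3 channels, 4 spatial positions each.
feature_map = [
    [0.0, 1.0, 2.0, 3.0],
    [4.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
]
descriptor = combine_descriptors(feature_map, [avg_pool, max_pool, gem_pool])
print(len(descriptor))
```

With three pooling branches over three channels, the combined descriptor has nine dimensions and unit length, so it can be compared with other descriptors by a dot product. Because all branches read the same feature map, a single backbone pass serves every descriptor, which is the sense in which such a design can approximate an ensemble without training separate models.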

URL

https://arxiv.org/abs/1903.10663

PDF

https://arxiv.org/pdf/1903.10663.pdf