Abstract
Remote sensing image retrieval remains challenging because of the particular nature of remote sensing imagery. Such images contain many different semantic objects, which complicates the retrieval task. In this paper, we present an image retrieval pipeline that extracts attentive, local convolutional features and aggregates them with the Vector of Locally Aggregated Descriptors (VLAD) to produce a global descriptor. We study several system parameters, including multiplicative and additive attention mechanisms and descriptor dimensionality. We also propose a query expansion method that requires no external inputs. Experiments demonstrate that, even without training, the local convolutional features and global representation outperform other systems. After system tuning, we achieve state-of-the-art or competitive results. Furthermore, our query expansion method increases overall system performance by about 3% using only the top-three retrieved images. Finally, we show that dimensionality reduction produces compact descriptors with increased retrieval performance and fast retrieval times, e.g. 50% faster than current systems.
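The two central steps named in the abstract, VLAD aggregation of local features and query expansion from the top-ranked results, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `vlad` and `expand_query`, the use of attention scores as multiplicative weights on the residuals, and the sum-then-renormalise expansion rule are all assumptions made for illustration. The abstract only states that local attentive features are aggregated with VLAD and that expansion uses the top-three retrieved images.

```python
import numpy as np


def vlad(features, weights, centers):
    """Aggregate local features (n, d) into one L2-normalised vector (k * d,).

    `weights` holds one hypothetical attention score per local feature;
    `centers` is a (k, d) codebook of visual words, e.g. from k-means.
    """
    # Hard-assign every local feature to its nearest visual word.
    dists = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    nearest = dists.argmin(axis=1)

    desc = np.zeros_like(centers, dtype=float)
    for i, c in enumerate(nearest):
        # Accumulate the (assumed) attention-weighted residual per word.
        desc[c] += weights[i] * (features[i] - centers[c])

    desc = desc.ravel()
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc


def expand_query(query, database, top_k=3):
    """Combine the query with its top_k most similar database descriptors.

    Assumes all descriptors are L2-normalised, so the dot product
    acts as cosine similarity. No external input is needed.
    """
    sims = database @ query
    top = np.argsort(-sims)[:top_k]
    expanded = query + database[top].sum(axis=0)
    return expanded / np.linalg.norm(expanded)
```

In use, one would rank the database once with the original descriptor, call `expand_query` on the result, and rank again with the expanded descriptor. A dimensionality reduction step such as PCA could be applied to the VLAD vectors before ranking, but its configuration is not given in the abstract and is left out here.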
URL
https://arxiv.org/abs/1903.09469