Abstract
Hashing is widely used in nearest neighbor retrieval tasks because it offers fast retrieval and efficient storage. Deep learning based hashing can outperform non-learning based hashing in many applications. However, previous learning based hashing methods have some limitations (e.g., the learned hash codes are not discriminative, because the methods fail to discover rich semantic information and the training strategy has difficulty optimizing the discrete binary codes). In this paper, we propose a novel learning based hashing method, named \textbf{\underline{A}}symmetric \textbf{\underline{D}}eep \textbf{\underline{S}}emantic \textbf{\underline{Q}}uantization (\textbf{ADSQ}). \textbf{ADSQ} is implemented as a three-stream framework consisting of one \emph{LabelNet} and two \emph{ImgNets}. The \emph{LabelNet} uses three fully-connected layers to capture rich semantic information between image pairs. The two \emph{ImgNets} adopt the same convolutional neural network structure but have different weights (i.e., they are asymmetric convolutional neural networks), and they are used to generate discriminative compact hash codes. Specifically, the semantic information captured by the \emph{LabelNet} guides the two \emph{ImgNets} in minimizing the gap between the real-valued continuous features and the discrete binary codes. In this way, \textbf{ADSQ} can make full use of the most critical semantic information to guide the feature learning process and consider the consistency between the common semantic space and the Hamming space. Our experiments demonstrate that \textbf{ADSQ} generates highly discriminative compact hash codes and outperforms current state-of-the-art methods on three benchmark datasets: CIFAR-10, NUS-WIDE, and ImageNet.
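The gap between real-valued features and discrete binary codes mentioned in the abstract can be illustrated with a minimal sketch. This is not the ADSQ implementation: the sign-based binarization, the squared-error measure of the gap, the function names, and the example feature values are all assumptions chosen for illustration.

```python
from typing import List, Sequence


def binarize(features: Sequence[float]) -> List[int]:
    """Map real-valued features to a {-1, +1} code with the sign function (assumed)."""
    return [1 if x >= 0 else -1 for x in features]


def quantization_gap(features: Sequence[float]) -> float:
    """Squared distance between the features and their binary code (assumed measure)."""
    code = binarize(features)
    return sum((x - b) ** 2 for x, b in zip(features, code))


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions at which two equal-length codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


# Hypothetical continuous outputs of the two ImgNets: one for a query image,
# one for a database image. The values are made up for illustration.
query_features = [0.9, -0.8, 0.1, -0.95]
database_features = [0.7, -0.6, -0.2, -0.9]

query_code = binarize(query_features)
database_code = binarize(database_features)

print("query code:", query_code)
print("database code:", database_code)
print("Hamming distance:", hamming_distance(query_code, database_code))
print("quantization gap (query):", quantization_gap(query_features))
```

Features that already lie close to -1 or +1 lose little when binarized, while a value near zero (such as 0.1 above) dominates the gap. This is the kind of discrepancy that, according to the abstract, the semantic guidance from the LabelNet helps the two ImgNets reduce.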
URL
https://arxiv.org/abs/1903.12493