
ForestHash: Semantic Hashing With Shallow Random Forests and Tiny Convolutional Networks

2018-07-28 00:06:37
Qiang Qiu, Jose Lezama, Alex Bronstein, Guillermo Sapiro

Abstract

Hash codes are efficient data representations for coping with ever-growing amounts of data. In this paper, we introduce a random forest semantic hashing scheme that embeds tiny convolutional neural networks (CNNs) into shallow random forests, with near-optimal information-theoretic code aggregation among trees. We start with a simple hashing scheme in which the random trees in a forest act as hashing functions, setting '1' for the visited tree leaf and '0' for the rest. We show that traditional random forests fail to generate hashes that preserve the underlying similarity between the trees, rendering the random forests approach to hashing challenging. To address this, we propose to first randomly split the classes arriving at each tree split node into two groups, obtaining a significantly simplified two-class classification problem that can be handled by a light-weight CNN weak learner. This random class grouping scheme enables code uniqueness by forcing each class to share its code with different classes in different trees. A non-conventional low-rank loss is further adopted for the CNN weak learners to encourage code consistency by minimizing intra-class variations and maximizing the inter-class distance between the two random class groups. Finally, we introduce an information-theoretic approach for aggregating the codes of individual trees into a single hash code, producing a near-optimal unique hash for each class. The proposed approach significantly outperforms state-of-the-art hashing methods for image retrieval tasks on large-scale public datasets, and performs on par with other state-of-the-art image classification techniques while using a more compact, efficient, and scalable representation. This work proposes a principled and robust procedure to train and deploy in parallel an ensemble of light-weight CNNs, instead of simply going deeper.
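The leaf-indicator hashing and random class grouping described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it works at the class level only, assumes every split node separates its two random class groups perfectly (the paper trains a CNN weak learner with a low-rank loss for that step), and simply concatenates the per-tree codes instead of applying the information-theoretic aggregation. All function names are hypothetical.

```python
import random


def random_class_grouping(classes, rng):
    """Randomly split the classes arriving at a node into two non-empty groups."""
    shuffled = list(classes)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return set(shuffled[:half]), set(shuffled[half:])


def build_tree_leaves(classes, depth, rng):
    """Grow one shallow tree and return its leaves as a list of class sets.

    Assumption: each split is perfect, so a leaf is fully described by the
    set of classes that reach it.
    """
    nodes = [set(classes)]
    for _ in range(depth):
        next_nodes = []
        for node in nodes:
            if len(node) < 2:
                next_nodes.append(node)  # a single class cannot be split further
            else:
                left, right = random_class_grouping(sorted(node), rng)
                next_nodes.extend([left, right])
        nodes = next_nodes
    return nodes


def tree_code(label, leaves):
    """Leaf-indicator code: 1 for the leaf the class visits, 0 for the rest."""
    return [1 if label in leaf else 0 for leaf in leaves]


def forest_code(label, forest):
    """Concatenate per-tree codes (a simplification of the paper's aggregation)."""
    code = []
    for leaves in forest:
        code.extend(tree_code(label, leaves))
    return code


def hamming(a, b):
    """Number of positions in which two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    rng = random.Random(0)
    # 8 trees of depth 2 over 10 classes: 4 leaves per tree, 32 bits in total.
    forest = [build_tree_leaves(range(10), 2, rng) for _ in range(8)]
    for label in range(10):
        print(label, "".join(map(str, forest_code(label, forest))))
```

Within a single tree, classes that land in the same leaf share a code. Because the grouping is redrawn for every tree, a class tends to share leaves with different classes in different trees, so the concatenated codes of two classes become increasingly likely to differ as trees are added. This toy version does not guarantee uniqueness.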


URL

https://arxiv.org/abs/1711.08364

PDF

https://arxiv.org/pdf/1711.08364.pdf

