The General Pair-based Weighting Loss for Deep Metric Learning

2019-05-30 02:59:26
Haijun Liu, Jian Cheng, Wen Wang, Yanzhou Su

Abstract

Deep metric learning aims to learn a distance metric between pairs of samples, using deep neural networks to extract semantic feature embeddings in which similar samples are close to each other while dissimilar samples are farther apart. A large number of loss functions based on pair distances have been presented in the literature to guide the training of deep metric learning. In this paper, we unify them in a general pair-based weighting loss function, where the objective to be minimized is simply a weighted sum of the distances of informative pairs. The general pair-based weighting loss has two main aspects: (1) sample mining and (2) pair weighting. Sample mining aims at selecting the informative positive and negative pair sets to exploit the structured relationship of samples in a mini-batch and also reduce the number of non-trivial pairs. Pair weighting aims at assigning different weights to different pairs according to the pair distances, so that the network is trained discriminatively. We review the existing pair-based losses in detail, in line with our general loss function, and explore some possible methods from the perspective of sample mining and pair weighting. Finally, extensive experiments on three image retrieval datasets show that our general pair-based weighting loss obtains new state-of-the-art performance, demonstrating the effectiveness of pair-based sample mining and pair weighting for deep metric learning.
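
The two stages named in the abstract, mining informative pairs and then weighting their distances, can be illustrated with a minimal NumPy sketch. It is not the authors' formulation: the margin-based mining rule, the softmax weights, the Euclidean distance, and the names `pair_weighting_loss`, `margin`, and `scale` are all assumptions chosen for illustration.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    z = np.exp(x - x.max())
    return z / z.sum()


def pairwise_distances(emb):
    """Euclidean distance between every pair of rows in emb (shape [n, d])."""
    diff = emb[:, None, :] - emb[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def pair_weighting_loss(emb, labels, margin=0.5, scale=2.0):
    """Illustrative pair-based weighted loss over one mini-batch.

    The mining and weighting rules below are hypothetical stand-ins,
    not the rules proposed in the paper.
    """
    emb = np.asarray(emb, dtype=float)
    labels = np.asarray(labels)
    n = len(labels)
    dist = pairwise_distances(emb)

    same = labels[:, None] == labels[None, :]
    pos_mask = same & ~np.eye(n, dtype=bool)  # same class, excluding self
    neg_mask = ~same                          # different class

    total = 0.0
    for i in range(n):
        pos_d = dist[i][pos_mask[i]]
        neg_d = dist[i][neg_mask[i]]
        if pos_d.size == 0 or neg_d.size == 0:
            continue  # anchor has no positive or no negative in the batch

        # (1) Mining (assumed rule): keep positives that are not clearly
        # closer than the nearest negative, and negatives that are not
        # clearly farther than the farthest positive.
        hard_pos = pos_d[pos_d > neg_d.min() - margin]
        hard_neg = neg_d[neg_d < pos_d.max() + margin]

        # (2) Weighting (assumed rule): softmax weights, so distant
        # positives and nearby negatives dominate the sum.
        if hard_pos.size:
            total += (softmax(scale * hard_pos) * hard_pos).sum()
        if hard_neg.size:
            total -= (softmax(-scale * hard_neg) * hard_neg).sum()

    return total / n


# Two well-separated classes: mining keeps no pairs, so the loss is zero.
separated = pair_weighting_loss(
    [[0.0, 0.0], [0.0, 0.1], [10.0, 10.0], [10.0, 10.1]], [0, 0, 1, 1]
)

# A negative sitting between two positives: both pair types are kept.
mixed = pair_weighting_loss([[0.0, 0.0], [1.0, 0.0], [0.5, 0.0]], [0, 0, 1])
```

Under these assumptions, minimizing the loss pulls the selected positives closer and pushes the selected negatives away; swapping in a different mining rule or weight function changes which existing pair-based loss the sketch resembles.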

URL

https://arxiv.org/abs/1905.12837

PDF

https://arxiv.org/pdf/1905.12837.pdf

