AdaCos: Adaptively Scaling Cosine Logits for Effectively Learning Deep Face Representations

2019-05-01 12:58:05
Xiao Zhang, Rui Zhao, Yu Qiao, Xiaogang Wang, Hongsheng Li

Abstract

Cosine-based softmax losses and their variants have achieved great success in deep-learning-based face recognition. However, the hyperparameter settings in these losses significantly influence both the optimization path and the final recognition performance. Manually tuning those hyperparameters relies heavily on user experience and requires many training tricks. In this paper, we investigate in depth the effects of two important hyperparameters of cosine-based softmax losses, the scale parameter and the angular margin parameter, by analyzing how they modulate the predicted classification probability. Based on this analysis, we propose a novel cosine-based softmax loss, AdaCos, which is hyperparameter-free and leverages an adaptive scale parameter to automatically strengthen the training supervision during the training process. We apply the proposed AdaCos loss to large-scale face verification and identification datasets, including LFW, MegaFace, and IJB-C 1:1 Verification. Our results show that training deep neural networks with the AdaCos loss is stable and achieves high face recognition accuracy. Our method outperforms state-of-the-art softmax losses on all three datasets.
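To make concrete how a scale and an angular margin can modulate the predicted classification probability, here is a minimal Python sketch. It is not the AdaCos loss itself and is not taken from the paper's code. The function name `target_probability`, the ArcFace-style margin form `cos(theta + margin)`, and the simplification that every non-target class sits at an angle of pi/2 from the feature are all assumptions made for illustration.

```python
import math


def target_probability(theta, scale, margin=0.0, num_classes=1000):
    """Softmax probability of the ground-truth class for a single sample.

    theta       -- angle (radians) between the feature and its class weight
    scale       -- scale parameter multiplying every cosine logit
    margin      -- angular margin added to theta (assumed ArcFace-style form)
    num_classes -- total number of classes (hypothetical default)
    """
    target_logit = scale * math.cos(theta + margin)
    # Assumption: all non-target class weights are orthogonal to the feature,
    # so each of the (num_classes - 1) non-target logits is scale * cos(pi/2).
    other_logit = scale * math.cos(math.pi / 2)
    # P = e^t / (e^t + (C - 1) * e^o), rearranged to avoid large exponentials.
    return 1.0 / (1.0 + (num_classes - 1) * math.exp(other_logit - target_logit))


if __name__ == "__main__":
    theta = math.pi / 3  # a sample 60 degrees away from its class center
    for scale in (10.0, 30.0, 64.0):
        for margin in (0.0, 0.5):
            p = target_probability(theta, scale, margin)
            print(f"scale={scale:5.1f} margin={margin:.1f} -> P={p:.4f}")
```

Under these assumptions, a larger scale pushes the probability for the same angle closer to 1, while a larger margin lowers it, so the two hyperparameters pull the training signal in opposite directions.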

URL

https://arxiv.org/abs/1905.00292

PDF

https://arxiv.org/pdf/1905.00292.pdf

