Learning to Rank Normalized Entropy Curves with Differentiable Window Transformation

2023-01-25 07:36:26
Hanyang Liu, Shuai Yang, Feng Qi, Shuaiwen Wang

Abstract

Recent automated machine learning systems often use learning curve ranking models to decide when to stop unpromising trials and to identify better model configurations. In this paper, we present a novel learning curve ranking model specifically tailored for ranking normalized entropy (NE) learning curves, which are commonly used in online advertising and recommendation systems. Our proposed model, self-Adaptive Curve Transformation augmented Relative curve Ranking (ACTR2), features an adaptive curve transformation layer that transforms raw lifetime NE curves into composite window NE curves, with the window sizes adaptively optimized based on both the position on the learning curve and the curve's dynamics. We also introduce a novel differentiable indexing method for the proposed adaptive curve transformation, which allows gradients with respect to the discrete indices to flow through the curve transformation layer, so that the learned window sizes can be updated flexibly during training. Additionally, we propose a pairwise curve ranking architecture that directly models the difference between two learning curves and is better at capturing subtle changes in relative performance that may not be evident when each curve is modeled individually, as existing approaches do. Our extensive experiments on a real-world NE curve dataset demonstrate the effectiveness of each key component of ACTR2 and its improved performance over the state of the art.
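
The abstract does not spell out how the window transformation or the differentiable indexing is computed, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that the lifetime NE at step k is a running mean of per-step NE values (equal example counts per step, constant normalizer), and it swaps in plain linear interpolation as one simple way to make a window size continuous. The function names cumulative_from_lifetime, interp, and window_ne are hypothetical.

```python
import math


def cumulative_from_lifetime(lifetime):
    """Turn a lifetime curve into running sums.

    Assumption: lifetime[k-1] is the mean of the per-step values over the
    first k steps, so k * lifetime[k-1] is the running sum after k steps.
    Index 0 of the result is the empty sum.
    """
    return [0.0] + [(k + 1) * v for k, v in enumerate(lifetime)]


def interp(values, pos):
    """Linearly interpolate `values` at a fractional index `pos`."""
    pos = min(max(pos, 0.0), len(values) - 1.0)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(values) - 1)
    frac = pos - lo
    return (1.0 - frac) * values[lo] + frac * values[hi]


def window_ne(cumulative, t, w):
    """Mean per-step value over the last `w` steps ending at step `t`.

    `w` may be fractional. The window start t - w is read by interpolation,
    so the result changes smoothly with `w` instead of jumping between
    integer window sizes.
    """
    w = min(max(w, 1e-6), float(t))
    return (cumulative[t] - interp(cumulative, t - w)) / w


# Hypothetical lifetime curve whose per-step values are 1.0, 0.8, 0.6, 0.4.
lifetime = [1.0, 0.9, 0.8, 0.7]
cumulative = cumulative_from_lifetime(lifetime)

# A finite-difference slope with respect to the window size: a nonzero value
# means a gradient-based optimizer would get a signal for adjusting w.
eps = 1e-4
slope = (window_ne(cumulative, 4, 1.5 + eps) - window_ne(cumulative, 4, 1.5 - eps)) / (2 * eps)
print(window_ne(cumulative, 4, 2.0), window_ne(cumulative, 4, 1.5), slope)
```

With an integer window size the sketch reproduces the ordinary window average, and with a window equal to the full curve length it reproduces the lifetime value. In an automatic differentiation framework the same interpolation would let gradients reach the window size directly, which is the property the abstract attributes to its differentiable indexing.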

URL

https://arxiv.org/abs/2301.10443

PDF

https://arxiv.org/pdf/2301.10443.pdf

