NeSHFS: Neighborhood Search with Heuristic-based Feature Selection for Click-Through Rate Prediction

2024-09-13 10:43:18
Dogukan Aksu, Ismail Hakki Toroslu, Hasan Davulcu

Abstract

Click-through rate (CTR) prediction plays an important role in online advertising and ad recommender systems. Over the past decade, maximizing CTR has been the main focus of model development, and researchers and practitioners have proposed various models and solutions to make CTR prediction more effective. Most of the existing literature focuses on capturing either implicit or explicit feature interactions. Although some studies capture implicit interactions successfully, explicit interactions remain a challenge: achieving high CTR requires extracting both low-order and high-order feature interactions. Unnecessary and irrelevant features can increase computation time and lower prediction performance. Furthermore, certain features may perform well with specific predictive models while underperforming with others, and feature distributions may fluctuate due to traffic variations. Most importantly, in live production environments, resources are limited, and inference time is just as crucial as training time. For all these reasons, feature selection is one of the most important factors in improving CTR prediction model performance. Simple filter-based feature selection algorithms do not perform well enough to meet this need. An effective and efficient feature selection algorithm is needed to consistently filter the most useful features during the live CTR prediction process. In this paper, we propose a heuristic algorithm named Neighborhood Search with Heuristic-based Feature Selection (NeSHFS) to enhance CTR prediction performance while reducing dimensionality and training time costs. We conduct comprehensive experiments on three public datasets to validate the efficiency and effectiveness of our proposed solution.
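
The abstract does not spell out the NeSHFS procedure, so the following is only a minimal sketch of a generic neighborhood (local) search over feature subsets, not the authors' algorithm. It assumes a neighbor is any subset that differs by one feature, and it uses a hypothetical `toy_score` function in place of what would normally be a validation metric from a trained CTR model. The feature names are invented for illustration.

```python
from typing import Callable, FrozenSet, Iterator, Sequence, Tuple

Subset = FrozenSet[str]


def neighbors(subset: Subset, all_features: Sequence[str]) -> Iterator[Subset]:
    """Yield every non-empty subset that differs from `subset` by one feature."""
    for feature in all_features:
        candidate = subset - {feature} if feature in subset else subset | {feature}
        if candidate:  # never evaluate an empty feature set
            yield candidate


def neighborhood_search(
    all_features: Sequence[str],
    score_fn: Callable[[Subset], float],
    max_iters: int = 50,
) -> Tuple[Subset, float]:
    """Greedy local search: move to the best-scoring neighbor until none improves."""
    current = frozenset(all_features)  # assumption: start from the full feature set
    current_score = score_fn(current)
    for _ in range(max_iters):
        best, best_score = current, current_score
        for candidate in neighbors(current, all_features):
            candidate_score = score_fn(candidate)
            if candidate_score > best_score:
                best, best_score = candidate, candidate_score
        if best == current:  # local optimum reached
            break
        current, current_score = best, best_score
    return current, current_score


def toy_score(subset: Subset) -> float:
    """Hypothetical stand-in for a validation metric (e.g. AUC of a CTR model).

    Rewards three 'useful' features and penalizes every other feature.
    """
    useful = {"user_id", "ad_id", "hour"}
    return len(subset & useful) - 0.1 * len(subset - useful)


# Hypothetical feature names, chosen only for this example.
FEATURES = ["user_id", "ad_id", "hour", "noise_a", "noise_b"]
selected, score = neighborhood_search(FEATURES, toy_score)
print(sorted(selected), score)
```

In a real setting each call to `score_fn` would train or evaluate a model, which is why the number of evaluated neighbors matters for the training-time costs the abstract mentions.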

URL

https://arxiv.org/abs/2409.08703

PDF

https://arxiv.org/pdf/2409.08703.pdf
