On Predicting Post-Click Conversion Rate via Counterfactual Inference

2025-10-06 13:57:49
Junhyung Ahn, Sanghack Lee

Abstract

Accurately predicting conversion rate (CVR) is essential in various recommendation domains such as online advertising systems and e-commerce. These systems utilize user interaction logs, which consist of exposures, clicks, and conversions. CVR prediction models are typically trained solely based on clicked samples, as conversions can only be determined following clicks. However, the sparsity of clicked instances necessitates the collection of a substantial amount of logs for effective model training. Recent works address this issue by devising frameworks that leverage non-clicked samples. While these frameworks aim to reduce biases caused by the discrepancy between clicked and non-clicked samples, they often rely on heuristics. Against this background, we propose a method to counterfactually generate conversion labels for non-clicked samples by using causality as a guiding principle, attempting to answer the question, "Would the user have converted if he or she had clicked the recommended item?" Our approach is named the Entire Space Counterfactual Inference Multi-task Model (ESCIM). We initially train a structural causal model (SCM) of user sequential behaviors and conduct a hypothetical intervention (i.e., click) on non-clicked items to infer counterfactual CVRs. We then introduce several approaches to transform predicted counterfactual CVRs into binary counterfactual conversion labels for the non-clicked samples. Finally, the generated samples are incorporated into the training process. Extensive experiments on public datasets illustrate the superiority of the proposed algorithm. Online A/B testing further empirically validates the effectiveness of our proposed algorithm in real-world scenarios. In addition, we demonstrate the improved performance of the proposed method on latent conversion data, showcasing its robustness and superior generalization capabilities.
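
The abstract does not say which approaches are used to turn predicted counterfactual CVRs into binary labels. As a minimal sketch only, the following assumes two common options (a fixed threshold and Bernoulli sampling) and shows how the generated labels could be merged with the clicked samples. All function names, the 0.5 threshold, and the data layout are hypothetical illustrations, not the authors' implementation.

```python
import random
from typing import List, Sequence, Tuple

Sample = Tuple[dict, int]  # (features, conversion label); layout is an assumption


def labels_by_threshold(cvrs: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Assumed option 1: label 1 when the counterfactual CVR reaches the threshold."""
    return [1 if p >= threshold else 0 for p in cvrs]


def labels_by_sampling(cvrs: Sequence[float], seed: int = 0) -> List[int]:
    """Assumed option 2: draw each label from Bernoulli(p) with a seeded generator."""
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for p in cvrs]


def build_training_set(
    clicked: Sequence[Sample],
    non_clicked_features: Sequence[dict],
    counterfactual_cvrs: Sequence[float],
    threshold: float = 0.5,
) -> List[Sample]:
    """Append non-clicked samples, with generated labels, to the clicked samples."""
    if len(non_clicked_features) != len(counterfactual_cvrs):
        raise ValueError("one counterfactual CVR is required per non-clicked sample")
    generated = list(
        zip(non_clicked_features, labels_by_threshold(counterfactual_cvrs, threshold))
    )
    return list(clicked) + generated


if __name__ == "__main__":
    clicked = [({"item": "a"}, 1), ({"item": "b"}, 0)]  # observed conversion labels
    non_clicked = [{"item": "c"}, {"item": "d"}]
    cvrs = [0.8, 0.1]  # stand-ins for counterfactual CVRs inferred by the SCM
    for features, label in build_training_set(clicked, non_clicked, cvrs):
        print(features, label)
```

Thresholding is deterministic but discards the uncertainty in each prediction, whereas sampling keeps the expected label equal to the predicted CVR at the cost of label noise; which trade-off the paper prefers is not stated in the abstract.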

URL

https://arxiv.org/abs/2510.04816

PDF

https://arxiv.org/pdf/2510.04816.pdf

