Contrastive Learning Method for Sequential Recommendation based on Multi-Intention Disentanglement

2024-04-28 15:13:36
Zeyu Hu, Yuzhi Xiao, Tao Huang, Xuanrong Huo

Abstract

Sequential recommendation is an important branch of recommender systems that aims to deliver personalized future item recommendations by analyzing and predicting users' ordered historical interaction behaviors. However, as the user base grows and behavioral information becomes increasingly rich, effectively understanding and disentangling a user's multiple interaction intentions poses challenges for behavior prediction and sequential recommendation. In light of these challenges, we propose a Contrastive Learning sequential recommendation method based on Multi-Intention Disentanglement (MIDCL). In our work, intentions are treated as dynamic and diverse, and user behavior is often driven by multiple current intentions, which means the model must not only mine the most relevant implicit intention for each user but also weaken the influence of irrelevant intentions. We therefore adopt a Variational Auto-Encoder (VAE) to disentangle users' multiple intentions, and we propose two contrastive learning paradigms: one for finding the user's most relevant interaction intention, and one for maximizing the mutual information of positive sample pairs. Experimental results show that MIDCL not only significantly outperforms most existing baseline methods, but also provides a more interpretable case for research on intention-based prediction and recommendation.
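
The abstract names two standard building blocks without giving code: a VAE that factorizes a user's sequence representation into multiple latent intentions, and a contrastive objective that maximizes the mutual information of positive pairs (typically estimated with an InfoNCE loss). The PyTorch sketch below is a minimal illustration of those two components under that reading, not the authors' implementation; all names (IntentionVAE, info_nce, num_intentions, and so on) are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class IntentionVAE(nn.Module):
    """Toy VAE encoder mapping a sequence representation to K latent intention factors."""
    def __init__(self, hidden_dim: int, num_intentions: int, intent_dim: int):
        super().__init__()
        self.num_intentions = num_intentions
        self.intent_dim = intent_dim
        # One mean / log-variance head shared across all K intention factors.
        self.mu_head = nn.Linear(hidden_dim, num_intentions * intent_dim)
        self.logvar_head = nn.Linear(hidden_dim, num_intentions * intent_dim)

    def forward(self, seq_repr: torch.Tensor):
        # seq_repr: (batch, hidden_dim), e.g. the output of a sequence encoder.
        mu = self.mu_head(seq_repr).view(-1, self.num_intentions, self.intent_dim)
        logvar = self.logvar_head(seq_repr).view(-1, self.num_intentions, self.intent_dim)
        std = torch.exp(0.5 * logvar)
        z = mu + std * torch.randn_like(std)  # reparameterization trick
        # KL divergence to the standard-normal prior, averaged for a scalar regularizer.
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return z, kl  # z: (batch, K, intent_dim)

def info_nce(anchor: torch.Tensor, positive: torch.Tensor, temperature: float = 0.1):
    """InfoNCE loss, the usual lower-bound estimator for the mutual information of
    positive pairs; other samples in the batch serve as in-batch negatives."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    logits = anchor @ positive.t() / temperature  # (batch, batch) similarity matrix
    labels = torch.arange(anchor.size(0), device=anchor.device)
    return F.cross_entropy(logits, labels)  # diagonal entries are the positive pairs
```

In such a setup, `anchor` and `positive` would be representations of two augmented views of the same interaction sequence; the KL term keeps the K intention factors close to the prior, which is what makes the disentangled factors usable downstream.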

URL

https://arxiv.org/abs/2404.18214

PDF

https://arxiv.org/pdf/2404.18214.pdf

