Collaborative Filtering Based on Diffusion Models: Unveiling the Potential of High-Order Connectivity

2024-04-22 14:49:46
Yu Hou, Jin-Duk Park, Won-Yong Shin

Abstract

A recent study has shown that diffusion models are well-suited for modeling the generative process of user-item interactions in recommender systems due to their denoising nature. However, existing diffusion model-based recommender systems do not explicitly leverage high-order connectivities that contain crucial collaborative signals for accurate recommendations. Addressing this gap, we propose CF-Diff, a new diffusion model-based collaborative filtering (CF) method, which is capable of making full use of collaborative signals along with multi-hop neighbors. Specifically, the forward-diffusion process adds random noise to user-item interactions, while the reverse-denoising process accommodates our own learning model, named cross-attention-guided multi-hop autoencoder (CAM-AE), to gradually recover the original user-item interactions. CAM-AE consists of two core modules: 1) the attention-aided AE module, responsible for precisely learning latent representations of user-item interactions while keeping the model's complexity at manageable levels, and 2) the multi-hop cross-attention module, which judiciously harnesses high-order connectivity information to capture enhanced collaborative signals. Through comprehensive experiments on three real-world datasets, we demonstrate that CF-Diff is (a) Superior: outperforming benchmark recommendation methods, with gains of up to 7.29% over the best competitor, (b) Theoretically-validated: reducing computations while ensuring that the embeddings generated by our model closely approximate those from the original cross-attention, and (c) Scalable: exhibiting computational complexity that scales linearly with the number of users or items.
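
The two ingredients named in the abstract, forward noising of user-item interactions and multi-hop neighbors, can be illustrated with a minimal sketch. This is not the authors' implementation: the linear noise schedule, the closed-form Gaussian sampling step (the standard one for diffusion models), the definition of a two-hop item, and all function names (`alpha_bar`, `forward_noise`, `two_hop_items`) are assumptions made for illustration only.

```python
import math
import random


def alpha_bar(t, num_steps, beta_min=1e-4, beta_max=0.02):
    """Cumulative product of (1 - beta_s) for s = 1..t.

    Assumes a linear beta schedule; the abstract does not specify one.
    """
    prod = 1.0
    for s in range(1, t + 1):
        beta = beta_min + (beta_max - beta_min) * (s - 1) / max(num_steps - 1, 1)
        prod *= 1.0 - beta
    return prod


def forward_noise(x0, t, num_steps, rng):
    """Sample x_t = sqrt(a_bar) * x0 + sqrt(1 - a_bar) * eps, with eps ~ N(0, I).

    x0 is one user's interaction vector (1.0 = interacted, 0.0 = not).
    """
    a_bar = alpha_bar(t, num_steps)
    signal, noise = math.sqrt(a_bar), math.sqrt(1.0 - a_bar)
    return [signal * x + noise * rng.gauss(0.0, 1.0) for x in x0]


def two_hop_items(user, interactions):
    """Items reached via user -> item -> co-interacting user -> item.

    Hypothetical notion of a higher-order neighbor: items that the user has
    not interacted with but that users sharing at least one item have.
    """
    own = interactions[user]
    reached = set()
    for other, items in interactions.items():
        if other != user and own & items:
            reached |= items - own
    return reached


# Usage: noise one interaction vector halfway through a 100-step schedule.
rng = random.Random(0)
x_t = forward_noise([1.0, 0.0, 1.0, 0.0], t=50, num_steps=100, rng=rng)

# Usage: u1 and u2 share item "b", so "c" is a two-hop item for u1.
toy = {"u1": {"a", "b"}, "u2": {"b", "c"}, "u3": {"d"}}
neighbors = two_hop_items("u1", toy)
```

In this sketch, larger `t` shrinks the signal term and grows the noise term, which is the corruption a reverse-denoising model would be trained to undo; the two-hop set is the kind of high-order signal a model could attend to in addition to the user's direct interactions.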

URL

https://arxiv.org/abs/2404.14240

PDF

https://arxiv.org/pdf/2404.14240.pdf

