Paper Reading AI Learner

TEARS: Textual Representations for Scrutable Recommendations

2024-10-25 04:26:00
Emiliano Penaloza, Olivier Gouvert, Haolun Wu, Laurent Charlin

Abstract

Traditional recommender systems rely on high-dimensional (latent) embeddings to model user-item interactions, often resulting in opaque representations that lack interpretability. Moreover, these systems offer users limited control over their recommendations. Inspired by recent work, we introduce TExtuAl Representations for Scrutable recommendations (TEARS) to address these challenges. Instead of representing a user's interests through a latent embedding, TEARS encodes them in natural text, providing transparency and allowing users to edit them. To do so, TEARS uses a modern LLM to generate user summaries based on user preferences. We find that these summaries capture user preferences uniquely. Using these summaries, we take a hybrid approach, using an optimal transport procedure to align the summaries' representation with the learned representation of a standard VAE for collaborative filtering. We find this approach can surpass the performance of three popular VAE models while providing user-controllable recommendations. We also analyze the controllability of TEARS through three simulated user tasks that evaluate the effectiveness of a user editing their summary.
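The abstract names the ingredients (LLM-generated user summaries, a standard VAE for collaborative filtering, an optimal-transport alignment between the two representations) but not the implementation, so the following is a minimal PyTorch sketch of one plausible reading, not the paper's actual architecture. It projects a summary embedding into the VAE's latent space, aligns the two batches of latents with an entropic optimal-transport (Sinkhorn) cost, and decodes a convex combination of the two as the hybrid user representation. `HybridRecommender`, `sinkhorn_cost`, all layer sizes, and the mixing weight `alpha` are hypothetical.

```python
import torch
import torch.nn as nn

def sinkhorn_cost(cost, eps=0.1, iters=50):
    """Entropic-regularized OT cost between two uniform empirical measures,
    computed with standard Sinkhorn iterations on the Gibbs kernel."""
    cost = cost / cost.max().clamp(min=1e-8)   # rescale for numerical stability
    n, m = cost.shape
    a = torch.full((n,), 1.0 / n)              # uniform source marginal
    b = torch.full((m,), 1.0 / m)              # uniform target marginal
    K = torch.exp(-cost / eps)                 # Gibbs kernel
    u, v = torch.ones(n), torch.ones(m)
    for _ in range(iters):
        u = a / (K @ v)
        v = b / (K.T @ u)
    plan = u[:, None] * K * v[None, :]         # transport plan diag(u) K diag(v)
    return (plan * cost).sum()                 # cost of moving one batch onto the other

class HybridRecommender(nn.Module):
    """Hypothetical TEARS-style hybrid: text latent + VAE latent, decoded jointly."""
    def __init__(self, n_items, text_dim=384, latent_dim=64):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, latent_dim)   # summary embedding -> latent
        self.vae_enc = nn.Linear(n_items, 2 * latent_dim)  # interactions -> (mu, logvar)
        self.decoder = nn.Linear(latent_dim, n_items)      # latent -> item scores

    def forward(self, summary_emb, interactions, alpha=0.5):
        z_text = self.text_proj(summary_emb)
        mu, logvar = self.vae_enc(interactions).chunk(2, dim=-1)
        z_vae = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization
        z = alpha * z_text + (1.0 - alpha) * z_vae         # hybrid user representation
        return self.decoder(z), z_text, z_vae

# Toy training step: reconstruction loss plus OT alignment of the two latent batches.
model = HybridRecommender(n_items=1000)
summaries = torch.randn(32, 384)                          # stand-in LLM summary embeddings
clicks = torch.bernoulli(torch.full((32, 1000), 0.05))    # stand-in implicit feedback
scores, z_text, z_vae = model(summaries, clicks)
align = sinkhorn_cost(torch.cdist(z_text, z_vae) ** 2)    # squared-distance cost matrix
recon = nn.functional.binary_cross_entropy_with_logits(scores, clicks)
(recon + 0.1 * align).backward()                          # 0.1 is an arbitrary weight
```

Under this reading, a user edit would regenerate the summary embedding and shift `alpha` toward the text latent, steering recommendations through the edited text while the Sinkhorn term keeps the two latent spaces interchangeable.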

URL

https://arxiv.org/abs/2410.19302

PDF

https://arxiv.org/pdf/2410.19302.pdf

