SimUSER: Simulating User Behavior with Large Language Models for Recommender System Evaluation

2025-04-17 07:57:23
Nicolas Bougie, Narimasa Watanabe

Abstract

Recommender systems play a central role in numerous real-life applications, yet evaluating their performance remains a significant challenge due to the gap between offline metrics and online behaviors. Given the scarcity and limits (e.g., privacy issues) of real user data, we introduce SimUSER, an agent framework that serves as believable and cost-effective human proxies. SimUSER first identifies self-consistent personas from historical data, enriching user profiles with unique backgrounds and personalities. Then, central to this evaluation are users equipped with persona, memory, perception, and brain modules, engaging in interactions with the recommender system. SimUSER exhibits closer alignment with genuine humans than prior work, both at micro and macro levels. Additionally, we conduct insightful experiments to explore the effects of thumbnails on click rates, the exposure effect, and the impact of reviews on user engagement. Finally, we refine recommender system parameters based on offline A/B test results, resulting in improved user engagement in the real world.

URL

https://arxiv.org/abs/2504.12722

PDF

https://arxiv.org/pdf/2504.12722.pdf

