DailyDilemmas: Revealing Value Preferences of LLMs with Quandaries of Daily Life

2024-10-03 17:08:52
Yu Ying Chiu, Liwei Jiang, Yejin Choi

Abstract

As we increasingly seek guidance from LLMs for decision-making in daily life, many of these decisions are not clear-cut and depend significantly on the personal values and ethical standards of the users. We present DailyDilemmas, a dataset of 1,360 moral dilemmas encountered in everyday life. Each dilemma includes two possible actions and, for each action, the affected parties and the human values invoked. Based on these dilemmas, we consolidated a set of human values across everyday topics, e.g., interpersonal relationships, workplace, and environmental issues. We evaluated LLMs on these dilemmas to determine what action they will take and the values represented by these actions. Then, we analyzed these values through the lens of five popular theories inspired by sociology, psychology and philosophy. These theories are: World Values Survey, Moral Foundations Theory, Maslow's Hierarchy of Needs, Aristotle's Virtues, and Plutchik's Wheel of Emotions. We find that LLMs are most aligned with self-expression over survival values in terms of the World Values Survey, and with care over loyalty in Moral Foundations Theory. Interestingly, we find large preference differences across models for some core values such as truthfulness: for example, the Mixtral-8x7B model tends to neglect it by 9.7%, while the GPT-4-turbo model tends to select it by 9.4%. We also study the recent guidance released by OpenAI (ModelSpec) and Anthropic (Constitutional AI) to understand how their released principles reflect their actual value prioritization when facing nuanced moral reasoning in daily-life settings. We find that end users cannot effectively steer such prioritization using system prompts.
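
To make the dataset description concrete, here is a minimal Python sketch of how one dilemma record and a per-value preference tally could look. It is an illustration only: the class names, field names, and the value_preferences helper are hypothetical, and the score used here (times a value was selected minus times it was neglected, divided by the number of dilemmas) is an assumed stand-in, not necessarily the metric behind the percentages reported in the abstract.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    # All field names are hypothetical; the abstract only says each action
    # comes with affected parties and the human values it invokes.
    description: str
    affected_parties: tuple[str, ...]
    values: tuple[str, ...]


@dataclass(frozen=True)
class Dilemma:
    situation: str
    actions: tuple[Action, Action]  # exactly two possible actions


def value_preferences(dilemmas, choices):
    """Score each value as (selected - neglected) / number of dilemmas.

    choices[i] is 0 or 1, the index of the action a model picked for
    dilemmas[i]. This scoring rule is an assumption made for illustration.
    """
    if not dilemmas:
        return {}
    selected, neglected = Counter(), Counter()
    for dilemma, choice in zip(dilemmas, choices):
        chosen = dilemma.actions[choice]
        rejected = dilemma.actions[1 - choice]
        # Count each value at most once per action.
        selected.update(set(chosen.values))
        neglected.update(set(rejected.values))
    total = len(dilemmas)
    all_values = selected.keys() | neglected.keys()
    return {v: (selected[v] - neglected[v]) / total for v in all_values}
```

Under this assumed score, a positive number means the model's chosen actions invoke that value more often than the actions it passed over, and a negative number means the opposite.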

URL

https://arxiv.org/abs/2410.02683

PDF

https://arxiv.org/pdf/2410.02683.pdf

