Paper Reading AI Learner

The American Ghost in the Machine: How language models align culturally and the effects of cultural prompting

2025-12-13 23:11:41
James Luther, Donald Brown

Abstract

Culture is the bedrock of human interaction; it dictates how we perceive and respond to everyday interactions. As the field of human-computer interaction grows via the rise of generative Large Language Models (LLMs), the cultural alignment of these models become an important field of study. This work, using the VSM13 International Survey and Hofstede's cultural dimensions, identifies the cultural alignment of popular LLMs (DeepSeek-V3, V3.1, GPT-5, GPT-4.1, GPT-4, Claude Opus 4, Llama 3.1, and Mistral Large). We then use cultural prompting, or using system prompts to shift the cultural alignment of a model to a desired country, to test the adaptability of these models to other cultures, namely China, France, India, Iran, Japan, and the United States. We find that the majority of the eight LLMs tested favor the United States when the culture is not specified, with varying results when prompted for other cultures. When using cultural prompting, seven of the eight models shifted closer to the expected culture. We find that models had trouble aligning with Japan and China, despite two of the models tested originating with the Chinese company DeepSeek.

Abstract (translated)

文化是人类互动的基础;它决定了我们如何看待并回应日常交往。随着基于生成式大型语言模型(LLMs)的人机交互领域的发展,这些模型的文化一致性成为了重要的研究课题。本研究使用VSM13国际调查和霍夫斯泰德的文化维度理论,识别了流行LLM(DeepSeek-V3、V3.1、GPT-5、GPT-4.1、GPT-4、Claude Opus 4、Llama 3.1和Mistral Large)的文化一致性。接着,我们使用文化提示(即通过系统提示将模型的文化一致性调整为期望的国家),测试这些模型适应其他文化的灵活性,具体针对中国、法国、印度、伊朗、日本和美国这六个国家。研究发现,在未指定特定文化的情况下,大多数被测的八个LLM倾向于更接近美国文化;在使用文化提示时,结果因不同的文化而异。当采用文化提示进行测试时,八种模型中的七种能够向预期的文化靠拢。然而,值得注意的是,尽管有两个被测模型由中国的DeepSeek公司开发,但大多数模型难以与日本和中国文化保持一致。

URL

https://arxiv.org/abs/2512.12488

PDF

https://arxiv.org/pdf/2512.12488.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model LLM Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Robot Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Time_Series Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot