Towards the Scalable Evaluation of Cooperativeness in Language Models

2023-03-16 15:34:23
Alan Chan, Maxime Riché, Jesse Clifton

Abstract

It is likely that AI systems driven by pre-trained language models (PLMs) will increasingly be used to assist humans in high-stakes interactions with other agents, such as negotiation or conflict resolution. Consistent with the goals of Cooperative AI (Dafoe et al., 2020), we wish to understand and shape the multi-agent behaviours of PLMs in a pro-social manner. An important first step is the evaluation of model behaviour across diverse cooperation problems. Since desired behaviour in an interaction depends upon precise game-theoretic structure, we focus on generating scenarios with particular structures, using both crowdworkers and a language model. Our work proceeds as follows. First, we discuss key methodological issues in the generation of scenarios corresponding to particular game-theoretic structures. Second, we employ both crowdworkers and a language model to generate such scenarios. We find that the quality of generations tends to be mediocre in both cases. We additionally get both crowdworkers and a language model to judge whether given scenarios align with their intended game-theoretic structure, finding mixed results depending on the game. Third, we provide a dataset of scenarios based on our generated data. We provide both quantitative and qualitative evaluations of UnifiedQA and GPT-3 on this dataset. We find that instruct-tuned models tend to act in a way that could be perceived as cooperative when scaled up, while other models seem to have flat scaling trends.

URL

https://arxiv.org/abs/2303.13360

PDF

https://arxiv.org/pdf/2303.13360.pdf

