
On Memorization of Large Language Models in Logical Reasoning

2024-10-30 15:31:54
Chulin Xie, Yangsibo Huang, Chiyuan Zhang, Da Yu, Xinyun Chen, Bill Yuchen Lin, Bo Li, Badih Ghazi, Ravi Kumar

Abstract

Large language models (LLMs) achieve good performance on challenging reasoning benchmarks, yet can also make basic reasoning mistakes. This contrasting behavior is puzzling when it comes to understanding the mechanisms behind LLMs' reasoning capabilities. One hypothesis is that the increasingly high, nearly saturated performance on common reasoning benchmarks could be due to memorization of similar problems. In this paper, we systematically investigate this hypothesis with a quantitative measurement of memorization in reasoning tasks, using a dynamically generated logical reasoning benchmark based on Knights and Knaves (K&K) puzzles. We find that LLMs can interpolate the training puzzles (achieving near-perfect accuracy) after fine-tuning, yet fail when those puzzles are slightly perturbed, suggesting that the models rely heavily on memorization to solve the training puzzles. On the other hand, we show that while fine-tuning leads to heavy memorization, it also consistently improves generalization performance. In-depth analyses with perturbation tests, cross-difficulty-level transferability, probing of model internals, and fine-tuning with wrong answers suggest that LLMs learn to reason on K&K puzzles despite memorizing the training data. This phenomenon indicates that LLMs exhibit a complex interplay between memorization and genuine reasoning abilities. Finally, our analysis with per-sample memorization scores sheds light on how LLMs switch between reasoning and memorization when solving logical puzzles. Our code and data are available at this https URL.
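To make the benchmark setup concrete, below is a minimal sketch of how Knights-and-Knaves puzzles can be dynamically generated, solved by brute force, and locally perturbed. This is not the authors' released code; the puzzle grammar (statements of the form "person i says person j is a knight/knave"), the function names, and the choice of perturbation (flipping one claim) are illustrative assumptions.

```python
# Minimal sketch of a Knights-and-Knaves (K&K) puzzle generator,
# brute-force solver, and statement-level perturbation.
# Illustrative only: not the paper's benchmark code.
import itertools
import random

def make_puzzle(n_people=3, seed=0):
    """Generate statements (speaker, target, claim), where
    claim=True means 'target is a knight'."""
    rng = random.Random(seed)
    return [(i, rng.randrange(n_people), rng.choice([True, False]))
            for i in range(n_people)]

def solve(statements, n_people):
    """Enumerate all 2^n knight/knave assignments (True = knight)
    and keep the consistent ones: knights speak truth, knaves lie."""
    solutions = []
    for assignment in itertools.product([True, False], repeat=n_people):
        consistent = True
        for speaker, target, claim in statements:
            statement_true = (assignment[target] == claim)
            # A knight's statement must be true; a knave's must be false.
            if assignment[speaker] != statement_true:
                consistent = False
                break
        if consistent:
            solutions.append(assignment)
    return solutions

def perturb(statements, seed=0):
    """A local perturbation: flip the claim in one randomly chosen
    statement, yielding a slightly different puzzle."""
    rng = random.Random(seed)
    out = list(statements)
    k = rng.randrange(len(out))
    speaker, target, claim = out[k]
    out[k] = (speaker, target, not claim)
    return out

if __name__ == "__main__":
    puzzle = make_puzzle(n_people=3, seed=42)
    print("puzzle:", puzzle)
    print("solutions:", solve(puzzle, 3))          # benchmark would keep unique-solution puzzles
    print("perturbed:", solve(perturb(puzzle), 3))  # answer may change under perturbation
```

A benchmark built this way would filter for puzzles with exactly one consistent assignment, so each sample has a unique ground-truth answer. The paper's per-sample memorization score builds on such perturbations: roughly, a puzzle that the model solves in its original form but fails under slight perturbation is evidence of memorization rather than reasoning.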


URL

https://arxiv.org/abs/2410.23123

PDF

https://arxiv.org/pdf/2410.23123.pdf

