Paper Reading AI Learner

Retrieval Head Mechanistically Explains Long-Context Factuality

2024-04-24 00:24:03
Wenhao Wu, Yizhong Wang, Guangxuan Xiao, Hao Peng, Yao Fu

Abstract

Despite the recent progress in long-context language models, it remains elusive how transformer-based models retrieve relevant information from arbitrary locations within a long context. This paper aims to address this question. Our systematic investigation across a wide spectrum of models reveals that a special type of attention head is largely responsible for retrieving information; we dub these retrieval heads. We identify intriguing properties of retrieval heads: (1) universal: all the explored models with long-context capability have a set of retrieval heads; (2) sparse: only a small portion (less than 5%) of the attention heads are retrieval heads; (3) intrinsic: retrieval heads already exist in models pretrained with short context, and when the context length is extended by continual pretraining, it is still the same set of heads that performs information retrieval; (4) dynamically activated: taking Llama-2 7B as an example, 12 retrieval heads always attend to the required information no matter how the context is changed, while the remaining retrieval heads are activated in different contexts; (5) causal: completely pruning retrieval heads leads to failure in retrieving relevant information and results in hallucination, while pruning random non-retrieval heads does not affect the model's retrieval ability. We further show that retrieval heads strongly influence chain-of-thought (CoT) reasoning, where the model needs to frequently refer back to the question and previously generated context. Conversely, tasks where the model directly generates the answer from its intrinsic knowledge are less impacted by masking out retrieval heads. These observations collectively explain which internal parts of the model seek information from the input tokens. We believe our insights will foster future research on reducing hallucination, improving reasoning, and compressing the KV cache.
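
To make the idea concrete, below is a minimal sketch of the kind of probe the abstract describes: bury a "needle" sentence in a long filler context, teacher-force the model to copy it, and score each attention head by how often its strongest attention lands inside the needle span while the copy is being processed. This is not the authors' released code; the model name, the needle/haystack text, the Hugging Face usage, and the exact scoring rule are illustrative assumptions.

```python
# Sketch of a Needle-in-a-Haystack style probe for candidate retrieval heads.
# Assumptions: a Hugging Face causal LM with attention outputs, and a scoring
# rule approximating the "copy-paste" behaviour described in the abstract.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # any causal LM should work; swap in a smaller one to test
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    attn_implementation="eager",  # so output_attentions returns real attention weights
)
model.eval()

needle = "The secret passcode is 71432."
haystack = "Paris is the capital of France. " * 50          # filler context
question = "\nQuestion: What is the secret passcode?\nAnswer: " + needle

# Bury the needle in the middle of the haystack, then teacher-force the answer copy.
half = len(haystack) // 2
prompt = haystack[:half] + needle + " " + haystack[half:] + question
ids = tok(prompt, return_tensors="pt").input_ids

# Locate the needle span by token ids (simplifying assumption: the needle
# tokenizes identically in isolation and in context).
needle_ids = tok(" " + needle, add_special_tokens=False).input_ids
seq = ids[0].tolist()
needle_start = next(
    i for i in range(len(seq) - len(needle_ids))
    if seq[i:i + len(needle_ids)] == needle_ids
)
needle_span = set(range(needle_start, needle_start + len(needle_ids)))
answer_span = range(len(seq) - len(needle_ids), len(seq))  # copied needle sits at the end

with torch.no_grad():
    out = model(ids, output_attentions=True)  # one (1, heads, T, T) tensor per layer

# Retrieval score: fraction of copied answer tokens for which this head's
# strongest attention falls inside the needle span.
scores = {}
for layer, attn in enumerate(out.attentions):
    for head in range(attn.shape[1]):
        hits = sum(int(attn[0, head, j].argmax().item() in needle_span) for j in answer_span)
        scores[(layer, head)] = hits / len(needle_ids)

top = sorted(scores.items(), key=lambda kv: -kv[1])[:10]
print("Candidate retrieval heads (layer, head) -> score:", top)
```

From here, the causal experiment the abstract mentions could be approximated by zeroing out the top-scoring heads (for example via forward hooks that mask their attention output) and checking whether the model still copies the needle, compared against masking an equal number of randomly chosen heads.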

Abstract (translated)

Despite recent progress in long-context language models, how transformer-based models retrieve relevant information from a long context remains an open question. This paper aims to answer it. Our systematic investigation across a range of models reveals a special class of attention heads responsible for retrieving information, which we call retrieval heads. We identify interesting properties of retrieval heads: (1) universal: all models with long-context capability have retrieval heads; (2) sparse: fewer than 5% of attention heads are retrieval heads; (3) intrinsic: retrieval heads already exist in the pretrained model, and when the context length is extended by continual pretraining, the same retrieval heads still perform information retrieval; (4) dynamically activated: taking Llama-2 7B as an example, 12 retrieval heads always attend to the required information even as the context changes, while the remaining retrieval heads are activated in different contexts; (5) causal: completely removing retrieval heads makes the model unable to retrieve relevant information and causes hallucination, whereas removing random non-retrieval heads does not affect the model's retrieval ability. We further show that retrieval heads strongly influence chain-of-thought (CoT) reasoning, where the model frequently needs to look back at the question and the previously generated context. In contrast, tasks where the model directly generates the answer from its own knowledge are less affected by masking retrieval heads. These observations together explain which parts of the model seek information from the input tokens. We believe our insights will promote future research on reducing hallucination, improving reasoning, and compressing the KV cache.

URL

https://arxiv.org/abs/2404.15574

PDF

https://arxiv.org/pdf/2404.15574.pdf

