Paper Reading AI Learner

R4: Reinforced Retriever-Reorder-Responder for Retrieval-Augmented Large Language Models

2024-05-04 12:59:10
Taolin Zhang, Dongyang Li, Qizhou Chen, Chengyu Wang, Longtao Huang, Hui Xue, Xiaofeng He, Jun Huang

Abstract

Retrieval-augmented large language models (LLMs) leverage relevant content retrieved by information retrieval systems to generate correct responses, aiming to alleviate the hallucination problem. However, existing retriever-responder methods typically append relevant documents to the prompt of LLMs to perform text generation tasks without considering the interaction of fine-grained structural semantics between the retrieved documents and the LLMs. This issue is particularly important for accurate response generation, as LLMs tend to be ``lost in the middle'' when dealing with input prompts augmented with lengthy documents. In this work, we propose a new pipeline named ``Reinforced Retriever-Reorder-Responder'' (R$^4$) to learn document orderings for retrieval-augmented LLMs, thereby further enhancing their generation abilities while the large-scale parameters of the LLMs remain frozen. The reordering learning process is divided into two steps according to the quality of the generated responses: document order adjustment and document representation enhancement. Specifically, document order adjustment aims to organize retrieved document orderings into beginning, middle, and end positions based on graph attention learning, which maximizes the reinforced reward of response quality. Document representation enhancement further refines the representations of retrieved documents for responses of poor quality via document-level gradient adversarial learning. Extensive experiments demonstrate that our proposed pipeline achieves better factual question-answering performance on knowledge-intensive tasks compared to strong baselines across various public datasets. The source code and trained models will be released upon paper acceptance.
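The ``lost in the middle'' mitigation at the heart of the reorder step can be illustrated with a minimal sketch. Note this is only a heuristic stand-in with hypothetical helper names: the paper's actual method learns beginning/middle/end assignments via graph attention and a reinforced response-quality reward, whereas this sketch simply alternates the ranked documents between the two ends of the context window:

```python
def reorder_documents(docs, scores):
    """Place the highest-scored retrieved documents at the beginning
    and end of the prompt, pushing low-scored ones to the middle.

    Heuristic stand-in for R^4's learned ordering (not the paper's
    algorithm): rank documents by retrieval score, then alternate
    them between the front and the back of the context so that the
    weakest documents land in the middle, where LLMs attend least.
    """
    # Rank documents by retrieval score, highest first.
    ranked = [d for _, d in sorted(zip(scores, docs), key=lambda p: -p[0])]
    front, back = [], []
    for i, doc in enumerate(ranked):
        (front if i % 2 == 0 else back).append(doc)
    # Reverse the back half so the 2nd-best document sits at the very end.
    return front + back[::-1]
```

For example, with five documents scored (0.9, 0.1, 0.5, 0.7, 0.3), the best document ends up first, the second-best last, and the weakest in the middle of the prompt.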


URL

https://arxiv.org/abs/2405.02659

PDF

https://arxiv.org/pdf/2405.02659.pdf

