When to Retrieve: Teaching LLMs to Utilize Information Retrieval Effectively

2024-04-30 16:52:55
Tiziano Labruna, Jon Ander Campos, Gorka Azkune

Abstract

In this paper, we demonstrate how Large Language Models (LLMs) can effectively learn to use an off-the-shelf information retrieval (IR) system specifically when additional context is required to answer a given question. Given the performance of IR systems, the optimal strategy for question answering does not always entail external information retrieval; rather, it often involves leveraging the parametric memory of the LLM itself. Prior research has identified this phenomenon in the PopQA dataset, wherein the most popular questions are effectively addressed using the LLM's parametric memory, while less popular ones require IR system usage. Building on this observation, we propose a tailored training approach for LLMs, leveraging existing open-domain question answering datasets, in which the LLM is trained to generate a special token, <RET>, when it does not know the answer to a question. Our evaluation of the Adaptive Retrieval LLM (Adapt-LLM) on the PopQA dataset shows improvements over the same LLM under three configurations: (i) retrieving information for all questions, (ii) always using the parametric memory of the LLM, and (iii) using a popularity threshold to decide when to use a retriever. Through our analysis, we demonstrate that Adapt-LLM generates the <RET> token when it determines that it does not know how to answer a question, indicating the need for IR, while it achieves notably high accuracy when it chooses to rely on its parametric memory alone.
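To make the mechanism concrete, below is a minimal sketch of the two pieces the abstract describes: labeling training examples with <RET> and the two-pass inference loop. The `generate` and `retrieve` callables and the correctness check are hypothetical placeholders, not the paper's implementation; consult the linked PDF for the actual training recipe and prompts.

```python
from typing import Callable, Dict

RET_TOKEN = "<RET>"  # special token signalling "I need retrieval"

def make_training_example(question: str, gold: str, base_answer: str) -> Dict[str, str]:
    # Hypothetical labeling rule: if the base LLM already answers correctly
    # from parametric memory, train it to answer directly; otherwise train
    # it to emit <RET>. The paper's exact criterion may differ.
    knows = gold.lower() in base_answer.lower()
    return {"prompt": question, "target": gold if knows else RET_TOKEN}

def adaptive_answer(
    question: str,
    generate: Callable[[str], str],  # fine-tuned Adapt-LLM completion call
    retrieve: Callable[[str], str],  # off-the-shelf IR system (e.g., BM25)
) -> str:
    # First pass: the model either answers or asks for retrieval.
    first = generate(f"Question: {question}\nAnswer:")
    if RET_TOKEN not in first:
        return first.strip()  # rely on parametric memory
    # Second pass: retrieve a context passage and answer with it.
    context = retrieve(question)
    second = generate(f"Context: {context}\nQuestion: {question}\nAnswer:")
    return second.strip()
```

Note that the paper's first two baseline configurations correspond to forcing the first pass to always or never emit <RET>, while the third replaces the model's own decision with a question-popularity threshold.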


URL

https://arxiv.org/abs/2404.19705

PDF

https://arxiv.org/pdf/2404.19705.pdf

