QAgent: A modular Search Agent with Interactive Query Understanding

2025-10-09 16:08:05
Yi Jiang, Lei Shen, Lujie Niu, Sendong Zhao, Wenbo Su, Bo Zheng

Abstract

Large language models (LLMs) excel at natural language tasks but are limited by their static parametric knowledge, especially in knowledge-intensive tasks. Retrieval-augmented generation (RAG) mitigates this by integrating external information. However, (1) traditional RAG struggles with complex query understanding, and (2) even search agents trained with reinforcement learning (RL), despite their promise, still face generalization and deployment challenges. To address these limitations, we propose QAgent, a unified agentic RAG framework that employs a search agent for adaptive retrieval. This agent optimizes its understanding of the query through interactive reasoning and retrieval. To facilitate real-world application, we focus on a modular search agent for query understanding that is plug-and-play in complex systems. Specifically, the agent follows a multi-step decision process trained with RL to maximize retrieval quality and support accurate downstream answers. We further analyze the strengths and weaknesses of end-to-end RL and propose a strategy that focuses on effective retrieval, thereby enhancing generalization in LLM applications. Experiments show QAgent excels at QA and serves as a plug-and-play module for real-world deployment.
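The multi-step, plug-and-play design described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy corpus, the word-overlap retriever, and the names `run_search_agent`, `scripted_policy`, and `downstream_answer` are all hypothetical, and the hand-written policy merely stands in for a policy that would be trained with RL. The sketch only shows the control flow: the agent issues a query, reads what comes back, decides whether to search again, and finally hands its passages to a separate answer generator.

```python
import re
from typing import Callable, List, Optional

# Toy corpus standing in for an external knowledge source (hypothetical data).
CORPUS = [
    "QAgent is a modular search agent for query understanding.",
    "Retrieval-augmented generation integrates external information.",
    "Reinforcement learning trains a policy from reward signals.",
]


def tokens(text: str) -> set:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def keyword_retrieve(query: str, k: int = 1) -> List[str]:
    """Toy retriever: rank passages by word overlap with the query."""
    q = tokens(query)
    ranked = sorted(CORPUS, key=lambda p: -len(q & tokens(p)))
    return ranked[:k]


# A policy maps (question, passages so far, step index) to the next query,
# or returns None to stop searching.
Policy = Callable[[str, List[str], int], Optional[str]]


def scripted_policy(question: str, collected: List[str], step: int) -> Optional[str]:
    """Hand-written stand-in for an RL-trained query policy (assumption)."""
    sub_queries = ["retrieval-augmented generation", "reinforcement learning"]
    return sub_queries[step] if step < len(sub_queries) else None


def run_search_agent(question: str, policy: Policy,
                     retrieve: Callable[[str], List[str]],
                     max_steps: int = 3) -> List[str]:
    """Multi-step loop: propose a query, retrieve, accumulate, repeat."""
    collected: List[str] = []
    for step in range(max_steps):
        query = policy(question, collected, step)
        if query is None:  # the policy decides it has gathered enough
            break
        for passage in retrieve(query):
            if passage not in collected:
                collected.append(passage)
    return collected


def downstream_answer(question: str, passages: List[str]) -> str:
    """Placeholder generator: any answer model could consume the passages."""
    return f"Q: {question}\nContext: " + " ".join(passages)


passages = run_search_agent("How do RAG and RL relate?", scripted_policy, keyword_retrieve)
print(downstream_answer("How do RAG and RL relate?", passages))
```

Because the agent returns only passages, the generator behind `downstream_answer` can be swapped without touching the search loop, which is one way to read the plug-and-play claim under these assumptions.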

URL

https://arxiv.org/abs/2510.08383

PDF

https://arxiv.org/pdf/2510.08383.pdf

