Beyond Relevance: Evaluate and Improve Retrievers on Perspective Awareness

2024-05-04 17:10:00
Xinran Zhao, Tong Chen, Sihao Chen, Hongming Zhang, Tongshuang Wu

Abstract

The task of Information Retrieval (IR) requires a system to identify relevant documents based on users' information needs. In real-world scenarios, retrievers are expected to not only rely on the semantic relevance between the documents and the queries but also recognize the nuanced intents or perspectives behind a user query. For example, when asked to verify a claim, a retrieval system is expected to identify evidence from both supporting and contradicting perspectives, so that the downstream system can make a fair judgment call. In this work, we study whether retrievers can recognize and respond to different perspectives of the queries: beyond finding relevant documents for a claim, can retrievers distinguish supporting from opposing documents? We reformulate and extend six existing tasks to create a benchmark for retrieval, in which diverse perspectives are described in free-form text alongside root, neutral queries. We show that the current retrievers covered in our experiments have limited awareness of subtly different perspectives in queries and can also be biased toward certain perspectives. Motivated by this observation, we further explore the potential to leverage geometric features of the retriever representation space to improve the perspective awareness of retrievers in a zero-shot manner. We demonstrate the efficiency and effectiveness of our projection-based methods on the same set of tasks. Further analysis also shows how perspective awareness improves performance on various downstream tasks, with 4.2% higher accuracy on AmbigQA and 29.9% higher correlation with designated viewpoints on essay writing, compared to non-perspective-aware baselines.
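
The abstract names "projection-based methods" over the retriever representation space but does not spell them out. As a rough illustration only, the toy sketch below shows one way such a projection could work: remove the component shared with the root (neutral) query from both the perspective query and the document embeddings, then rank by cosine similarity on what remains. The three-dimensional vectors, the document names, and the helper functions `project_out` and `rank` are all hypothetical and are not taken from the paper; this should not be read as the authors' actual method.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def project_out(v, direction):
    """Return v with its component along `direction` removed."""
    scale = dot(v, direction) / dot(direction, direction)
    return [x - scale * d for x, d in zip(v, direction)]

def rank(query, docs):
    """Sort document names by cosine similarity to the query, best first."""
    return sorted(docs, key=lambda name: cosine(query, docs[name]), reverse=True)

# Hypothetical toy embeddings: axis 0 = topic, axis 1 = stance, axis 2 = other.
root_query = [1.0, 0.0, 0.0]          # neutral claim
perspective_query = [1.0, 0.2, 0.0]   # "find evidence supporting the claim"
docs = {
    "support": [0.5, 0.3, 0.8],
    "oppose": [1.0, -0.1, 0.0],
}

# Plain similarity is dominated by the shared topic direction.
plain_ranking = rank(perspective_query, docs)

# Assumed variant: project the root-query direction out of everything first.
projected_query = project_out(perspective_query, root_query)
projected_docs = {name: project_out(vec, root_query) for name, vec in docs.items()}
projected_ranking = rank(projected_query, projected_docs)

print(plain_ranking)      # ['oppose', 'support']
print(projected_ranking)  # ['support', 'oppose']
```

In this toy setup the plain ranking prefers the opposing document because it is closest on the topic axis, while the projected ranking prefers the supporting one. Projecting out the root direction also discards topical relevance, so a practical system would presumably combine both signals; how the paper does this is not stated in the abstract.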

URL

https://arxiv.org/abs/2405.02714

PDF

https://arxiv.org/pdf/2405.02714.pdf

