AmbigDocs: Reasoning across Documents on Different Entities under the Same Name

2024-04-18 18:12:01
Yoonsang Lee, Xi Ye, Eunsol Choi

Abstract

Different entities with the same name can be difficult to distinguish. Handling confusing entity mentions is a crucial skill for language models (LMs). For example, given the question "Where was Michael Jordan educated?" and a set of documents discussing different people named Michael Jordan, can LMs distinguish entity mentions to generate a cohesive answer to the question? To test this ability, we introduce a new benchmark, AmbigDocs. By leveraging Wikipedia's disambiguation pages, we identify a set of documents, belonging to different entities who share an ambiguous name. From these documents, we generate questions containing an ambiguous name and their corresponding sets of answers. Our analysis reveals that current state-of-the-art models often yield ambiguous answers or incorrectly merge information belonging to different entities. We establish an ontology categorizing four types of incomplete answers and automatic evaluation metrics to identify such categories. We lay the foundation for future work on reasoning across multiple documents with ambiguous entities.

URL

https://arxiv.org/abs/2404.12447

PDF

https://arxiv.org/pdf/2404.12447.pdf

