Multilingual Open QA on the MIA Shared Task

2025-01-07 21:43:09
Navya Yarrabelly, Saloni Mittal, Ketan Todi, Kimihiro Hasegawa

Abstract

Cross-lingual information retrieval (CLIR) ~\cite{shi2021cross, asai2021one, jiang2020cross}, for example, can find relevant text in any language, such as English (high-resource) or Telugu (low-resource), even when the query is posed in a different, possibly low-resource, language. In this work, we aim to develop useful CLIR models for this constrained yet important setting, where we do not require any additional supervision or labelled data for the retrieval task and can therefore work effectively for low-resource languages.

We propose a simple and effective re-ranking method for improving passage retrieval in open question answering. The re-ranker re-scores retrieved passages with a zero-shot multilingual question generation model, a pre-trained language model that computes the probability of the input question in the target language conditioned on a retrieved passage, which may be in a different language. We evaluate our method in a completely zero-shot setting, and it requires no training. The main advantage of our approach is therefore that it can re-rank results obtained by any sparse retrieval method, such as BM-25. This eliminates the need for the expensive labelled corpora required for retrieval tasks and hence makes the method usable for low-resource languages.
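The core idea admits a short sketch: score each BM-25 candidate by the likelihood a multilingual question generation model assigns to the question given the passage, then sort by that score. The snippet below is a minimal illustration of this query-likelihood re-ranking under stated assumptions: it uses a Hugging Face seq2seq interface, and "google/mt5-base" is only a placeholder checkpoint, not necessarily the zero-shot question generation model used in the paper.

```python
# Minimal sketch of query-likelihood re-ranking over BM-25 candidates.
# Assumption: a multilingual seq2seq checkpoint stands in for the paper's
# zero-shot question generation model; "google/mt5-base" is a placeholder.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_NAME = "google/mt5-base"  # placeholder checkpoint, not the paper's exact model
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)
model.eval()


def question_log_likelihood(question: str, passage: str) -> float:
    """Approximate log P(question | passage) under the seq2seq model."""
    # Passage goes into the encoder; the question is the decoder target.
    inputs = tokenizer(passage, return_tensors="pt", truncation=True, max_length=512)
    labels = tokenizer(question, return_tensors="pt", truncation=True, max_length=64).input_ids
    with torch.no_grad():
        out = model(**inputs, labels=labels)
    # out.loss is the mean per-token cross-entropy; multiply by the number of
    # target tokens to recover (the negative of) the total log-likelihood.
    return -out.loss.item() * labels.shape[1]


def rerank(question: str, passages: list[str]) -> list[str]:
    """Re-order retrieved passages by how well they 'generate' the question."""
    return sorted(passages, key=lambda p: question_log_likelihood(question, p), reverse=True)
```

Because the scorer only consumes (passage, question) pairs at inference time, it can sit on top of any retriever's output without retrieval-specific labelled data, which is the property the abstract highlights for low-resource languages.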

URL

https://arxiv.org/abs/2501.04153

PDF

https://arxiv.org/pdf/2501.04153.pdf

