WordNet-Based Information Retrieval Using Common Hypernyms and Combined Features

2018-07-15 16:49:06
Vuong M. Ngo, Tru H. Cao, Tuan M. V. Le

Abstract

Text search based on lexical matching of keywords is not satisfactory due to polysemous and synonymous words. Semantic search that exploits word meanings, in general, improves search performance. In this paper, we survey WordNet-based information retrieval systems, which employ a word sense disambiguation method to process queries and documents. The problem is that in many cases a word has more than one possible direct sense, and picking only one of them may give a wrong sense for the word. Moreover, the previous systems use only word forms to represent word senses and their hypernyms. We propose a novel approach that uses the most specific common hypernym of the remaining undisambiguated multi-senses of a word, as well as combined WordNet features to represent word meanings. Experiments on a benchmark dataset show that, in terms of the MAP measure, our search engine is 17.7% better than the lexical search, and at least 9.4% better than all surveyed search systems using WordNet.

Keywords

Ontology, word sense disambiguation, semantic annotation, semantic search.
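
The abstract's central idea, representing an ambiguous word by the most specific common hypernym of its remaining candidate senses, can be illustrated with a minimal sketch. This is not the authors' implementation: the `HYPERNYMS` taxonomy below is a hypothetical toy graph rather than WordNet data, and the function names and the tie-breaking rule are assumptions made for illustration only.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy taxonomy (NOT WordNet data): sense -> direct hypernyms.
HYPERNYMS: Dict[str, List[str]] = {
    "entity": [],
    "artifact": ["entity"],
    "organism": ["entity"],
    "plant": ["organism"],
    "animal": ["organism"],
    "feline": ["animal"],
    "canine": ["animal"],
    "cat": ["feline"],
    "dog": ["canine"],
}


def ancestors(sense: str) -> Set[str]:
    """Return the sense itself plus every hypernym reachable from it."""
    seen: Set[str] = {sense}
    stack = [sense]
    while stack:
        node = stack.pop()
        for parent in HYPERNYMS.get(node, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def depth(sense: str) -> int:
    """Length of the longest hypernym chain from the sense up to a root."""
    parents = HYPERNYMS.get(sense, [])
    return 0 if not parents else 1 + max(depth(p) for p in parents)


def most_specific_common_hypernym(senses: List[str]) -> Optional[str]:
    """Deepest node that is an ancestor (or equal) of every candidate sense."""
    if not senses:
        return None
    common = ancestors(senses[0])
    for sense in senses[1:]:
        common &= ancestors(sense)
    if not common:
        return None
    # Assumption: ties in depth are broken alphabetically for determinism.
    return max(common, key=lambda node: (depth(node), node))


# Usage: a word whose candidate senses could not be narrowed to one.
print(most_specific_common_hypernym(["cat", "dog"]))
print(most_specific_common_hypernym(["cat", "plant"]))
```

In this toy graph the two calls resolve to `animal` and `organism` respectively: the further apart the candidate senses are, the more general the shared hypernym that stands in for the word.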

URL

https://arxiv.org/abs/1807.05574

PDF

https://arxiv.org/pdf/1807.05574.pdf

