Paper Reading AI Learner

Complementing GPT-3 with Few-Shot Sequence-to-Sequence Semantic Parsing over Wikidata

2023-05-23 16:20:43
Silei Xu, Theo Culhane, Meng-Hsi Wu, Sina J. Semnani, Monica S. Lam

Abstract

As the largest knowledge base, Wikidata is a massive source of knowledge, complementing large language models with well-structured data. In this paper, we present WikiWebQuestions, a high-quality knowledge base question answering benchmark for Wikidata. This new benchmark uses real-world human data with SPARQL annotation to facilitate a more accurate comparison with large language models utilizing the up-to-date answers from Wikidata. Additionally, a baseline for this benchmark is established with an effective training data synthesis methodology and WikiSP, a Seq2Seq semantic parser, that handles large noisy knowledge graphs. Experimental results illustrate the effectiveness of this methodology, achieving 69% and 59% answer accuracy in the dev set and test set, respectively. We showed that we can pair semantic parsers with GPT-3 to provide a combination of verifiable results and qualified guesses that can provide useful answers to 97% of the questions in the dev set of our benchmark.

Abstract (translated)

作为一家最大的知识库,维基百科是一个巨大的知识来源,与大型语言模型结合,提供了良好的数据结构。在本文中,我们介绍了维基百科Web问题,这是一个高质量的知识库问题回答基准维基百科。这个新基准使用现实世界的人形数据,并使用SPARQL注释,以促进更准确地比较使用维基百科最新答案的大型语言模型。此外,该基准的基础线通过有效的训练数据合成方法和维基百科SP,一个Seq2Seq语义解析器,处理大型噪声知识图。实验结果显示该方法的有效性,在开发集和测试集上分别实现69%和59%的答案准确性。我们表明,我们可以将语义解析器和GPT-3配对,提供可验证的结果和 qualified guesses,为基准开发集上的97%问题提供有用的答案。

URL

https://arxiv.org/abs/2305.14202

PDF

https://arxiv.org/pdf/2305.14202.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot