Abstract
In this work, we present GETT-QA, an end-to-end Knowledge Graph Question Answering (KGQA) system. GETT-QA uses T5, a popular text-to-text pre-trained language model. The model takes a natural-language question as input and produces a simplified form of the intended SPARQL query. In this simplified form, the model does not directly produce entity and relation IDs; instead, it produces the corresponding entity and relation labels. The labels are grounded to KG entity and relation IDs in a subsequent step. To further improve the results, we instruct the model to produce a truncated version of the KG embedding for each entity. The truncated KG embedding enables a finer-grained search for disambiguation purposes. We find that T5 is able to learn the truncated KG embeddings without any change to the loss function, improving KGQA performance. As a result, we report strong results on the LC-QuAD 2.0 and SimpleQuestions-Wikidata datasets for end-to-end KGQA over Wikidata.
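The two-step grounding described in the abstract can be sketched minimally as follows. This is an illustration, not the authors' implementation: the skeleton syntax, the second entity ID, the vectors, and the helper names are assumptions (Q64 is Wikidata's actual ID for Berlin). Candidates retrieved by a label search are re-ranked by similarity between the truncated embedding the model decoded and each candidate's truncated KG embedding.

```python
import numpy as np

# Illustrative query skeleton a T5-style decoder might emit for
# "Who is the mayor of Berlin?" -- entity/relation labels plus a
# truncated KG embedding instead of Wikidata IDs (this output
# syntax is an assumption, not the paper's exact grammar):
skeleton = "select ?ans where { <Berlin> [0.10, -0.50, 0.30, 0.88] <head of government> ?ans }"

# Truncated embedding decoded by the model for the entity "Berlin".
predicted = np.array([0.10, -0.50, 0.30, 0.88])

# Candidates returned by a label search, paired with the first few
# dimensions of their KG embeddings. Q64 is Wikidata's Berlin (the
# German capital); the second ID and all vectors are made up here.
candidates = {
    "Q64":     np.array([0.12, -0.48, 0.33, 0.90]),
    "Q999999": np.array([-0.55, 0.10, 0.71, -0.20]),
}

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Ground the label to the candidate whose truncated embedding best
# matches the one the model produced.
best_id = max(candidates, key=lambda qid: cosine(predicted, candidates[qid]))
print(best_id)  # -> Q64 with these toy vectors
```

Because the truncated embedding is emitted as plain text in the decoded sequence, it can be learned with T5's standard training objective, which matches the abstract's claim that no change to the loss function is needed.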
URL
https://arxiv.org/abs/2303.13284