mOKB6: A Multilingual Open Knowledge Base Completion Benchmark

2022-11-13 17:10:49

Shubham Mittal, Keshav Kolluru, Soumen Chakrabarti, Mausam

arXiv_CL

Abstract

Open knowledge bases (KBs) are constructed from triples of the form (subject phrase, relation phrase, object phrase) that open information extraction (IE) obtains from text. Automatically completing them is useful for discovering novel facts that may not be directly present in the text. However, research in open knowledge base completion (KBC) has so far been limited to resource-rich languages like English. Using the latest advances in multilingual open IE, we construct the first multilingual open KBC dataset, called mOKB6, which contains facts from Wikipedia in six languages (including English). We improve the previous open KB construction pipeline by performing multilingual coreference resolution and keeping only entity-linked triples, which yields a dense open KB. We experiment with several baseline models that have been proposed for both open and closed KBs and observe a consistent benefit from using knowledge gained from other languages. The dataset and accompanying code will be made publicly available.
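
To make the triple format and the entity-linking filter concrete, here is a minimal sketch. It is not the authors' pipeline: the `OpenTriple` class, its field names, the `keep_entity_linked` helper, and the sample triples are all hypothetical, and the rule that both the subject and the object must be linked is an assumption about what "entity-linked" means here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class OpenTriple:
    """One open IE extraction (hypothetical schema, for illustration only)."""
    subject: str                          # subject phrase
    relation: str                         # relation phrase
    obj: str                              # object phrase
    language: str                         # language code of the source text
    subject_entity: Optional[str] = None  # linked entity ID, if any (assumed field)
    object_entity: Optional[str] = None   # linked entity ID, if any (assumed field)


def keep_entity_linked(triples: List[OpenTriple]) -> List[OpenTriple]:
    """Keep triples whose subject and object are both linked (assumed criterion)."""
    return [
        t for t in triples
        if t.subject_entity is not None and t.object_entity is not None
    ]


# Made-up sample extractions; they are not taken from the dataset.
triples = [
    OpenTriple("Marie Curie", "was born in", "Warsaw", "en",
               subject_entity="Marie_Curie", object_entity="Warsaw"),
    OpenTriple("She", "moved to", "a large city", "en"),
    OpenTriple("Marie Curie", "nació en", "Varsovia", "es",
               subject_entity="Marie_Curie", object_entity="Warsaw"),
]

dense_kb = keep_entity_linked(triples)
for t in dense_kb:
    print(f"[{t.language}] ({t.subject}; {t.relation}; {t.obj})")
```

In this sketch the second triple is dropped because its arguments carry no entity links, which is the sort of extraction that coreference resolution and the linking filter are meant to resolve or discard.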

URL

https://arxiv.org/abs/2211.06959

PDF

https://arxiv.org/pdf/2211.06959.pdf