Abstract
Knowledge graphs are a refinement of semantic networks, and their deployment is a key methodology in contemporary artificial intelligence. Constructing a knowledge graph involves many techniques, and researchers typically extract knowledge from existing resources because building a graph from scratch is costly in labor and time. However, knowledge graphs are pervasively heterogeneous: the same concept may be described differently in different graphs, which leads to mismatches between concepts and reduces the effectiveness of knowledge extraction. This Ph.D. study focuses on automatic knowledge graph extension, i.e., properly extending a reference knowledge graph by extracting and integrating concepts from one or more candidate knowledge graphs. We propose a novel knowledge graph extension framework based on entity type recognition. The framework aims to achieve high-quality knowledge extraction by aligning schemas and entities across different knowledge graphs, thereby improving the performance of the extension. This paper presents three major contributions: (i) we propose an entity type recognition method that exploits machine learning and property-based similarities to enhance knowledge extraction; (ii) we introduce a set of assessment metrics to validate the quality of the extended knowledge graphs; (iii) we develop a platform for knowledge graph acquisition, management, and extension to support knowledge engineers in practice. Our evaluation demonstrates the feasibility and effectiveness of the proposed extension framework and its functionalities through quantitative experiments and case studies.
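To make the idea of property-based similarity concrete, the following is a minimal sketch and not the method proposed in the paper. It assumes that each entity type is described only by a set of property names and uses plain Jaccard overlap as a stand-in similarity; the paper's method also involves machine learning, which is not shown here. The function names, the example types, and the 0.5 threshold are all hypothetical.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two property sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_matching_type(candidate_props: set, reference_types: dict, threshold: float = 0.5):
    """Return (type_name, score) for the most similar reference type.

    Returns (None, best_score) when no reference type reaches the threshold.
    The threshold value is an illustrative assumption, not taken from the paper.
    """
    best_name, best_score = None, 0.0
    for name, props in reference_types.items():
        score = jaccard(candidate_props, props)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score


# Hypothetical reference schema: entity type -> set of property names.
reference_types = {
    "Person": {"name", "birthDate", "nationality"},
    "Organization": {"name", "foundingDate", "headquarters"},
}

# Properties of an entity type taken from a hypothetical candidate graph.
candidate = {"name", "birthDate", "occupation"}
match = best_matching_type(candidate, reference_types)
unmatched = best_matching_type({"isbn"}, reference_types)
```

In this toy setting the candidate shares two of four distinct properties with "Person" and one of five with "Organization", so it is aligned with "Person"; a candidate with no shared properties is left unmatched rather than forced onto a reference type.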
URL
https://arxiv.org/abs/2405.02463