Abstract
Knowledge graph representation learning (KGRL), also called knowledge graph embedding (KGE), plays a crucial role in AI applications for knowledge construction and information exploration. These models encode the entities and relations of a knowledge graph into a low-dimensional vector space. Training a KGE model requires both positive and negative samples so that the model can learn to discriminate between them. However, negative samples are hard to obtain directly from existing knowledge graphs, which calls for effective generation techniques. The quality of these negative samples strongly affects the accuracy of the learned embeddings, making their generation a critical aspect of KGRL. This survey systematically reviews negative sampling (NS) methods and their contributions to the success of KGRL. It groups existing NS methods into five categories and outlines the advantages and disadvantages of each. It also identifies open research questions as potential directions for future work. By generalizing and aligning fundamental NS concepts, the survey offers guidance for designing effective NS methods for KGRL and aims to motivate further advances in the field.
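To make the idea of generating negative samples concrete, the following is a minimal sketch of one simple baseline: corrupting a known triple by swapping its head or tail for a randomly chosen entity and rejecting any candidate that is already a known fact. The abstract does not name the survey's five categories, so this baseline is an illustrative assumption rather than a method taken from the paper. The function name `corrupt_triple`, its parameters, and the toy graph are all hypothetical.

```python
import random

# Toy knowledge graph (hypothetical data, for illustration only).
entities = ["Paris", "France", "Berlin", "Germany"]
known_triples = {
    ("Paris", "capital_of", "France"),
    ("Berlin", "capital_of", "Germany"),
    ("France", "borders", "Germany"),
}


def corrupt_triple(triple, entities, known_triples, rng, max_tries=100):
    """Build one negative triple by replacing the head or the tail.

    Candidates that already appear in known_triples are rejected, so the
    result is never a known positive. Returns None if no valid negative
    is found within max_tries attempts.
    """
    head, relation, tail = triple
    for _ in range(max_tries):
        candidate = rng.choice(entities)
        # Corrupt the head or the tail with equal probability.
        if rng.random() < 0.5:
            negative = (candidate, relation, tail)
        else:
            negative = (head, relation, candidate)
        if negative not in known_triples:
            return negative
    return None


rng = random.Random(0)  # seeded so that runs are repeatable
positive = ("Paris", "capital_of", "France")
print(positive, "->", corrupt_triple(positive, entities, known_triples, rng))
```

Rejecting known positives only guards against facts the graph already contains. A corrupted triple may still be true but missing from the graph, which is one reason the quality of negative samples matters for the learned embeddings.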
URL
https://arxiv.org/abs/2402.19195