Abstract
A visual-relational knowledge graph (KG) is a multi-relational graph whose entities are associated with images. We explore novel machine learning approaches for answering visual-relational queries in web-extracted knowledge graphs. To this end, we have created ImageGraph, a KG with 1,330 relation types, 14,870 entities, and 829,931 images crawled from the web. With visual-relational KGs such as ImageGraph, one can introduce novel probabilistic query types in which images are treated as first-class citizens. Both the prediction of relations between unseen images and multi-relational image retrieval can be expressed with specific families of visual-relational queries. We introduce novel combinations of convolutional networks and knowledge graph embedding methods to answer such queries. We also explore a zero-shot learning scenario in which an image of an entirely new entity is linked via multiple relations to entities of an existing KG. The resulting multi-relational grounding of unseen entity images into a knowledge graph serves as a semantic entity representation. We conduct experiments to demonstrate that the proposed methods can answer these visual-relational queries efficiently and accurately.
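To make the idea of combining a convolutional image encoder with a KG embedding model concrete, here is a minimal sketch of how a visual-relational query of the form (head image, relation, ?) could be scored. The paper's actual architecture is not specified in this abstract; this sketch makes illustrative assumptions, standing in random vectors for CNN image features and using a TransE-style translation score, with `W` as a hypothetical learned projection from feature space into the embedding space.

```python
import numpy as np

rng = np.random.default_rng(0)

FEAT_DIM = 128  # dimensionality of the (hypothetical) CNN image features
EMB_DIM = 32    # dimensionality of the KG embedding space
N_ENT, N_REL = 5, 3

# Stand-ins for learned parameters: in the real setting a CNN would
# produce ent_feats, and W / rel_emb would be trained on KG triples.
W = rng.normal(scale=0.1, size=(FEAT_DIM, EMB_DIM))      # image features -> embedding
rel_emb = rng.normal(scale=0.1, size=(N_REL, EMB_DIM))   # one vector per relation type
ent_feats = rng.normal(size=(N_ENT, FEAT_DIM))           # CNN features, one image per entity

def embed(feats):
    """Project CNN image features into the KG embedding space."""
    return feats @ W

def transe_distance(h, r, t):
    """TransE-style plausibility: lower ||h + r - t|| means more plausible."""
    return np.linalg.norm(h + rel_emb[r] - t, axis=-1)

# Query (head_image, relation 1, ?): rank all entity images as candidate tails.
head = embed(ent_feats[0])
tails = embed(ent_feats)
scores = transe_distance(head, 1, tails)
ranking = np.argsort(scores)  # best candidate (smallest distance) first
print(ranking)
```

Because both query images and candidate entities pass through the same projection, the same scoring function also covers the zero-shot case: an image of an entity never seen during training is embedded and ranked against the existing KG without retraining.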
URL
https://arxiv.org/abs/1709.02314