Abstract
While graph kernels (GKs) are easy to train and enjoy provable theoretical guarantees, their practical performance is limited by their expressive power, as the kernel function often depends on hand-crafted combinatorial features of graphs. Compared to graph kernels, graph neural networks (GNNs) usually achieve better practical performance, as GNNs use multi-layer architectures and non-linear activation functions to extract high-order information from graphs as features. However, due to the large number of hyper-parameters and the non-convex nature of the training procedure, GNNs are harder to train. Theoretical guarantees for GNNs are also not well understood. Furthermore, the expressive power of GNNs scales with the number of parameters, and thus it is hard to exploit the full power of GNNs when computing resources are limited. The current paper presents a new class of graph kernels, Graph Neural Tangent Kernels (GNTKs), which correspond to \emph{infinitely wide} multi-layer GNNs trained by gradient descent. GNTKs enjoy the full expressive power of GNNs and inherit the advantages of GKs. Theoretically, we show GNTKs provably learn a class of smooth functions on graphs. Empirically, we test GNTKs on graph classification datasets and show they achieve strong performance.
Abstract (translated)
Although graph kernels (GKs) are easy to train and enjoy provable theoretical guarantees, their practical performance is limited by their expressive power, since the kernel function typically depends on hand-crafted combinatorial features of graphs. Compared with graph kernels, graph neural networks (GNNs) usually achieve better practical performance, because GNNs use multi-layer architectures and non-linear activation functions to extract high-order information from graphs as features. However, due to the large number of hyper-parameters and the non-convexity of the training procedure, GNNs are harder to train, and their theoretical guarantees are also not well understood. Moreover, the expressive power of GNNs scales with the number of parameters, so it is difficult to exploit their full power when computing resources are limited. This paper proposes a new class of graph kernels, Graph Neural Tangent Kernels (GNTKs), which correspond to infinitely wide multi-layer GNNs trained by gradient descent. GNTKs enjoy the full expressive power of GNNs and inherit the advantages of GKs. Theoretically, we prove that GNTKs learn a class of smooth functions on graphs. Empirically, we test GNTKs on graph classification datasets and show that they achieve strong performance.
URL
https://arxiv.org/abs/1905.13192
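The abstract only sketches the construction, but the general recipe for a neural-tangent-kernel-style graph kernel is concrete enough to illustrate. Below is a minimal, simplified sketch of computing a GNTK-like value between two graphs: alternate a neighborhood-aggregation step with the closed-form ReLU covariance and derivative expectations used in NTK recursions, accumulate the tangent kernel, and sum over node pairs as a readout. This is not the authors' implementation; the function names (`gntk_pair`, `_relu_expectations`, `_node_variances`), the single ReLU transformation per aggregation step, and the omission of per-node scaling factors are simplifying assumptions made here for illustration.

```python
# Hedged sketch of a simplified GNTK between two graphs (not the paper's code).
import numpy as np


def _relu_expectations(sig, var1, var2):
    """Closed-form Gaussian expectations for ReLU (with c_sigma = 2 scaling).

    sig:  (n1, n2) cross-covariance between nodes of the two graphs.
    var1, var2: node-wise variances (diagonals of the within-graph covariances).
    Returns the post-activation covariance and the derivative covariance.
    """
    denom = np.sqrt(np.outer(var1, var2))
    cos_t = np.clip(sig / np.maximum(denom, 1e-12), -1.0, 1.0)
    theta = np.arccos(cos_t)
    sig_next = denom * (np.sin(theta) + (np.pi - theta) * cos_t) / np.pi
    sig_dot = (np.pi - theta) / np.pi
    return sig_next, sig_dot


def _node_variances(A, X, num_layers):
    """Node-wise variances of the aggregated covariance at every layer,
    obtained by running the recursion on the graph paired with itself."""
    S = X @ X.T
    diags = []
    for _ in range(num_layers):
        S = A @ S @ A.T                       # neighborhood aggregation
        diags.append(np.diag(S).copy())
        S, _ = _relu_expectations(S, diags[-1], diags[-1])
    return diags


def gntk_pair(A1, X1, A2, X2, num_layers=2):
    """Simplified GNTK value between two graphs.

    A1, A2: adjacency matrices with self-loops already added.
    X1, X2: node feature matrices (n_nodes x feature_dim).
    """
    d1 = _node_variances(A1, X1, num_layers)
    d2 = _node_variances(A2, X2, num_layers)
    S = X1 @ X2.T          # layer-0 covariance
    T = S.copy()           # tangent-kernel accumulator
    for l in range(num_layers):
        S = A1 @ S @ A2.T  # aggregate covariance over neighbors in both graphs
        T = A1 @ T @ A2.T  # aggregate the tangent kernel the same way
        S, S_dot = _relu_expectations(S, d1[l], d2[l])
        T = T * S_dot + S  # standard NTK layer update
    return T.sum()         # sum readout over all pairs of nodes
```

Usage-wise, evaluating `gntk_pair` over all pairs of graphs in a dataset yields a Gram matrix that can be fed to any standard kernel method; for example, scikit-learn's `SVC(kernel='precomputed')` accepts such a matrix directly, which is one way the easy, convex training of graph kernels carries over to GNTKs.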