Abstract
With privacy leaks occurring frequently and privacy laws being enacted across countries, data owners are reluctant to share their raw data and labels directly with any other party. In practice, much of this raw data is stored in graph databases, especially in finance. For collaboratively building graph neural networks (GNNs), federated learning (FL) may not be an ideal choice in the vertically partitioned setting, where privacy and efficiency are the main concerns. Moreover, almost all existing federated GNNs are designed for homogeneous graphs, which collapse various types of relations into a single type and thus largely limit performance. We bridge this gap by proposing a split learning-based GNN (SplitGNN), in which the model is divided into two sub-models: the local GNN model performs all computation involving private data to generate local node embeddings, whereas the global model computes global embeddings by aggregating all participants' local embeddings. SplitGNN allows isolated heterogeneous neighborhoods to be utilized collaboratively. To better capture representations, we propose a novel Heterogeneous Attention (HAT) algorithm that uses both node-based and path-based attention mechanisms to learn various types of nodes and edges with multi-hop relation features. We demonstrate the effectiveness of SplitGNN on node classification tasks using two standard public datasets and a real-world dataset. Extensive experimental results show that SplitGNN significantly outperforms state-of-the-art (SOTA) methods.
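The local/global split described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the layer types, the aggregation function, or the HAT attention, so the mean-neighbor layer, the element-wise averaging on the server side, and all names (`local_embeddings`, `global_embeddings`, `rand_matrix`) are assumptions made purely for illustration. Each party computes embeddings from its own private edges and features, and only those embeddings are passed to the global step.

```python
import math
import random


def matmul(a, b):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def local_embeddings(adj, feats, weight):
    """Hypothetical local model: one mean-aggregation GNN layer.

    Runs entirely on one party's private adjacency matrix and features.
    """
    n = len(adj)
    dim = len(feats[0])
    aggregated = []
    for i in range(n):
        # Neighbors of node i plus a self-loop, so the list is never empty.
        neigh = [j for j in range(n) if adj[i][j] or j == i]
        aggregated.append([sum(feats[j][k] for j in neigh) / len(neigh) for k in range(dim)])
    return [[math.tanh(v) for v in row] for row in matmul(aggregated, weight)]


def global_embeddings(local_list):
    """Hypothetical global model: element-wise mean over the parties' embeddings."""
    parties = len(local_list)
    return [[sum(vals) / parties for vals in zip(*rows)] for rows in zip(*local_list)]


def rand_matrix(rng, rows, cols):
    """Random Gaussian matrix, standing in for features or untrained weights."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
num_nodes, hidden = 4, 3

# Two parties hold different edges and different feature columns for the same 4 nodes
# (the vertically partitioned setting).
adj_a = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
adj_b = [[0, 0, 0, 1], [0, 0, 0, 0], [0, 0, 0, 1], [1, 0, 1, 0]]
feats_a = rand_matrix(rng, num_nodes, 5)
feats_b = rand_matrix(rng, num_nodes, 2)
w_a = rand_matrix(rng, 5, hidden)
w_b = rand_matrix(rng, 2, hidden)

h_a = local_embeddings(adj_a, feats_a, w_a)  # computed by party A only
h_b = local_embeddings(adj_b, feats_b, w_b)  # computed by party B only
h_global = global_embeddings([h_a, h_b])     # only embeddings leave each party
```

In this sketch the raw adjacency matrices and features never appear in `global_embeddings`, which is the property the split design relies on; a real system would also need to handle training, label protection, and the attention mechanisms that the abstract attributes to HAT.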
URL
https://arxiv.org/abs/2301.12885