Abstract
Graph Neural Networks (GNNs) achieve excellent performance on graphs, with the core idea of aggregating neighborhood information and learning from labels. However, most graph datasets face two prevailing challenges: insufficient high-quality labels and sparse neighborhoods, both of which weaken GNNs. Existing data augmentation methods designed for these issues typically address only one of them; they may require extensive training of generators, rely on overly simplistic strategies, or demand substantial prior knowledge, leading to suboptimal generalization. To address both challenges simultaneously, we propose an elegant method called IntraMix. IntraMix innovatively applies Mixup among low-quality labeled data of the same class, generating high-quality labeled data at minimal cost. It then establishes neighborhoods for the generated data by connecting them to high-confidence data of the same class, thereby enriching the neighborhoods of the graph. IntraMix efficiently tackles both challenges and counters the prior notion that Mixup is of limited use in node classification. IntraMix serves as a universal framework that can be readily applied to all GNNs. Extensive experiments demonstrate the effectiveness of IntraMix across various GNNs and datasets.
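The two steps the abstract describes — intra-class Mixup over low-quality labeled nodes, then attaching each generated node to high-confidence nodes of the same class — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `intramix_augment`, the Beta-distributed mixing weight, the confidence threshold, and the choice of two attachment neighbors are all assumptions for the sake of the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def intramix_augment(x, y, conf, n_new=2, alpha=1.0, conf_thresh=0.9):
    """Hypothetical sketch of IntraMix-style augmentation.

    x:    (N, d) node feature matrix
    y:    (N,)   (possibly noisy) class labels
    conf: (N,)   model confidence per node, used to pick trusted anchors
    Returns generated features, their labels, and new edges
    (generated_node_id, existing_node_id).
    """
    new_x, new_y, new_edges = [], [], []
    next_id = len(x)
    for c in np.unique(y):
        members = np.flatnonzero(y == c)
        trusted = members[conf[members] >= conf_thresh]
        if len(members) < 2 or len(trusted) == 0:
            continue
        for _ in range(n_new):
            # intra-class Mixup: interpolate two same-class nodes
            i, j = rng.choice(members, size=2, replace=False)
            lam = rng.beta(alpha, alpha)
            new_x.append(lam * x[i] + (1 - lam) * x[j])
            new_y.append(c)
            # connect the synthetic node to high-confidence same-class nodes
            k = min(2, len(trusted))
            for t in rng.choice(trusted, size=k, replace=False):
                new_edges.append((next_id, int(t)))
            next_id += 1
    return np.array(new_x), np.array(new_y), new_edges
```

Because both mixed endpoints share a class, the generated node's label is unambiguous even when individual labels are noisy, which is the intuition behind generating "high-quality labeled data at minimal cost".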
URL
https://arxiv.org/abs/2405.00957