Abstract
In the rapidly evolving field of self-supervised learning on graphs, generative and contrastive methodologies have emerged as two dominant approaches. Our study focuses on masked feature reconstruction (MFR), a generative technique where a model learns to restore the raw features of masked nodes in a self-supervised manner. We observe that both MFR and graph contrastive learning (GCL) aim to maximize agreement between similar elements. Building on this observation, we reveal a novel theoretical insight: under specific conditions, the objectives of MFR and node-level GCL converge, despite their distinct operational mechanisms. This theoretical connection suggests these approaches are complementary rather than fundamentally different, prompting us to explore their integration to enhance self-supervised learning on graphs. Our research presents Contrastive Masked Feature Reconstruction (CORE), a novel graph self-supervised learning framework that integrates contrastive learning into MFR. Specifically, we form positive pairs exclusively between the original and reconstructed features of masked nodes, encouraging the encoder to prioritize contextual information over the node's own features. Additionally, we leverage the masked nodes themselves as negative samples, combining MFR's reconstructive power with GCL's discriminative ability to better capture intrinsic graph structures. Empirically, our proposed framework CORE significantly outperforms MFR across node and graph classification tasks, demonstrating state-of-the-art results. In particular, CORE surpasses GraphMAE and GraphMAE2 by up to 2.80% and 3.72% on node classification tasks, and by up to 3.82% and 3.76% on graph classification tasks.
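The contrastive objective described above can be sketched as an InfoNCE-style loss in which each reconstructed masked-node feature forms a positive pair with its own original feature, and the remaining masked nodes act as negatives. This is a minimal illustrative sketch, not the paper's implementation: the function name `core_contrastive_loss`, the cosine-similarity formulation, and the temperature `tau` are all our assumptions.

```python
import numpy as np

def core_contrastive_loss(h_orig, h_rec, tau=0.5):
    """Hypothetical InfoNCE-style sketch of the CORE objective.

    h_orig: (n_masked, d) original features of the masked nodes.
    h_rec:  (n_masked, d) reconstructed features of the same nodes.
    Each reconstructed feature h_rec[i] is pulled toward h_orig[i]
    (positive pair); the other masked nodes serve as negatives.
    """
    # L2-normalize so dot products become cosine similarities
    h_orig = h_orig / np.linalg.norm(h_orig, axis=1, keepdims=True)
    h_rec = h_rec / np.linalg.norm(h_rec, axis=1, keepdims=True)
    sim = (h_rec @ h_orig.T) / tau               # (n_masked, n_masked)
    sim = sim - sim.max(axis=1, keepdims=True)   # numerical stability
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    # Positives lie on the diagonal: reconstructed i vs. original i
    return -np.mean(np.diag(log_prob))
```

Under this sketch, a perfect reconstruction yields the lowest loss, and the loss grows as reconstructed features drift from the originals, which is the discriminative pressure the abstract attributes to using masked nodes as negatives.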
URL
https://arxiv.org/abs/2512.13235