Abstract
Multi-view learning has become a popular research topic in recent years, but work at the intersection of classic multi-label classification and multi-view learning is still in its early stages. In this paper, we focus on the complex yet highly realistic task of incomplete multi-view weak multi-label learning, in which both views and labels may be partially missing, and propose a masked two-channel decoupling framework based on deep neural networks to solve it. The core innovation of our method is to decouple the single-channel view-level representation, which is common in deep multi-view learning methods, into a shared representation and a view-proprietary representation. We also design a cross-channel contrastive loss to enhance the semantic properties of the two channels. Additionally, we exploit supervised information to design a label-guided graph regularization loss that helps the extracted embedding features preserve the geometric structure among samples. Inspired by the success of masking mechanisms in image and text analysis, we develop a random fragment masking strategy for vector features to improve the learning ability of the encoders. Finally, our model is fully adaptable to arbitrary view and label absences while also performing well on ideal complete data. Extensive experiments confirm the effectiveness and advantages of our model.
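To make the random fragment masking idea concrete, here is a minimal sketch of what masking a contiguous fragment of a feature vector could look like. The abstract does not specify the details, so everything here is an assumption made for illustration: the function name `mask_random_fragment`, the `mask_ratio` parameter, the use of a single contiguous fragment, and the choice of zero as the fill value are all hypothetical and may differ from the authors' method.

```python
import random


def mask_random_fragment(features, mask_ratio=0.25, rng=None):
    """Zero out one randomly placed contiguous fragment of a feature vector.

    Illustrative assumptions (not taken from the paper): a single fragment,
    a fixed mask_ratio, and zero as the fill value.
    Returns the masked copy and the (start, end) indices of the fragment.
    """
    if not 0.0 <= mask_ratio <= 1.0:
        raise ValueError("mask_ratio must be in [0, 1]")
    rng = rng or random.Random()
    n = len(features)
    frag_len = int(n * mask_ratio)  # number of entries to hide
    if frag_len == 0:
        return list(features), (0, 0)
    # randint is inclusive on both ends, so the fragment always fits.
    start = rng.randint(0, n - frag_len)
    masked = list(features)  # copy so the input is left untouched
    masked[start:start + frag_len] = [0.0] * frag_len
    return masked, (start, start + frag_len)


x = [0.5, 1.2, -0.3, 0.8, 2.1, -1.4, 0.9, 0.1]
masked, (lo, hi) = mask_random_fragment(x, mask_ratio=0.25, rng=random.Random(0))
print(masked, (lo, hi))
```

In a training loop, a masked vector like this would be fed to the view encoder in place of the original features, so the encoder has to cope with locally missing input; how the masked input enters the loss is not described in the abstract and is left out of this sketch.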
URL
https://arxiv.org/abs/2404.17340