Paper Reading AI Learner

Masked Two-channel Decoupling Framework for Incomplete Multi-view Weak Multi-label Learning

2024-04-26 11:39:50
Chengliang Liu, Jie Wen, Yabo Liu, Chao Huang, Zhihao Wu, Xiaoling Luo, Yong Xu

Abstract

Multi-view learning has become a popular research topic in recent years, but research at the intersection of classic multi-label classification and multi-view learning is still in its early stages. In this paper, we focus on the complex yet highly realistic task of incomplete multi-view weak multi-label learning and propose a masked two-channel decoupling framework based on deep neural networks to solve this problem. The core innovation of our method lies in decoupling the single-channel view-level representation, common in deep multi-view learning methods, into a shared representation and a view-proprietary representation. We also design a cross-channel contrastive loss to enhance the semantic properties of the two channels. Additionally, we exploit supervised information to design a label-guided graph regularization loss, which helps the extracted embedding features preserve the geometric structure among samples. Inspired by the success of masking mechanisms in image and text analysis, we develop a random fragment masking strategy for vector features to improve the learning ability of the encoders. Finally, it is worth emphasizing that our model fully adapts to arbitrary patterns of missing views and missing labels, while also performing well on complete data. We have conducted extensive and convincing experiments to confirm the effectiveness and superiority of our model.
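The masking and decoupling steps described above can be sketched in a few lines. The sketch below is a minimal illustration only, not the authors' implementation: the fragment length, mask ratio, and linear projections are hypothetical stand-ins for the paper's deep encoders and hyperparameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_fragment_mask(x, fragment_len=4, mask_ratio=0.3, rng=rng):
    """Zero out random contiguous fragments of a feature vector.

    Mirrors the idea of masking vector features in contiguous
    fragments (rather than independent coordinates) so the encoder
    must learn to cope with hidden parts of the input. Parameter
    values here are illustrative, not taken from the paper.
    """
    x = x.copy()
    d = x.shape[-1]
    n_fragments = int(np.ceil(mask_ratio * d / fragment_len))
    for _ in range(n_fragments):
        start = rng.integers(0, d - fragment_len + 1)  # high is exclusive
        x[..., start:start + fragment_len] = 0.0
    return x

def encode_two_channel(x, W_shared, W_private):
    """Decouple a view-level input into a cross-view shared embedding
    and a view-proprietary embedding (linear stand-ins for the
    paper's deep encoders)."""
    return x @ W_shared, x @ W_private

# Toy example: one sample with a 16-dimensional view feature.
x = rng.normal(size=(1, 16))
x_masked = random_fragment_mask(x)
W_s = rng.normal(size=(16, 8))
W_p = rng.normal(size=(16, 8))
z_shared, z_private = encode_two_channel(x_masked, W_s, W_p)
```

In a full model, the shared embeddings from different views would feed the cross-channel contrastive loss, while the masked reconstruction pressure is what strengthens the encoders.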

URL

https://arxiv.org/abs/2404.17340

PDF

https://arxiv.org/pdf/2404.17340.pdf

