Paper Reading AI Learner

Multi-view Feature Extraction based on Triple Contrastive Heads

2023-03-22 14:56:51
Hongjie Zhang

Abstract

Multi-view feature extraction is an efficient approach for alleviating the issue of dimensionality in highdimensional multi-view data. Contrastive learning (CL), which is a popular self-supervised learning method, has recently attracted considerable attention. In this study, we propose a novel multi-view feature extraction method based on triple contrastive heads, which combines the sample-, recovery- , and feature-level contrastive losses to extract the sufficient yet minimal subspace discriminative information in compliance with information bottleneck principle. In MFETCH, we construct the feature-level contrastive loss, which removes the redundent information in the consistency information to achieve the minimality of the subspace discriminative information. Moreover, the recovery-level contrastive loss is also constructed in MFETCH, which captures the view-specific discriminative information to achieve the sufficiency of the subspace discriminative information.The numerical experiments demonstrate that the proposed method offers a strong advantage for multi-view feature extraction.

Abstract (translated)

多视图特征提取是一种高效的方法,以解决高维多视图数据中的维度问题。对比学习(CL)是一种常见的自监督学习方法,最近引起了广泛关注。在本研究中,我们提出了基于三对比头的新多视图特征提取方法,该方法将样本、恢复和特征级别的对比损失相结合,以提取符合信息瓶颈原则的足够但最小的子空间相关特征。在MFETCH中,我们建立了特征级别的对比损失,该损失消除了一致性信息中的冗余信息,以实现子空间相关特征的最小化。此外,在MFETCH中,我们还建立了恢复级别的对比损失,该损失捕捉视图特定的相关特征,以实现子空间相关特征的足够。数值实验表明,该方法为多视图特征提取提供了强有力的优势。

URL

https://arxiv.org/abs/2303.12615

PDF

https://arxiv.org/pdf/2303.12615.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot