MaskCon: Masked Contrastive Learning for Coarse-Labelled Dataset

2023-03-22 17:08:31
Chen Feng, Ioannis Patras

Abstract

Deep learning has achieved great success in recent years with the aid of advanced neural network structures and large-scale human-annotated datasets. However, it is often costly and difficult to annotate large-scale datasets accurately and efficiently, especially in specialized domains where fine-grained labels are required. In this setting, coarse labels are much easier to acquire as they do not require expert knowledge. In this work, we propose a contrastive learning method, called Masked Contrastive learning (MaskCon), to address this under-explored problem setting, in which we learn from a coarse-labelled dataset in order to solve a finer labelling problem. More specifically, within the contrastive learning framework, for each sample our method generates soft labels, with the aid of coarse labels, against other samples and another augmented view of the sample in question. In contrast to self-supervised contrastive learning, where only the sample's augmentations are considered hard positives, and supervised contrastive learning, where only samples with the same coarse labels are considered hard positives, we propose soft labels based on sample distances that are masked by the coarse labels. This allows us to utilize both inter-sample relations and coarse labels. We demonstrate that many existing state-of-the-art methods can be obtained as special cases of our method and that it provides tighter bounds on the generalization error. Experimentally, our method achieves significant improvement over the current state-of-the-art on various datasets, including CIFAR10, CIFAR100, ImageNet-1K, Stanford Online Products and Stanford Cars196. Code and annotations are available at this https URL.
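
The soft-label idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the temperature, the weight w between the augmented view and the other samples, and the plain softmax normalization are all assumptions made for illustration, and the paper's exact formulation may differ. The sketch computes similarities between a query and other samples, keeps only those sharing the query's coarse label, and normalizes them into soft labels that then serve as targets in a contrastive cross-entropy.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def masked_soft_labels(query, keys, query_coarse, key_coarse, temperature=0.1):
    """Soft labels over `keys`: a softmax of similarities to `query`,
    restricted (masked) to keys that share the query's coarse label.
    `temperature` is an assumed hyperparameter."""
    same = [i for i, c in enumerate(key_coarse) if c == query_coarse]
    labels = [0.0] * len(keys)
    if not same:
        return labels  # no key shares the coarse label: all soft labels stay 0
    probs = softmax([dot(query, keys[i]) / temperature for i in same])
    for i, p in zip(same, probs):
        labels[i] = p
    return labels

def soft_contrastive_loss(query, view, keys, soft, w=0.5, temperature=0.1):
    """Cross-entropy between the target [w, (1 - w) * soft] and the softmax
    over similarities to [view, *keys]. The weight `w` on the augmented view
    is an assumption of this sketch."""
    logits = [dot(query, k) / temperature for k in [view] + keys]
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    target = [w] + [(1 - w) * s for s in soft]
    return -sum(t * (l - log_z) for t, l in zip(target, logits))

# Toy unit-norm embeddings; the coarse labels are hypothetical.
query = [1.0, 0.0]
view = [0.8, 0.6]                      # another augmented view of the query
keys = [[0.6, 0.8], [0.0, 1.0], [1.0, 0.0]]
key_coarse = ["dog", "dog", "cat"]

soft = masked_soft_labels(query, keys, "dog", key_coarse)
loss = soft_contrastive_loss(query, view, keys, soft)
```

In this toy example the third key is the most similar to the query but carries a different coarse label, so its soft label is exactly zero. The two remaining keys share the soft-label mass in proportion to their similarity.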

URL

https://arxiv.org/abs/2303.12756

PDF

https://arxiv.org/pdf/2303.12756.pdf

