CAFS: Class Adaptive Framework for Semi-Supervised Semantic Segmentation

2023-03-21 05:56:53
Jingi Ju, Hyeoncheol Noh, Yooseung Wang, Minseok Seo, Dong-Geol Choi

Abstract

Semi-supervised semantic segmentation learns a model for classifying pixels into specific classes using a few labeled samples and numerous unlabeled images. The recent leading approach is consistency regularization by self-training, which pseudo-labels the high-confidence pixels of unlabeled images. However, using only high-confidence pixels for self-training may discard much of the information in the unlabeled dataset, because modern deep learning networks are poorly calibrated. In this paper, we propose a class-adaptive semi-supervision framework for semi-supervised semantic segmentation (CAFS) to cope with the information loss that occurs in existing high-confidence-based pseudo-labeling methods. Unlike existing semi-supervised semantic segmentation frameworks, CAFS constructs a validation set from the labeled dataset in order to leverage the calibration performance of each class. On this basis, we propose calibration-aware class-wise adaptive thresholding and class-wise adaptive oversampling, both driven by the analysis results from the validation set. Our proposed CAFS achieves state-of-the-art performance, by significant margins, on the full data partition of the base PASCAL VOC 2012 dataset and on the 1/4 data partition of the Cityscapes dataset, scoring 83.0% and 80.4%, respectively. The code is available at this https URL.
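
As a rough illustration of the class-wise thresholding idea described in the abstract, the sketch below picks, for each class, the lowest confidence threshold at which predictions of that class reach a target accuracy on a held-out validation set, then uses those per-class thresholds to filter pseudo-labels. This is not the authors' implementation: the function names, the `target_acc` criterion, the candidate grid, and the `IGNORE_INDEX` value are all assumptions made for the example, and the oversampling component is not covered.

```python
import numpy as np

IGNORE_INDEX = 255  # assumed "ignore" label; a common convention, not taken from the paper


def classwise_thresholds(val_probs, val_labels, target_acc=0.9, candidates=None):
    """Pick one confidence threshold per class from validation data.

    val_probs:  (N, C) softmax outputs for N validation pixels.
    val_labels: (N,) ground-truth class indices.
    For each class, returns the smallest candidate threshold at which the
    pixels predicted as that class are correct at least `target_acc` of the
    time. Falls back to the largest candidate if no threshold qualifies.
    """
    if candidates is None:
        candidates = np.linspace(0.5, 0.95, 10)  # hypothetical search grid
    num_classes = val_probs.shape[1]
    pred = val_probs.argmax(axis=1)
    conf = val_probs.max(axis=1)
    thresholds = np.full(num_classes, candidates[-1], dtype=float)
    for c in range(num_classes):
        predicted_c = pred == c
        for t in candidates:  # candidates are assumed sorted ascending
            kept = predicted_c & (conf >= t)
            if kept.any() and (val_labels[kept] == c).mean() >= target_acc:
                thresholds[c] = t
                break
    return thresholds


def pseudo_label(probs, thresholds):
    """Keep a predicted class only where its confidence clears that class's threshold."""
    pred = probs.argmax(axis=1)
    conf = probs.max(axis=1)
    return np.where(conf >= thresholds[pred], pred, IGNORE_INDEX)
```

In this toy setup a well-calibrated class ends up with a low threshold and so contributes more pseudo-labeled pixels, while a poorly calibrated class is held to a stricter one, which is the intuition behind replacing a single global confidence cutoff.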

URL

https://arxiv.org/abs/2303.11606

PDF

https://arxiv.org/pdf/2303.11606.pdf

