Paper Reading AI Learner

All Points Matter: Entropy-Regularized Distribution Alignment for Weakly-supervised 3D Segmentation

2023-05-25 08:19:31
Liyao Tang, Zhe Chen, Shanshan Zhao, Chaoyue Wang, Dacheng Tao

Abstract

Pseudo-labels are widely employed in weakly supervised 3D segmentation tasks, where only sparse ground-truth labels are available for learning. Existing methods often rely on empirical label selection strategies, such as confidence thresholding, to generate beneficial pseudo-labels for model training. This approach may, however, hinder the comprehensive exploitation of unlabeled data points. We hypothesize that this selective usage arises from the noise in pseudo-labels generated on unlabeled data. The noise in pseudo-labels may result in significant discrepancies between pseudo-labels and model predictions, thus confusing and greatly hindering model training. To address this issue, we propose a novel learning strategy that regularizes the generated pseudo-labels and effectively narrows the gap between pseudo-labels and model predictions. More specifically, our method introduces an Entropy Regularization loss and a Distribution Alignment loss for weakly supervised learning in 3D segmentation tasks, resulting in an ERDA learning strategy. Interestingly, when the distribution alignment loss is formulated with the KL divergence, it reduces to a deceptively simple cross-entropy-based loss that optimizes both the pseudo-label generation network and the 3D segmentation network simultaneously. Despite its simplicity, our method markedly improves performance. We validate its effectiveness through extensive experiments on various baselines and large-scale datasets. Results show that ERDA enables the effective use of all unlabeled data points for learning and achieves state-of-the-art performance under different settings. Remarkably, our method can outperform fully-supervised baselines using only 1% of the true annotations. Code and models will be made publicly available.
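The reduction the abstract alludes to follows from a standard identity: if the Distribution Alignment loss is taken as KL(p ‖ q) between the pseudo-label distribution p and the model prediction q, and the Entropy Regularization loss as the entropy H(p), their sum equals the cross-entropy H(p, q). This is a minimal numerical sketch of that identity (the variable names and the specific distributions are illustrative, not from the paper):

```python
import numpy as np

def entropy(p):
    """Shannon entropy H(p) = -sum p log p."""
    return -np.sum(p * np.log(p))

def kl(p, q):
    """KL divergence KL(p || q) = sum p log(p / q)."""
    return np.sum(p * np.log(p / q))

def cross_entropy(p, q):
    """Cross-entropy H(p, q) = -sum p log q."""
    return -np.sum(p * np.log(q))

# Hypothetical per-point class distributions over 3 classes.
p = np.array([0.7, 0.2, 0.1])  # pseudo-label distribution
q = np.array([0.5, 0.3, 0.2])  # segmentation network prediction

# ER + DA (with KL) collapses to a single cross-entropy term:
# H(p) + KL(p || q) = -sum p log p + sum p (log p - log q) = -sum p log q
erda = entropy(p) + kl(p, q)
assert np.isclose(erda, cross_entropy(p, q))
```

Because the combined loss is just H(p, q), its gradient flows through both p (the pseudo-label generator) and q (the segmentation network), which is consistent with the abstract's claim that one simple loss optimizes both networks simultaneously.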

URL

https://arxiv.org/abs/2305.15832

PDF

https://arxiv.org/pdf/2305.15832.pdf

