Balanced Residual Distillation Learning for 3D Point Cloud Class-Incremental Semantic Segmentation

2024-08-02 16:09:06
Yuanzhi Su, Siyuan Chen, Yuan-Gen Wang

Abstract

Class-incremental learning (CIL) has thrived because it handles a continual influx of information, learning from newly added classes while preventing catastrophic forgetting of the old ones. A performance breakthrough in CIL depends on effectively refining past knowledge from the base model and balancing it with new learning. However, this issue has not yet been considered in current research. In this work, we explore the potential of CIL from these two perspectives and propose a novel balanced residual distillation framework (BRD-CIL) to raise the performance of CIL. Specifically, BRD-CIL introduces a residual distillation learning strategy that dynamically expands the network structure to capture the residuals between the base and target models, effectively refining the past knowledge. Furthermore, BRD-CIL introduces a balanced pseudo-label learning strategy that generates a guidance mask to reduce the preference for old classes, ensuring balanced learning from new and old classes. We apply BRD-CIL to a challenging 3D point cloud semantic segmentation task in which the data are unordered and unstructured. Extensive experimental results demonstrate that BRD-CIL sets a new benchmark with an outstanding balance capability in class-biased scenarios.
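The abstract does not say how the guidance mask is computed, so the following is only a minimal sketch of the general idea of mask-guided pseudo-labeling, not the BRD-CIL algorithm. It assumes a simple confidence rule: points annotated with a new class keep their label, unannotated points take the base model's most likely old class only when its probability reaches a threshold, and all other points are masked out of the loss. The function name `masked_pseudo_labels`, the `IGNORE` label, and the `threshold` value are hypothetical choices made for illustration.

```python
IGNORE = -1  # hypothetical label for points excluded from the loss


def masked_pseudo_labels(new_labels, old_probs, threshold=0.7):
    """Merge new-class annotations with pseudo-labels from a base model.

    new_labels: one entry per point; a new-class id where the point is
                annotated in the current step, otherwise None.
    old_probs:  one list per point with the base model's probabilities
                over the old classes (index = old-class id).
    Returns (labels, mask); mask[i] is True if point i contributes to
    the loss. The confidence rule is an assumption, not the paper's.
    """
    labels, mask = [], []
    for gt, probs in zip(new_labels, old_probs):
        if gt is not None:
            # Ground-truth annotation for a new class always wins.
            labels.append(gt)
            mask.append(True)
            continue
        # Most likely old class according to the base model.
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] >= threshold:
            labels.append(best)
            mask.append(True)
        else:
            # Low-confidence prediction: drop it so that uncertain
            # old-class guesses do not dominate training.
            labels.append(IGNORE)
            mask.append(False)
    return labels, mask


# Two old classes (ids 0 and 1) and one new class (id 2).
new_labels = [2, None, None, None]
old_probs = [[0.9, 0.1], [0.8, 0.2], [0.55, 0.45], [0.1, 0.9]]
labels, mask = masked_pseudo_labels(new_labels, old_probs)
print(labels)  # [2, 0, -1, 1]
print(mask)    # [True, True, False, True]
```

In this toy input the third point is dropped because the base model is undecided between the two old classes, which is one simple way a mask can limit the influence of old-class pseudo-labels.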

URL

https://arxiv.org/abs/2408.01356

PDF

https://arxiv.org/pdf/2408.01356.pdf

