Paper Reading AI Learner

Continual Human Pose Estimation for Incremental Integration of Keypoints and Pose Variations

2024-09-30 16:29:30
Muhammad Saif Ullah Khan, Muhammad Ahmed Ullah Khan, Muhammad Zeshan Afzal, Didier Stricker

Abstract

This paper reformulates cross-dataset human pose estimation as a continual learning task, aiming to integrate new keypoints and pose variations into existing models without losing accuracy on previously learned datasets. We benchmark this formulation against established regularization-based methods for mitigating catastrophic forgetting, including EWC, LFL, and LwF. Moreover, we propose a novel regularization method called Importance-Weighted Distillation (IWD), which enhances conventional LwF by introducing a layer-wise distillation penalty and dynamic temperature adjustment based on layer importance for previously learned knowledge. This allows for a controlled adaptation to new tasks that respects the stability-plasticity balance critical in continual learning. Through extensive experiments across three datasets, we demonstrate that our approach outperforms existing regularization-based continual learning strategies. IWD shows an average improvement of 3.60\% over the state-of-the-art LwF method. The results highlight the potential of our method to serve as a robust framework for real-world applications where models must evolve with new data without forgetting past knowledge.

Abstract (translated)

本文将跨数据集人姿态估计重新建模为一个持续学习任务,旨在将新的关键点和姿态变化集成到现有的模型中,同时不损失之前学习数据的精度。我们通过基准这个公式 against 已经建立的基于正则化的方法来减轻灾难性遗忘,包括 EWC、LFL 和 LwF。此外,我们提出了一种名为 Importance-Weighted Distillation(IWD)的新正则化方法,它通过引入层间蒸馏惩罚和基于层重要性的动态温度调整来增强传统的 LwF。这使得在连续学习过程中对新技术的调整具有可控制性,并尊重连续学习中的稳定性-可塑性平衡。通过三个数据集的广泛实验,我们证明了我们的方法超越了现有的基于正则化的连续学习策略。IWD 显示了与最先进的 LwF 方法相比平均提高了 3.60%。结果强调了我们的方法在真实应用场景中作为具有良好鲁棒性的框架具有潜力。

URL

https://arxiv.org/abs/2409.20469

PDF

https://arxiv.org/pdf/2409.20469.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model LLM Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Robot Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Time_Series Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot