Enhancing Action Recognition from Low-Quality Skeleton Data via Part-Level Knowledge Distillation

2024-04-28 14:58:54
Cuiwei Liu, Youzhi Jiang, Chong Du, Zhaokui Li


Skeleton-based action recognition is vital for comprehending human-centric videos and has applications in diverse domains. One of its challenges is dealing with low-quality data, such as skeletons with missing or inaccurate joints. This paper addresses the problem of enhancing action recognition from low-quality skeletons through a general knowledge distillation framework. The proposed framework employs a teacher-student setup, in which a teacher model trained on high-quality skeletons guides the learning of a student model that handles low-quality skeletons. To bridge the gap between heterogeneous high-quality and low-quality skeletons, we present a novel part-based skeleton matching strategy, which exploits shared body parts to facilitate local action pattern learning. An action-specific part matrix is developed to emphasize the critical parts for different actions, enabling the student model to distill discriminative part-level knowledge. A novel part-level multi-sample contrastive loss transfers knowledge from multiple high-quality skeletons to low-quality ones, which allows the proposed framework to train on low-quality skeletons that lack corresponding high-quality matches. Comprehensive experiments conducted on the NTU-RGB+D, Penn Action, and SYSU 3D HOI datasets demonstrate the effectiveness of the proposed knowledge distillation framework.
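The abstract does not give the form of the part-level multi-sample contrastive loss, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes an InfoNCE-style loss in which every teacher (high-quality) sample of the same action class counts as a positive for a student (low-quality) sample, computed separately for each body part and weighted by a per-action part weight. The function name `part_contrastive_loss`, the helper `cosine`, the `temperature` value, and the layout of `part_weights` are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def part_contrastive_loss(student, s_labels, teacher, t_labels,
                          part_weights, temperature=0.1):
    """Hypothetical part-level, multi-positive contrastive loss.

    student[i][p] / teacher[j][p]: feature vector of body part p for
        low-quality sample i / high-quality sample j.
    part_weights[c][p]: assumed importance of part p for action class c.
    """
    total, counted = 0.0, 0
    for s_feat, s_lab in zip(student, s_labels):
        # Positives are ALL teacher samples of the same action class, so a
        # student sample needs no one-to-one high-quality counterpart.
        positives = [j for j, lab in enumerate(t_labels) if lab == s_lab]
        if not positives:
            continue  # no same-class teacher sample available
        sample_loss = 0.0
        for p, s_part in enumerate(s_feat):
            logits = [cosine(s_part, t_feat[p]) / temperature
                      for t_feat in teacher]
            # Numerically stable log-sum-exp over all teacher samples.
            m = max(logits)
            log_z = m + math.log(sum(math.exp(x - m) for x in logits))
            # Mean negative log-probability over the positive teachers.
            part_loss = -sum(logits[j] - log_z for j in positives) / len(positives)
            sample_loss += part_weights[s_lab][p] * part_loss
        total += sample_loss
        counted += 1
    return total / counted if counted else 0.0
```

Under these assumptions the loss is small when a student's part features align with same-class teacher features and large when they align with another class. Student samples with no same-class teacher in the batch are skipped rather than penalized.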



