Probabilistic Restoration with Adaptive Noise Sampling for 3D Human Pose Estimation

2024-05-03 14:14:27
Xianzhou Zeng, Hao Qin, Ming Kong, Luyuan Chen, Qiang Zhu

Abstract

The accuracy and robustness of 3D human pose estimation (HPE) are limited by 2D pose detection errors and by the ill-posed nature of 2D-to-3D lifting, which has drawn considerable attention to multi-hypothesis HPE (MH-HPE) research. Most existing MH-HPE methods are based on generative models, which are computationally expensive and difficult to train. In this study, we propose a Probabilistic Restoration 3D Human Pose Estimation framework (PRPose) that can be integrated with any lightweight single-hypothesis model. Specifically, PRPose employs a weakly supervised approach to fit the hidden probability distribution of the 2D-to-3D lifting process in the single-hypothesis HPE model, and then reverse-maps that distribution to the 2D pose input through an adaptive noise sampling strategy to generate reasonable multi-hypothesis samples effectively. Extensive experiments on 3D HPE benchmarks (Human3.6M and MPI-INF-3DHP) highlight the effectiveness and efficiency of PRPose. Code is available at: this https URL.
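
The general idea of turning a single-hypothesis lifter into a multi-hypothesis one by injecting noise at the 2D input can be illustrated with a minimal sketch. This is not the PRPose implementation: `toy_lifter`, `sample_hypotheses`, `mean_pose`, and the fixed per-joint noise scales in `sigmas` are hypothetical names and simplifications introduced here. In particular, the abstract describes the noise distribution as fitted with weak supervision, whereas this sketch simply takes the scales as given.

```python
import random
from typing import Callable, List, Sequence, Tuple

Pose2D = List[Tuple[float, float]]
Pose3D = List[Tuple[float, float, float]]


def toy_lifter(pose_2d: Pose2D) -> Pose3D:
    """Hypothetical stand-in for a trained single-hypothesis 2D-to-3D lifter."""
    # The depth value is a made-up function of (x, y), for illustration only.
    return [(x, y, 0.5 * x + 0.25 * y) for x, y in pose_2d]


def sample_hypotheses(
    pose_2d: Pose2D,
    sigmas: Sequence[float],
    lifter: Callable[[Pose2D], Pose3D],
    num_hypotheses: int,
    seed: int = 0,
) -> List[Pose3D]:
    """Perturb the 2D input with per-joint Gaussian noise and lift each copy."""
    if len(sigmas) != len(pose_2d):
        raise ValueError("need one noise scale per joint")
    rng = random.Random(seed)
    hypotheses = []
    for _ in range(num_hypotheses):
        # Assumption: isotropic noise per joint; a larger sigma means the
        # 2D detection of that joint is treated as less reliable.
        noisy = [
            (x + rng.gauss(0.0, s), y + rng.gauss(0.0, s))
            for (x, y), s in zip(pose_2d, sigmas)
        ]
        hypotheses.append(lifter(noisy))
    return hypotheses


def mean_pose(hypotheses: List[Pose3D]) -> Pose3D:
    """Aggregate the hypotheses by averaging them joint by joint."""
    n = len(hypotheses)
    return [
        tuple(sum(h[j][k] for h in hypotheses) / n for k in range(3))
        for j in range(len(hypotheses[0]))
    ]


pose = [(0.0, 0.0), (1.0, 2.0)]
hyps = sample_hypotheses(pose, sigmas=[0.0, 0.1], lifter=toy_lifter, num_hypotheses=5)
aggregated = mean_pose(hyps)
```

In this toy setup the first joint has zero noise, so it is identical across all hypotheses, while the second joint varies from sample to sample. Averaging is only one possible way to aggregate hypotheses and is used here for simplicity.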

URL

https://arxiv.org/abs/2405.02114

PDF

https://arxiv.org/pdf/2405.02114.pdf
