Paper Reading AI Learner

LWIRPOSE: A novel LWIR Thermal Image Dataset and Benchmark

2024-04-16 01:49:35
Avinash Upadhyay, Bhipanshu Dhupar, Manoj Sharma, Ankit Shukla, Ajith Abraham

Abstract

Human pose estimation faces hurdles in real-world applications due to factors like lighting changes, occlusions, and cluttered environments. We introduce a unique RGB-Thermal Nearly Paired and Annotated 2D Pose Dataset, comprising over 2,400 high-quality LWIR (thermal) images. Each image is meticulously annotated with 2D human poses, offering a valuable resource for researchers and practitioners. This dataset, captured from seven actors performing diverse everyday activities like sitting, eating, and walking, facilitates pose estimation on occlusion and other challenging scenarios. We benchmark state-of-the-art pose estimation methods on the dataset to showcase its potential, establishing a strong baseline for future research. Our results demonstrate the dataset's effectiveness in promoting advancements in pose estimation for various applications, including surveillance, healthcare, and sports analytics. The dataset and code are available at this https URL

Abstract (translated)

由于因素如光照变化、遮挡和杂乱环境,人体姿态估计在现实应用中面临挑战。我们引入了一个独特的RGB-Thermal Nearly Paired和注释的2D人体姿态数据集,包括超过2400张高质量的LWIR(热)图像。每张图像都精心注释了2D人体姿态,为研究者和技术人员提供了一个宝贵的资源。这个数据集从七名演员在多样日常活动(如坐、吃、走)中拍摄获取,有助于在遮挡和其他具有挑战性的场景中进行姿态估计。我们在数据集上对最先进的姿态估计方法进行基准,以展示其潜力,并为未来的研究建立了一个强大的基线。我们的结果表明,该数据集在推动各种应用中人体姿态估计的进步方面非常有效,包括监视、医疗和体育分析。数据集和代码都可以在以下链接中找到:https://www.链接

URL

https://arxiv.org/abs/2404.10212

PDF

https://arxiv.org/pdf/2404.10212.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model LLM Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Robot Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot