Amur Tiger Re-identification in the Wild

2019-06-13 10:16:30
Shuyuan Li, Jianguo Li, Weiyao Lin, Hanlin Tang

Abstract

Monitoring the population and movements of endangered species is an important task for wildlife conservation. Traditional tagging methods do not scale to large populations, while applying computer vision methods to camera sensor data requires re-identification (re-ID) algorithms to obtain accurate counts and movement trajectories of wildlife. However, existing re-ID methods largely target persons and cars, which have limited pose variation and constrained capture environments. This paper tries to fill the gap by introducing a novel large-scale dataset, the Amur Tiger Re-identification in the Wild (ATRW) dataset. ATRW contains over 8,000 video clips from 92 Amur tigers, with bounding box, pose keypoint, and tiger identity annotations. In contrast to typical re-ID datasets, the tigers are captured in a diverse set of unconstrained poses and lighting conditions. We demonstrate with a set of baseline algorithms that ATRW is a challenging dataset for re-ID. Lastly, we propose a novel method for tiger re-identification, which introduces precise pose part modeling in deep neural networks to handle the large pose variation of tigers, and achieves a notable performance improvement over existing re-ID methods. The dataset will be publicly available at https://cvwc2019.github.io/ .

URL

https://arxiv.org/abs/1906.05586

PDF

https://arxiv.org/pdf/1906.05586.pdf

