CCTV-Gun: Benchmarking Handgun Detection in CCTV Images

2023-03-19 16:17:35
Srikar Yellapragada, Zhenghong Li, Kevin Bhadresh Doshi, Purva Makarand Mhasakar, Heng Fan, Jie Wei, Erik Blasch, Haibin Ling

Abstract

Gun violence is a critical security problem, and it is imperative for the computer vision community to develop effective gun detection algorithms for real-world scenarios, particularly in Closed Circuit Television (CCTV) surveillance data. Despite significant progress in visual object detection, detecting guns in real-world CCTV images remains a challenging and under-explored task. Firearms, especially handguns, are typically very small, non-salient in appearance, and often severely occluded or indistinguishable from other small objects. Additionally, the lack of principled benchmarks and the difficulty of collecting relevant datasets further hinder algorithmic development. In this paper, we present a meticulously crafted and annotated benchmark, called CCTV-Gun, which addresses the challenges of detecting handguns in real-world CCTV images. Our contribution is three-fold. First, we carefully select and analyze real-world CCTV images from three datasets, manually annotate handguns and their holders, and label each image with relevant challenge factors such as blur and occlusion. Second, in addition to the standard intra-dataset protocol, we propose a new cross-dataset evaluation protocol, which is vital for gun detection in practical settings. Finally, we comprehensively evaluate both classical and state-of-the-art object detection algorithms, providing an in-depth analysis of their generalization abilities. The benchmark will facilitate further research and development on this topic and ultimately enhance security. Code, annotations, and trained models are available through the link on the paper's arXiv page.
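The difference between the intra-dataset and cross-dataset protocols mentioned in the abstract can be illustrated with a minimal sketch. It assumes the simplest reading of "cross-dataset": train on one source and test on a different one. The abstract does not name the three datasets or say how they are paired, so `dataset_a`, `dataset_b`, `dataset_c`, and the helper `evaluation_pairs` are placeholders, not the authors' actual setup.

```python
from itertools import product

# Placeholder names: the abstract mentions three source datasets
# but does not name them, so these identifiers are hypothetical.
DATASETS = ["dataset_a", "dataset_b", "dataset_c"]


def evaluation_pairs(datasets):
    """Split all (train, test) dataset pairs into intra- and cross-dataset groups."""
    intra, cross = [], []
    for train, test in product(datasets, repeat=2):
        # Same source for training and testing -> intra-dataset;
        # different sources -> cross-dataset (assumed interpretation).
        (intra if train == test else cross).append((train, test))
    return intra, cross


intra_pairs, cross_pairs = evaluation_pairs(DATASETS)
for train, test in cross_pairs:
    print(f"train on {train}, test on {test}")
```

Under this assumption, a detector that scores well on the intra-dataset pairs but poorly on the cross-dataset pairs is overfitting to one camera setup, which is the kind of generalization gap the benchmark is meant to expose. The paper's real protocol may combine sources differently.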

URL

https://arxiv.org/abs/2303.10703

PDF

https://arxiv.org/pdf/2303.10703.pdf

