Privacy-Preserving Video Anomaly Detection: A Survey

2024-11-21 20:29:59
Jing Liu, Yang Liu, Xiaoguang Zhu

Abstract

Video Anomaly Detection (VAD) aims to automatically analyze spatiotemporal patterns in surveillance videos collected from open spaces to detect anomalous events that may cause harm without physical contact. However, vision-based surveillance systems such as closed-circuit television often capture personally identifiable information. The lack of transparency and interpretability in video transmission and usage raises public concerns about privacy and ethics, limiting the real-world application of VAD. Recently, researchers have focused on privacy concerns in VAD by conducting systematic studies from various perspectives including data, features, and systems, making Privacy-Preserving Video Anomaly Detection (P2VAD) a hotspot in the AI community. However, current research in P2VAD is fragmented, and prior reviews have mostly focused on methods using RGB sequences, overlooking privacy leakage and appearance bias considerations. To address this gap, this article systematically reviews the progress of P2VAD for the first time, defining its scope and providing an intuitive taxonomy. We outline the basic assumptions, learning frameworks, and optimization objectives of various approaches, analyzing their strengths, weaknesses, and potential correlations. Additionally, we provide open access to research resources such as benchmark datasets and available code. Finally, we discuss key challenges and future opportunities from the perspectives of AI development and P2VAD deployment, aiming to guide future work in the field.

URL

https://arxiv.org/abs/2411.14565

PDF

https://arxiv.org/pdf/2411.14565.pdf
