SAVCHOI: Detecting Suspicious Activities using Dense Video Captioning with Human Object Interactions

Abstract

Detecting suspicious activities in surveillance videos is a longstanding problem, and one that makes detecting crimes more difficult. The authors propose a novel approach for detecting and summarizing suspicious activities in surveillance videos. They also create ground truth summaries for the UCF-Crime video dataset. Further, they test existing state-of-the-art Dense Video Captioning algorithms on a subset of this dataset and propose a model for the task that uses Human-Object Interaction models for the visual features. They observe that this formulation of Dense Captioning outperforms earlier approaches by a significant margin. The authors also perform an ablative analysis of the dataset and the model and report their findings.

URL

https://arxiv.org/abs/2207.11838

PDF

https://arxiv.org/pdf/2207.11838.pdf