SHD360: A Benchmark Dataset for Salient Human Detection in 360° Videos

2021-05-24 23:51:29

Yi Zhang, Lu Zhang, Jing Zhang, Kang Wang, Wassim Hamidouche, Olivier Deforges

arXiv_CV

arXiv_CV Detection Object_Detection Salient Pose Action

Abstract

Salient human detection (SHD) in dynamic 360° immersive videos is of great importance for applications such as robotics and inter-human and human-object interaction in augmented reality. However, 360° video SHD has seldom been discussed in the computer vision community due to a lack of datasets with large-scale omnidirectional videos and rich annotations. To this end, we propose SHD360, the first 360° video SHD dataset. It covers various real-life daily scenes and provides six-level hierarchical annotations for 6,268 key frames uniformly sampled from 37,403 omnidirectional video frames at 4K resolution. Specifically, each collected key frame is labeled with a super-class, a sub-class, associated attributes (e.g., geometrical distortion), bounding boxes, and per-pixel object-/instance-level masks. As a result, SHD360 contains a total of 16,238 salient human instances with manually annotated pixel-wise ground truth. Since no method has yet been proposed for 360° SHD, we systematically benchmark 11 representative state-of-the-art salient object detection (SOD) approaches on SHD360 and explore key issues derived from extensive experimental results. We hope our proposed dataset and benchmark can serve as a good starting point for advancing human-centric research on 360° panoramic data. Our dataset and benchmark will be made publicly available.
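
To make the six annotation levels named in the abstract concrete (super-class, sub-class, attributes, bounding boxes, object-level masks, instance-level masks), here is a minimal Python sketch of how one key frame's labels might be held in memory. Everything in it is an assumption for illustration only: the names `KeyFrameAnnotation`, `SalientHuman`, and `bbox_from_mask`, the nested-list mask format, and the choice to derive the object-level mask as the union of instance masks are hypothetical and are not taken from the SHD360 release.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Binary mask as rows of 0/1 values (hypothetical in-memory format).
Mask = List[List[int]]


@dataclass
class SalientHuman:
    """One salient human instance: a per-pixel mask plus its bounding box."""
    mask: Mask
    bbox: Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive


@dataclass
class KeyFrameAnnotation:
    """Hypothetical record mirroring the annotation levels in the abstract."""
    super_class: str
    sub_class: str
    attributes: List[str] = field(default_factory=list)
    instances: List[SalientHuman] = field(default_factory=list)

    def object_mask(self) -> Mask:
        """Object-level mask, assumed here to be the union of instance masks."""
        if not self.instances:
            return []
        height = len(self.instances[0].mask)
        width = len(self.instances[0].mask[0])
        merged = [[0] * width for _ in range(height)]
        for inst in self.instances:
            for y, row in enumerate(inst.mask):
                for x, value in enumerate(row):
                    if value:
                        merged[y][x] = 1
        return merged


def bbox_from_mask(mask: Mask) -> Tuple[int, int, int, int]:
    """Tight inclusive bounding box around the non-zero pixels of a mask."""
    points = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not points:
        raise ValueError("mask has no foreground pixels")
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)
```

In this sketch, each instance carries its own mask and box, while the frame-level record holds the class labels and attributes; a frame with several salient humans simply has several entries in `instances`.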

URL

https://arxiv.org/abs/2105.11578

PDF

https://arxiv.org/pdf/2105.11578.pdf