RhythmNet: End-to-end Heart Rate Estimation from Face via Spatial-temporal Representation

2019-10-25 04:03:41

Xuesong Niu, Hu Han, Shiguang Shan, Xilin Chen

arXiv_CV

arXiv_CV CNN Face Relation Pose Emotion

Abstract

Heart rate (HR) is an important physiological signal that reflects the physical and emotional status of a person. Traditional HR measurements usually rely on contact monitors, which may cause inconvenience and discomfort. Recently, some methods have been proposed for remote HR estimation from face videos; however, most of them focus on well-controlled scenarios, and their generalization ability to less-constrained scenarios (e.g., with head movement and bad illumination) is not known. At the same time, the lack of large-scale HR databases has limited the use of deep models for remote HR estimation. In this paper, we propose an end-to-end RhythmNet for remote HR estimation from the face. In RhythmNet, we use a spatial-temporal representation encoding the HR signals from multiple ROI volumes as its input. Then the spatial-temporal representations are fed into a convolutional network for HR estimation. We also take into account the relationship of adjacent HR measurements from a video sequence via a Gated Recurrent Unit (GRU) and achieve efficient HR measurement. In addition, we build a large-scale multi-modal HR database (named VIPL-HR, available at http://vipl.ict.ac.cn/view_database.php?id=15), which contains 2,378 visible light (VIS) videos and 752 near-infrared (NIR) videos of 107 subjects. Our VIPL-HR database covers variations such as head movements, illumination changes, and acquisition device changes, replicating a less-constrained scenario for HR estimation. The proposed approach outperforms the state-of-the-art methods on both the public-domain and our VIPL-HR databases.
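
The abstract does not spell out how the spatial-temporal representation is built, so the following is only a minimal sketch under stated assumptions: each frame is split into a grid of ROI blocks, each block is averaged into one value per frame, and each ROI's signal over time is min-max scaled. The function names, the grid layout, and the scaling step are hypothetical and are not taken from the paper; the convolutional network and GRU stages are omitted.

```python
from typing import List

Frame = List[List[float]]  # one single-channel frame: rows x cols of intensities


def roi_means(frame: Frame, grid_rows: int, grid_cols: int) -> List[float]:
    """Split a frame into grid_rows x grid_cols blocks and average each block.

    Assumes the frame has at least grid_rows rows and grid_cols columns,
    so that no block is empty.
    """
    h, w = len(frame), len(frame[0])
    means = []
    for r in range(grid_rows):
        r0, r1 = r * h // grid_rows, (r + 1) * h // grid_rows
        for c in range(grid_cols):
            c0, c1 = c * w // grid_cols, (c + 1) * w // grid_cols
            block = [frame[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            means.append(sum(block) / len(block))
    return means


def spatial_temporal_map(frames: List[Frame],
                         grid_rows: int = 2,
                         grid_cols: int = 2) -> List[List[float]]:
    """Build a map with one row per ROI and one column per frame.

    Each row is min-max scaled to [0, 1]; a constant signal maps to zeros.
    """
    per_frame = [roi_means(f, grid_rows, grid_cols) for f in frames]
    rows = [list(signal) for signal in zip(*per_frame)]  # transpose to ROI x time
    scaled = []
    for signal in rows:
        lo, hi = min(signal), max(signal)
        span = hi - lo
        scaled.append([(v - lo) / span if span else 0.0 for v in signal])
    return scaled


# Toy input: three 2x2 frames, so each pixel is its own ROI.
frames = [
    [[0.0, 10.0], [20.0, 30.0]],
    [[2.0, 10.0], [20.0, 40.0]],
    [[4.0, 10.0], [20.0, 35.0]],
]
st_map = spatial_temporal_map(frames)
```

In this sketch, `st_map` has one row per ROI and one column per frame; a map of this kind is what a downstream convolutional network would take as input in a pipeline like the one the abstract describes.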

URL

https://arxiv.org/abs/1910.11515

PDF

https://arxiv.org/pdf/1910.11515.pdf
