Neonatal Face and Facial Landmark Detection from Video Recordings

Abstract

This paper explores automated face and facial landmark detection in neonates, an important first step in many video-based neonatal health applications such as vital sign estimation, pain assessment, sleep-wake classification, and jaundice detection. Using three publicly available datasets of neonates in the clinical environment, 366 images (258 subjects) were annotated for training and 89 images (66 subjects) for testing. Transfer learning was applied to two YOLO-based models, with input training images augmented by random horizontal flipping, photometric colour distortion, translation and scaling during each training epoch. Additionally, the re-orientation of input images and the fusion of trained deep learning models were explored. Our proposed model based on YOLOv7Face outperformed existing methods, with a mean average precision of 84.8% for face detection and a normalised mean error of 0.072 for facial landmark detection. Overall, this work will assist in the development of fully automated neonatal health assessment algorithms.
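Two details in the abstract are easy to get wrong in practice: horizontal flipping must also mirror the face box and swap left/right landmarks, and the normalised mean error (NME) depends on the chosen normalisation length. The following is a minimal Python sketch of both, not the authors' implementation. The five-point landmark order, the `FLIP_ORDER` table, the function names, and the use of inter-ocular distance as the normaliser are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

# Assumed (hypothetical) 5-point landmark order, not taken from the paper:
# 0 left eye, 1 right eye, 2 nose, 3 left mouth corner, 4 right mouth corner.
# After mirroring, left/right points must trade places.
FLIP_ORDER = [1, 0, 2, 4, 3]


def hflip_labels(box, landmarks, image_width):
    """Mirror a face box (x1, y1, x2, y2) and (N, 2) landmarks horizontally.

    Coordinates are treated as continuous pixel positions, so x maps to
    image_width - x (an assumption; some pipelines use image_width - 1 - x).
    """
    x1, y1, x2, y2 = box
    # The old right edge becomes the new left edge, and vice versa.
    flipped_box = (image_width - x2, y1, image_width - x1, y2)

    pts = np.asarray(landmarks, dtype=float).copy()
    pts[:, 0] = image_width - pts[:, 0]
    # Re-label so that index 0 is still the "left eye" in the flipped image.
    pts = pts[FLIP_ORDER]
    return flipped_box, pts


def normalised_mean_error(pred, truth, norm_length):
    """Mean Euclidean landmark error divided by a normalisation length."""
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)
    per_point_error = np.linalg.norm(pred - truth, axis=1)
    return float(per_point_error.mean() / norm_length)


# Usage with made-up numbers (not data from the paper).
box = (10.0, 20.0, 50.0, 80.0)
truth = np.array([[20, 30], [40, 30], [30, 45], [22, 60], [38, 60]], dtype=float)
flipped_box, flipped_pts = hflip_labels(box, truth, image_width=100)

pred = truth + np.array([3.0, 4.0])           # every point is off by 5 pixels
inter_ocular = np.linalg.norm(truth[0] - truth[1])
nme = normalised_mean_error(pred, truth, inter_ocular)
```

Under a different normaliser, such as the face box diagonal, the same predictions would yield a different NME, so reported values are only comparable when the normalisation is the same.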

Abstract (translated)

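This paper explores the automatic detection of neonatal faces and facial landmarks, an important first step in many video-based neonatal health applications, such as vital sign estimation, pain assessment, sleep-wake classification and jaundice detection. Using three publicly available datasets of neonates in the clinical environment, 366 images (258 subjects) were annotated for training and 89 images (66 subjects) for testing. Transfer learning was applied to two YOLO-based models; in each training epoch, the input training images were randomly flipped horizontally, photometrically colour-distorted, translated and scaled. In addition, the re-orientation of input images and the fusion of trained deep learning models were explored. The proposed model based on YOLOv7Face outperformed existing methods, with a mean average precision of 84.8% for face detection and a normalised mean error of 0.072 for facial landmark detection. Overall, this will assist the development of fully automated neonatal health assessment algorithms.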

URL

https://arxiv.org/abs/2302.04341

PDF

https://arxiv.org/pdf/2302.04341.pdf