Mathematical Foundation and Corrections for Full Range Head Pose Estimation

Abstract

Numerous works on head pose estimation (HPE) propose algorithms or neural network-based approaches for extracting Euler angles, either from facial key points or directly from images of the head region. However, many of them fail to clearly define the coordinate systems and the order of the Euler or Tait-Bryan angles they use. Rotation matrices depend on the coordinate system, and yaw, roll, and pitch angles are sensitive to the order in which they are applied. Without precise definitions, it is difficult to validate the correctness of the output head pose and of the drawing routines employed in prior works. In this paper, we thoroughly examine the Euler angles defined in the 300W-LP dataset and in head pose estimation methods such as 3DDFA-v2, 6D-RepNet, WHENet, and others, as well as the validity of their routines for drawing the Euler angles. When necessary, we infer their coordinate system and their sequence of yaw, roll, and pitch from the provided code. This paper presents (1) code and algorithms for inferring the coordinate system and the Euler angle application order from provided source code, and for extracting precise rotation matrices and Euler angles, (2) code and algorithms for converting poses from one rotation system to another, (3) novel formulae for 2D augmentations of the rotation matrices, and (4) derivations and code for the correct drawing routines for rotation matrices and poses. This paper also addresses the feasibility of defining rotations with the right-handed coordinate system used in Wikipedia and SciPy, which makes Euler angle extraction much easier for full-range head pose research.
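
The sensitivity to application order mentioned above can be illustrated with a minimal sketch. It is not the paper's code: the axis assignment (pitch about x, yaw about y, roll about z, right-handed), the two composition orders, and all function names are assumptions chosen only for illustration, and the extraction formula applies only to the first order with |yaw| below 90 degrees.

```python
import math

# Assumed convention for this sketch only: right-handed axes,
# pitch about x, yaw about y, roll about z. Angles are in radians.

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_y(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def matmul(a, b):
    # Plain 3x3 matrix product.
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def compose_xyz(pitch, yaw, roll):
    # R = Rx(pitch) * Ry(yaw) * Rz(roll)
    return matmul(rot_x(pitch), matmul(rot_y(yaw), rot_z(roll)))

def compose_zyx(pitch, yaw, roll):
    # R = Rz(roll) * Ry(yaw) * Rx(pitch): same angles, opposite order.
    return matmul(rot_z(roll), matmul(rot_y(yaw), rot_x(pitch)))

def extract_xyz(R):
    # Inverse of compose_xyz, valid while cos(yaw) > 0.
    yaw = math.asin(max(-1.0, min(1.0, R[0][2])))
    pitch = math.atan2(-R[1][2], R[2][2])
    roll = math.atan2(-R[0][1], R[0][0])
    return pitch, yaw, roll

def max_abs_diff(a, b):
    # Largest element-wise difference between two 3x3 matrices.
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))
```

Composing the same three angles with `compose_xyz` and `compose_zyx` yields different matrices, and `extract_xyz` recovers the original angles only from the matrix built in the matching order. Applying it to a matrix built in the other order returns different values without raising any error, which is why an undocumented convention is hard to detect.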

URL

https://arxiv.org/abs/2403.18104

PDF

https://arxiv.org/pdf/2403.18104.pdf