Abstract
Monocular egocentric 3D human motion capture remains a significant challenge, particularly under conditions of low lighting and fast movements, which are common in head-mounted device applications. Existing methods that rely on RGB cameras often fail under these conditions. To address these limitations, we introduce EventEgo3D++, the first approach that leverages a monocular event camera with a fisheye lens for 3D human motion capture. Event cameras excel in high-speed scenarios and varying illumination due to their high temporal resolution, providing reliable cues for accurate 3D human motion capture. EventEgo3D++ leverages the LNES representation of event streams to enable precise 3D reconstructions. We have also developed a mobile head-mounted device (HMD) prototype equipped with an event camera, capturing a comprehensive dataset that includes real event observations from both controlled studio environments and in-the-wild settings, in addition to a synthetic dataset. Additionally, to provide a more holistic dataset, we include allocentric RGB streams that offer different perspectives of the HMD wearer, along with their corresponding SMPL body model. Our experiments demonstrate that EventEgo3D++ achieves superior 3D accuracy and robustness compared to existing solutions, even in challenging conditions. Moreover, our method supports real-time 3D pose updates at a rate of 140 Hz. This work is an extension of the EventEgo3D approach (CVPR 2024) and further advances the state of the art in egocentric 3D human motion capture. For more details, visit the project page.
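The abstract names the LNES representation without describing it. As a rough illustration only, the sketch below follows the common description of LNES (locally-normalised event surfaces) in the event-camera literature: events inside a fixed time window are written into a two-channel image, one channel per polarity, where each pixel stores the window-normalised timestamp of the most recent event at that location. The function name `lnes`, the `(x, y, t, polarity)` column layout, and the window parameters are assumptions made for this example; this is not the authors' implementation.

```python
import numpy as np


def lnes(events, t0, window, height, width):
    """Build a 2-channel LNES-style frame from a time-sorted event array.

    events: array of shape (N, 4) with columns (x, y, t, polarity),
            polarity in {0, 1}; this layout is an assumption of the sketch.
    t0:     start time of the window.
    window: window length, in the same units as t.
    """
    frame = np.zeros((2, height, width), dtype=np.float32)
    for x, y, t, p in events:
        # Skip events that fall outside the current time window.
        if not (t0 <= t < t0 + window):
            continue
        # Later events overwrite earlier ones at the same pixel and polarity,
        # so each pixel keeps the normalised time of its most recent event.
        frame[int(p), int(y), int(x)] = (t - t0) / window
    return frame


# Hypothetical toy input: four events on a 4x5 sensor, window [0, 1).
events = np.array([
    [1, 2, 0.00, 1],
    [3, 0, 0.25, 0],
    [1, 2, 0.50, 1],  # same pixel and polarity as the first event
    [0, 0, 2.00, 0],  # outside the window, ignored
])
frame = lnes(events, t0=0.0, window=1.0, height=4, width=5)
```

In this toy input the pixel at `(x=1, y=2)` receives two positive events, and only the later timestamp survives. Because the frame is bounded in `[0, 1)` regardless of the event rate, it can be passed to a standard image network.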
URL
https://arxiv.org/abs/2502.07869