Abstract
Remote photoplethysmography (rPPG), which enables non-contact physiological monitoring through analysis of light reflected from the face, faces critical computational bottlenecks as deep learning delivers performance gains at the cost of prohibitive resource demands. This paper proposes ME-rPPG, a memory-efficient algorithm built on temporal-spatial state space duality, which resolves the trilemma of model scalability, cross-dataset generalization, and real-time constraints. Leveraging a transferable state space, ME-rPPG efficiently captures subtle periodic variations across facial frames while maintaining minimal computational overhead, enabling training on extended video sequences and supporting low-latency inference. Achieving cross-dataset MAEs of 5.38 (MMPD), 0.70 (VitalVideo), and 0.25 (PURE), ME-rPPG outperforms all baselines with improvements ranging from 21.3% to 60.2%. Our solution enables real-time inference with only 3.6 MB memory usage and 9.46 ms latency, surpassing existing methods by 19.5%-49.7% in accuracy and yielding a 43.2% gain in user satisfaction in real-world deployments. The code and demos are released for reproducibility at this https URL.
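The abstract's memory claim rests on a general property of state-space models: inference updates a fixed-size hidden state once per frame, so memory stays constant no matter how long the video is. The sketch below is purely illustrative, assuming a generic linear state-space recurrence with hypothetical dimensions; it is not the authors' ME-rPPG architecture.

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Illustrative linear SSM: h_t = A h_{t-1} + B x_t, y_t = C h_t,
    run over per-frame features x of shape [T, d_in]. Only the current
    state h (size d_state) is kept, so memory is O(1) in sequence length T."""
    d_state = A.shape[0]
    h = np.zeros(d_state)
    ys = []
    for x_t in x:               # one frame at a time: constant memory in T
        h = A @ h + B @ x_t     # state update
        ys.append(C @ h)        # per-frame pulse-signal estimate
    return np.array(ys)

# Hypothetical setup: 300 frames of 4-dim facial features, 8-dim state.
rng = np.random.default_rng(0)
T, d_in, d_state = 300, 4, 8
A = 0.9 * np.eye(d_state)       # stable (decaying) state transition
B = 0.1 * rng.standard_normal((d_state, d_in))
C = 0.1 * rng.standard_normal((1, d_state))
x = rng.standard_normal((T, d_in))
y = ssm_scan(x, A, B, C)
print(y.shape)                  # one output per frame: (300, 1)
```

Because the per-frame update cost and state size are fixed, this recurrent form supports both training on extended sequences and low-latency streaming inference, which is the trade-off the abstract highlights.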
URL
https://arxiv.org/abs/2504.01774