
Memory-efficient Low-latency Remote Photoplethysmography through Temporal-Spatial State Space Duality

2025-04-02 14:34:04
Kegang Wang, Jiankai Tang, Yuxuan Fan, Jiatong Ji, Yuanchun Shi, Yuntao Wang

Abstract

Remote photoplethysmography (rPPG), enabling non-contact physiological monitoring through facial light reflection analysis, faces critical computational bottlenecks as deep learning introduces performance gains at the cost of prohibitive resource demands. This paper proposes ME-rPPG, a memory-efficient algorithm built on temporal-spatial state space duality, which resolves the trilemma of model scalability, cross-dataset generalization, and real-time constraints. Leveraging a transferable state space, ME-rPPG efficiently captures subtle periodic variations across facial frames while maintaining minimal computational overhead, enabling training on extended video sequences and supporting low-latency inference. Achieving cross-dataset MAEs of 5.38 (MMPD), 0.70 (VitalVideo), and 0.25 (PURE), ME-rPPG outperforms all baselines with improvements ranging from 21.3% to 60.2%. Our solution enables real-time inference with only 3.6 MB memory usage and 9.46 ms latency, surpassing existing methods by 19.5%-49.7% in accuracy and yielding 43.2% gains in user satisfaction in real-world deployments. The code and demos are released for reproducibility on this https URL.
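To make the memory-efficiency claim concrete, the sketch below illustrates the kind of linear state-space recurrence that underlies state-space-duality models: per-frame facial features are folded into a fixed-size hidden state, so inference memory stays constant regardless of video length. This is a minimal, hypothetical illustration; the dimensions and parameters (D_FEAT, D_STATE, A, B, C) are assumptions for demonstration and are not ME-rPPG's actual architecture or learned weights.

```python
import numpy as np

# Illustrative dimensions; the paper's actual model sizes are not given in the abstract.
D_FEAT = 16    # per-frame facial feature dimension (assumed)
D_STATE = 8    # latent state size per feature channel (assumed)

rng = np.random.default_rng(0)

# Diagonal state-space parameters (random placeholders; these would be learned in practice).
A = np.exp(-rng.uniform(0.01, 0.5, size=(D_FEAT, D_STATE)))  # per-dimension decay, |A| < 1 for stability
B = rng.standard_normal((D_FEAT, D_STATE)) * 0.1              # input projection
C = rng.standard_normal((D_FEAT, D_STATE)) * 0.1              # output projection

def rppg_stream(frame_features):
    """Stream per-frame features through a linear state-space recurrence.

    Only the hidden state is carried between frames, so memory stays
    O(D_FEAT * D_STATE) no matter how long the video is -- one source of the
    memory-efficient, low-latency behaviour described in the abstract.
    """
    h = np.zeros((D_FEAT, D_STATE))            # persistent recurrent state
    for x_t in frame_features:                 # x_t: (D_FEAT,) features for one frame
        h = A * h + B * x_t[:, None]           # elementwise (diagonal) state update
        y_t = np.sum(C * h, axis=-1).mean()    # collapse to one rPPG sample per frame
        yield y_t

# Usage example: 300 frames (~10 s at 30 fps) of synthetic per-frame features.
frames = rng.standard_normal((300, D_FEAT))
bvp = np.fromiter(rppg_stream(frames), dtype=float)
print(bvp.shape)  # (300,) -- one blood-volume-pulse sample per frame
```

Because the recurrence is linear and diagonal, the same model can also be trained in a parallel (convolution- or scan-style) form over long sequences and then run frame-by-frame at inference, which is the general trade-off that state space duality exploits.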


URL

https://arxiv.org/abs/2504.01774

PDF

https://arxiv.org/pdf/2504.01774.pdf

