Abstract
Autonomous nano-drones (~10 cm in diameter), thanks to their ultra-low-power TinyML-based brains, can cope with real-world environments. However, due to their simplified sensors and compute units, they are still far from the sense-and-act capabilities of their bigger counterparts. This system paper presents a novel deep-learning-based pipeline that fuses multi-sensorial input (i.e., low-resolution images and an 8x8 depth map) with the robot's state information to tackle a human pose estimation task. Thanks to our design, the proposed system -- trained in simulation and tested on a real-world dataset -- improves on a state-unaware State-of-the-Art baseline, increasing the R^2 regression metric by up to 0.10 on distance prediction.
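The R^2 (coefficient of determination) metric used to report the distance-prediction improvement can be computed as follows. This is a generic, minimal sketch of the standard formula, not code from the paper; the sample values are hypothetical.

```python
import numpy as np

def r2_score(y_true, y_pred):
    # R^2 = 1 - SS_res / SS_tot, where SS_res is the residual sum of
    # squares and SS_tot is the total sum of squares around the mean.
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# A perfect predictor scores 1.0; predicting the mean scores 0.0.
print(r2_score([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # -> 1.0
print(r2_score([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))  # -> 0.0
```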
URL
https://arxiv.org/abs/2404.02567