Abstract
Visual control policies can suffer significant performance degradation when visual conditions such as lighting or camera position differ from those seen during training, often exhibiting sharp declines in capability even for minor differences. In this work, we examine robustness to a suite of such visual changes for RGB-D and point cloud based visual control policies. To perform these experiments on both model-free and model-based reinforcement learners, we introduce a novel Point Cloud World Model (PCWM) and point cloud based control policies. Our experiments show that policies that explicitly encode point clouds are significantly more robust than their RGB-D counterparts. Further, we find that our proposed PCWM significantly outperforms prior work in terms of sample efficiency during training. Taken together, these results suggest that reasoning about the 3D scene through point clouds can improve performance, reduce learning time, and increase robustness for robotic learners. Project Webpage: this https URL
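The abstract contrasts policies that consume raw RGB-D images with ones that explicitly encode point clouds. The paper does not specify its preprocessing pipeline here, but the standard way to obtain a point cloud from a depth image is pinhole-camera unprojection; the sketch below assumes known camera intrinsics (`fx`, `fy`, `cx`, `cy`) and a depth map in meters, and is illustrative only:

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Unproject a depth image (H, W), in meters, to an (N, 3) point cloud
    in the camera frame using the pinhole camera model."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]  # drop invalid (zero-depth) pixels

# Toy example: a 2x2 depth image with one invalid pixel and made-up intrinsics.
depth = np.array([[1.0, 2.0],
                  [0.0, 1.5]])
cloud = depth_to_point_cloud(depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
print(cloud.shape)  # (3, 3): three valid pixels, each an (x, y, z) point
```

A policy or world model operating on `cloud` (optionally with per-point RGB features appended) sees scene geometry directly, which is one plausible reason point cloud encoders are less sensitive to lighting and camera shifts than image-based ones.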
URL
https://arxiv.org/abs/2404.18926