Abstract
Environmental uncertainty has long been difficult to handle when performing real-world robot tasks, because it produces unexpected observations that manual scripting cannot cover. Learning-based robot control methods are a promising approach for generating flexible motions in unknown situations, but they still tend to suffer under uncertainty because of their deterministic nature. To perform the target task adaptively under such conditions, the robot control model must be able to accurately understand the possible uncertainty and to derive, through exploration, the optimal action that minimizes it. This paper extends an existing predictive-learning-based robot control method, which employs foresight prediction using dynamic internal simulation. The foresight module refines the model's hidden states by sampling multiple possible futures and replacing the states with the one that leads to lower future uncertainty. The adaptiveness of the model was evaluated on a door-opening task. The door can be opened by pushing, pulling, or sliding, but the robot cannot visually distinguish which, and must adapt on the fly. The results showed that the proposed model adaptively diverged its motion through interaction with the door, whereas conventional methods failed to diverge stably. The models were analyzed using Lyapunov exponents of the RNN hidden states, which reflect the possible divergence at each time step during task execution. The results indicated that the foresight module biased the model to consider future consequences, which led to uncertainties being embedded in the policy of the robot controller rather than in the resultant observation. This is beneficial for implementing adaptive behaviors, as it induces the derivation of diverse motions during exploration.
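The hidden-state refinement described above can be illustrated with a toy sketch. It is not the authors' implementation. The recurrent update `rnn_step`, the uncertainty measure `predicted_uncertainty`, and the parameters `n_samples`, `horizon`, and `noise` are all hypothetical stand-ins chosen for illustration; the abstract does not specify how candidates are sampled or how uncertainty is computed.

```python
import math
import random


def rnn_step(h, w):
    """Toy recurrent update h' = tanh(W h); stands in for a learned model."""
    return [math.tanh(sum(w_ij * h_j for w_ij, h_j in zip(row, h))) for row in w]


def predicted_uncertainty(h):
    """Hypothetical uncertainty measure: mean squared activation (an assumption)."""
    return sum(x * x for x in h) / len(h)


def rollout_uncertainty(h, w, horizon):
    """Simulate `horizon` steps internally and sum the predicted uncertainty."""
    total = 0.0
    for _ in range(horizon):
        h = rnn_step(h, w)
        total += predicted_uncertainty(h)
    return total


def foresight_refine(h, w, n_samples=8, horizon=5, noise=0.1, rng=None):
    """Return the candidate hidden state with the lowest simulated future uncertainty."""
    if rng is None:
        rng = random.Random(0)
    # The unperturbed state stays in the candidate set, so the selected
    # state never scores worse than the current one.
    candidates = [list(h)]
    for _ in range(n_samples):
        candidates.append([x + rng.gauss(0.0, noise) for x in h])
    return min(candidates, key=lambda c: rollout_uncertainty(c, w, horizon))
```

In this sketch, calling `foresight_refine(h, w)` at each control step would replace the current hidden state `h` with the sampled candidate whose internally simulated future accumulates the least uncertainty.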
Abstract (translated)
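The characteristics of uncertain environments have long been difficult to handle when performing real-world robot tasks. This is because uncertainty produces unexpected observations that manual scripting cannot anticipate. Learning-based control methods are a promising approach for generating flexible motions in unknown situations, but because of their deterministic nature they still tend to struggle under uncertainty. To perform the target task adaptively under such conditions, the robot control model must be able to accurately understand the possible uncertainty and to derive, through exploratory reasoning, the optimal action that minimizes it. This paper extends an existing predictive-learning-based robot control method, which uses foresight prediction via dynamic internal simulation. The foresight module refines the model's hidden states by sampling multiple possible futures and replacing the states with the one that leads to lower future uncertainty. The adaptiveness of the model was evaluated: the model adaptively changed its motion through interaction with the door, whereas conventional methods could not change it stably during task execution. The models were analyzed using Lyapunov exponents, which reflect the possible change at each time step during task execution. The results indicate that the foresight-based method biases the model toward considering future consequences, so that uncertainty is encoded in the behavior policy of the robot controller rather than in the observations. This is highly beneficial for implementing adaptive behaviors, which give rise to diverse motions during exploration.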
URL
https://arxiv.org/abs/2410.00774