Abstract
Why do Recurrent State Space Models such as PlaNet fail at cloth manipulation tasks? Recent work has attributed this to the blurry reconstruction of the observation, which makes it difficult to plan directly in the latent space. This paper investigates the underlying cause by applying PlaNet to the pick-and-place cloth-flattening domain. We find that the sharp discontinuity of the transition function on the contour of the cloth makes it difficult to learn an accurate latent dynamics model. By adopting KL balancing and latent overshooting in the training loss, and by adjusting the planned picking position to the closest part of the cloth, we show that the updated PlaNet-Pick model can achieve state-of-the-art performance using latent MPC algorithms in simulation.
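The pick-position adjustment mentioned in the abstract can be illustrated with a minimal sketch. It assumes the cloth is available as a binary image mask and that "closest" means smallest Euclidean distance in pixel coordinates; the function name `snap_pick_to_cloth` and the mask representation are hypothetical and are not taken from the paper.

```python
import numpy as np


def snap_pick_to_cloth(pick_rc, cloth_mask):
    """Move a planned pick position onto the nearest cloth pixel.

    pick_rc:    (row, col) pick position proposed by the planner (assumed pixel units).
    cloth_mask: 2D boolean array, True where the cloth is visible (assumed input).
    Returns the (row, col) of the cloth pixel closest to pick_rc, or the
    original position unchanged if no cloth pixel exists.
    """
    cloth_pixels = np.argwhere(cloth_mask)  # shape (N, 2), rows of (row, col)
    if cloth_pixels.shape[0] == 0:
        return tuple(int(v) for v in pick_rc)
    # Squared Euclidean distance from the planned pick to every cloth pixel.
    diffs = cloth_pixels - np.asarray(pick_rc)
    sq_dist = np.sum(diffs ** 2, axis=1)
    nearest = cloth_pixels[np.argmin(sq_dist)]
    return tuple(int(v) for v in nearest)


# Toy example: a 3x3 cloth patch in the middle of a 5x5 image.
mask = np.zeros((5, 5), dtype=bool)
mask[1:4, 1:4] = True
corner_pick = snap_pick_to_cloth((0, 0), mask)   # planned pick misses the cloth
inside_pick = snap_pick_to_cloth((2, 2), mask)   # planned pick already on the cloth
```

A pick that already lies on the cloth is left unchanged, while a pick that falls off the cloth is projected onto it, so the planner never evaluates actions on the far side of the discontinuity at the cloth contour.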
URL
https://arxiv.org/abs/2303.01345