Abstract
Solving jigsaw puzzles requires grasping the visual features of a sequence of patches and efficiently exploring a solution space that grows exponentially with the sequence length. Therefore, visual deep reinforcement learning (DRL) should solve this problem more efficiently than optimization solvers coupled with neural networks. Based on this assumption, we introduce Alphazzle, a reassembly algorithm based on single-player Monte Carlo Tree Search (MCTS). A major difference from DRL algorithms lies in the unavailability of a game reward for MCTS, and we show how to estimate it from the visual input with neural networks. This constraint is induced by the puzzle-solving task and dramatically adds to the task's complexity (and interest!). We perform an in-depth ablation study that shows the importance of MCTS and the neural networks working together. We achieve excellent results and gain exciting insights into the combination of DRL and visual feature learning.
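To make the idea of single-player MCTS guided by an estimated reward more concrete, here is a minimal sketch in Python. It is not the Alphazzle implementation: the state encoding (a tuple of piece ids placed slot by slot), the function names, and the UCT constant are all assumptions made for illustration. In particular, `estimate_value` is a toy stand-in that peeks at the ground-truth arrangement, whereas the abstract describes neural networks that estimate the reward from the visual input alone.

```python
import math
import random


def estimate_value(state, n_pieces):
    """Toy stand-in for a learned value network (an assumption, not the paper's model).

    Returns the fraction of slots that hold the matching piece id.
    """
    return sum(1 for slot, piece in enumerate(state) if slot == piece) / n_pieces


class Node:
    def __init__(self, state, n_pieces):
        self.state = state  # pieces placed so far, one per slot
        self.untried = [p for p in range(n_pieces) if p not in state]
        self.children = {}  # piece id -> child Node
        self.visits = 0
        self.total_value = 0.0


def uct_child(node, c=1.4):
    """Pick the child maximizing mean value plus an exploration bonus."""
    def score(child):
        mean = child.total_value / child.visits
        return mean + c * math.sqrt(math.log(node.visits) / child.visits)
    return max(node.children.values(), key=score)


def search(root, n_pieces, n_simulations, rng):
    """Run simulations from root and return the most visited piece."""
    for _ in range(n_simulations):
        node, path = root, [root]
        # Selection: descend while the node is fully expanded and not terminal.
        while not node.untried and node.children:
            node = uct_child(node)
            path.append(node)
        # Expansion: add one new child if any placement is still untried.
        if node.untried:
            piece = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(node.state + (piece,), n_pieces)
            node.children[piece] = child
            node = child
            path.append(node)
        # Evaluation: no game reward is available, so use the estimate.
        value = estimate_value(node.state, n_pieces)
        # Backup: single-player setting, so the value is not negated.
        for visited in path:
            visited.visits += 1
            visited.total_value += value
    return max(root.children, key=lambda p: root.children[p].visits)


def solve(n_pieces, n_simulations=200, seed=0):
    """Place one piece per slot, running a fresh search before each placement."""
    rng = random.Random(seed)
    state = ()
    while len(state) < n_pieces:
        root = Node(state, n_pieces)
        state = state + (search(root, n_pieces, n_simulations, rng),)
    return state
```

Calling `solve(4)` returns a tuple that assigns each of the four pieces to a distinct slot. How close that arrangement is to the correct one depends entirely on the quality of the value estimate, which is the interplay between search and learned networks that the abstract highlights.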
URL
https://arxiv.org/abs/2302.00384