Abstract
This paper presents a novel learning approach for Dubins Traveling Salesman Problems (DTSP) with Neighborhoods (DTSPN) that quickly produces a tour for a non-holonomic vehicle passing through the neighborhoods of given task points. The method involves two learning phases. First, a model-free reinforcement learning (RL) approach leverages privileged information to distill knowledge from expert trajectories generated by the Lin-Kernighan heuristic (LKH) algorithm. Second, a supervised learning phase trains an adaptation network to solve problems without relying on privileged information. Before the first learning phase, a parameter initialization technique that uses the demonstration data is applied to improve training efficiency. The proposed learning method produces a solution about 50 times faster than LKH and substantially outperforms other imitation learning and RL-with-demonstration schemes, most of which fail to sense all the task points.
URL
https://arxiv.org/abs/2404.16721