Paper Reading AI Learner

MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts

2024-05-02 06:02:07
Jianan Zhou, Zhiguang Cao, Yaoxin Wu, Wen Song, Yining Ma, Jie Zhang, Chi Xu

Abstract

Learning to solve vehicle routing problems (VRPs) has garnered much attention. However, most neural solvers are only structured and trained independently on a specific problem, making them less generic and practical. In this paper, we aim to develop a unified neural solver that can cope with a range of VRP variants simultaneously. Specifically, we propose a multi-task vehicle routing solver with mixture-of-experts (MVMoE), which greatly enhances the model capacity without a proportional increase in computation. We further develop a hierarchical gating mechanism for the MVMoE, delivering a good trade-off between empirical performance and computational complexity. Experimentally, our method significantly promotes the zero-shot generalization performance on 10 unseen VRP variants, and showcases decent results on the few-shot setting and real-world benchmark instances. We further provide extensive studies on the effect of MoE configurations in solving VRPs. Surprisingly, the hierarchical gating can achieve much better out-of-distribution generalization performance. The source code is available at: this https URL.
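The abstract's claim that mixture-of-experts raises model capacity "without a proportional increase in computation" can be illustrated with a generic sparse top-k gating layer. What follows is a minimal pure-Python sketch of that general technique, not the MVMoE architecture or its hierarchical gating, neither of which is specified on this page. All names (make_expert, top_k_gate, moe_forward), the toy scaling experts, and the linear gate are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical toy expert: scales every input feature by a constant."""
    return lambda x: [scale * v for v in x]


def top_k_gate(gate_logits, k):
    """Pick the k highest-scoring experts and renormalise their scores."""
    ranked = sorted(range(len(gate_logits)),
                    key=lambda i: gate_logits[i], reverse=True)
    chosen = ranked[:k]
    weights = softmax([gate_logits[i] for i in chosen])
    return dict(zip(chosen, weights))


def moe_forward(x, experts, gate_weights, k=2):
    """Route x to k experts only; the remaining experts are never evaluated."""
    # Linear gate (an assumption): one weight row per expert, dotted with x.
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    routing = top_k_gate(logits, k)
    out = [0.0] * len(x)
    for idx, weight in routing.items():
        expert_out = experts[idx](x)
        out = [o + weight * y for o, y in zip(out, expert_out)]
    return out, routing


experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
output, routing = moe_forward([2.0, 1.0], experts, gate_weights, k=2)
print(sorted(routing))  # indices of the experts that were actually run
```

With four experts and k=2, only two expert functions are evaluated per input, so adding more experts grows the parameter count while the per-input cost stays tied to k.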

URL

https://arxiv.org/abs/2405.01029

PDF

https://arxiv.org/pdf/2405.01029.pdf

