Shared learning of powertrain control policies for vehicle fleets

2024-04-27 13:01:05
Lindsey Kerbel, Beshah Ayalew, Andrej Ivanco

Abstract

Emerging data-driven approaches, such as deep reinforcement learning (DRL), aim at on-the-field learning of powertrain control policies that optimize fuel economy and other performance metrics. Indeed, they have shown great potential in this regard for individual vehicles on specific routes or drive cycles. However, for fleets of vehicles that must service a distribution of routes, DRL approaches struggle with learning stability issues that result in high variances and challenge their practical deployment. In this paper, we present a novel framework for shared learning among a fleet of vehicles through the use of a distilled group policy as the knowledge sharing mechanism for the policy learning computations at each vehicle. We detail the mathematical formulation that makes this possible. Several scenarios are considered to analyze the functionality, performance, and computational scalability of the framework with fleet size. Comparisons of the cumulative performance of fleets using our proposed shared learning approach with a baseline of individual learning agents and another state-of-the-art approach with a centralized learner show clear advantages to our approach. For example, we find a fleet average asymptotic improvement of 8.5 percent in fuel economy compared to the baseline while also improving on the metrics of acceleration error and shifting frequency for fleets serving a distribution of suburban routes. Furthermore, we include demonstrative results that show how the framework reduces variance within a fleet and also how it helps individual agents adapt better to new routes.
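The abstract does not spell out the mathematical formulation, so the following is only a generic illustration of what a "distilled group policy" can look like, not the authors' method. It assumes a discrete action space (hypothetically: downshift, hold, upshift), a single state, and distillation by minimizing the summed KL divergence from each vehicle's policy to the group policy, for which the minimizer is the arithmetic mean of the vehicle policies. All names and numbers are made up for illustration.

```python
import math


def distill_group_policy(agent_policies):
    """Return the distribution pi_0 that minimizes sum_i KL(pi_i || pi_0).

    For discrete distributions this minimizer is the arithmetic mean
    of the agents' action probabilities.
    """
    n_agents = len(agent_policies)
    n_actions = len(agent_policies[0])
    return [sum(p[a] for p in agent_policies) / n_agents for a in range(n_actions)]


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; terms with p(a) = 0 contribute 0."""
    return sum(pa * math.log(pa / qa) for pa, qa in zip(p, q) if pa > 0.0)


# Hypothetical fleet of three vehicles; each row is one vehicle's action
# probabilities (downshift, hold, upshift) in the same state.
fleet = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.3, 0.5],
]

group = distill_group_policy(fleet)

# A per-vehicle penalty of this kind could be added to each local learning
# objective to pull the vehicle's policy toward the shared group policy.
penalties = [kl_divergence(p, group) for p in fleet]

print("group policy:", [round(x, 3) for x in group])
print("KL penalties:", [round(x, 3) for x in penalties])
```

In a sketch like this, each vehicle would keep learning on its own routes while the penalty term discourages it from drifting far from the fleet consensus, which is one plausible way shared knowledge could reduce variance across agents.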

URL

https://arxiv.org/abs/2404.17892

PDF

https://arxiv.org/pdf/2404.17892.pdf

