Abstract
The fifth generation (5G) of wireless networks aims to meet the stringent requirements of vehicular use cases. Edge computing resources can help by moving processing closer to end users, which reduces latency. However, given the stochastic nature of traffic loads and of physical resource availability, appropriate auto-scaling mechanisms are needed to support cost-efficient and performant services. To this end, we employ Deep Reinforcement Learning (DRL) for vertical scaling in edge computing to support vehicle-to-network communications. We address the problem using Deep Deterministic Policy Gradient (DDPG). Because DDPG is a model-free, off-policy algorithm for learning continuous actions, we introduce a discretization approach to support discrete scaling actions. This addresses the scalability problems inherent to high-dimensional discrete action spaces. Using a real-world vehicular trace data set, we show that DDPG outperforms existing solutions, reducing the average number of active CPUs by at least 23% while increasing the long-term reward by 24%.
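The abstract does not spell out the discretization scheme, so the following is only a minimal sketch of one common way to map a continuous actor output onto discrete vertical-scaling steps: uniform binning. The function names (`discretize_action`, `apply_scaling`), the action range [-1, 1], the step limit `max_step`, and the CPU bounds are all assumptions made for illustration and are not taken from the paper.

```python
def discretize_action(a: float, max_step: int = 2) -> int:
    """Map a continuous action a in [-1, 1] to an integer CPU delta
    in {-max_step, ..., +max_step} using uniform bins (assumed scheme)."""
    a = max(-1.0, min(1.0, float(a)))      # clip to the assumed actor range
    n_bins = 2 * max_step + 1              # one bin per discrete scaling action
    # Rescale [-1, 1] to [0, n_bins) and take the bin index;
    # a == 1.0 falls on the upper edge, so clamp it into the last bin.
    idx = min(int((a + 1.0) / 2.0 * n_bins), n_bins - 1)
    return idx - max_step


def apply_scaling(active_cpus: int, delta: int,
                  min_cpus: int = 1, max_cpus: int = 8) -> int:
    """Apply a discrete scaling step, keeping the CPU count within
    hypothetical resource limits."""
    return max(min_cpus, min(max_cpus, active_cpus + delta))


if __name__ == "__main__":
    for a in (-1.0, -0.5, 0.0, 0.5, 1.0):
        delta = discretize_action(a)
        print(f"actor output {a:+.1f} -> delta {delta:+d} "
              f"-> CPUs {apply_scaling(4, delta)}")
```

With this kind of mapping the actor still emits a single continuous value, so the network's output size does not grow with the number of discrete scaling actions; only the binning step changes.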
URL
https://arxiv.org/abs/2301.13324