Abstract
The reliability of data-driven applications in electric vehicle (EV) infrastructure, such as charging demand forecasting, hinges on the availability of complete, high-quality charging data. However, real-world EV datasets often contain missing records, and existing imputation methods are poorly suited to the complex, multimodal context of charging data: they typically train one model per station, which ignores valuable inter-station correlations. To address these gaps, we develop a novel PRobabilistic variational imputation framework that leverages the power of large lAnguage models and retrIeval-augmented Memory (PRAIM). PRAIM employs a pre-trained language model to encode heterogeneous data, spanning time-series demand, calendar features, and geospatial context, into a unified, semantically rich representation. This representation is dynamically enriched by a retrieval-augmented memory that draws relevant examples from the entire charging network, so that a single, unified imputation model built on a variational neural architecture can overcome data sparsity. Extensive experiments on four public datasets demonstrate that PRAIM significantly outperforms established baselines both in imputation accuracy and in preserving the statistical distribution of the original data, which in turn leads to substantial improvements in downstream forecasting performance.
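The abstract does not specify how the retrieval-augmented memory selects examples, so the following is only a minimal sketch of one common way such a lookup could work, not PRAIM's actual mechanism. It assumes each stored window from any station has an embedding (a "key") and an observed demand profile (a "value"), and that relevance is measured by cosine similarity. The function names `retrieve_top_k` and `retrieval_prior`, the `temperature` parameter, and the softmax weighting are all hypothetical choices made for illustration.

```python
import numpy as np


def retrieve_top_k(query, memory_keys, k=3):
    """Return indices and cosine similarities of the k memory entries closest to `query`.

    Assumption: relevance is cosine similarity between embeddings; the paper's
    actual retrieval criterion is not stated in the abstract.
    """
    q = query / (np.linalg.norm(query) + 1e-12)
    keys = memory_keys / (np.linalg.norm(memory_keys, axis=1, keepdims=True) + 1e-12)
    sims = keys @ q                      # one similarity score per memory entry
    top = np.argsort(-sims)[:k]          # indices sorted by descending similarity
    return top, sims[top]


def retrieval_prior(query, memory_keys, memory_values, k=3, temperature=0.1):
    """Similarity-weighted average of the demand profiles of the retrieved entries.

    Assumption: a softmax over similarities is used as the weighting scheme.
    """
    top, sims = retrieve_top_k(query, memory_keys, k)
    weights = np.exp((sims - sims.max()) / temperature)  # numerically stable softmax
    weights /= weights.sum()
    return weights @ memory_values[top]


# Hypothetical memory pooled across all stations: 50 windows,
# 16-dimensional embeddings, 24 hourly demand values each (synthetic data).
rng = np.random.default_rng(0)
memory_keys = rng.normal(size=(50, 16))
memory_values = rng.uniform(0.0, 40.0, size=(50, 24))

# A query embedding for a window with missing records, close to entry 7.
query = memory_keys[7] + 0.01 * rng.normal(size=16)
estimate = retrieval_prior(query, memory_keys, memory_values, k=3)
```

Because the memory is pooled across the whole network, a sparsely observed station can borrow from similar windows at other stations, which is the property the abstract attributes to using one unified model instead of one model per station. In a variational setup, an estimate like this would more plausibly condition the model than serve as the final imputed value.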
URL
https://arxiv.org/abs/2601.13476