Abstract
Building energy management (BEM) tasks require processing and learning from a variety of time-series data. Existing solutions rely on bespoke task- and data-specific models to perform these tasks, limiting their broader applicability. Inspired by the transformative success of Large Language Models (LLMs), Time-Series Foundation Models (TSFMs), trained on diverse datasets, have the potential to change this. Were TSFMs to achieve a level of generalizability across tasks and contexts akin to LLMs, they could fundamentally address the scalability challenges pervasive in BEM. To understand where they stand today, we evaluate TSFMs across four dimensions: (1) generalizability in zero-shot univariate forecasting, (2) forecasting with covariates for thermal behavior modeling, (3) zero-shot representation learning for classification tasks, and (4) robustness to performance metrics and varying operational conditions. Our results reveal that TSFMs exhibit limited generalizability, performing only marginally better than statistical models on unseen datasets and modalities for univariate forecasting. Similarly, including covariates in TSFMs does not yield performance improvements, and their performance remains inferior to that of conventional models that utilize covariates. While TSFMs generate effective zero-shot representations for downstream classification tasks, they may remain inferior to statistical models in forecasting when statistical models perform test-time fitting. Moreover, TSFMs' forecasting performance is sensitive to evaluation metrics, and, compared to statistical models, they struggle in more complex building environments. These findings underscore the need for targeted advancements in TSFM design, particularly in handling covariates and incorporating context and temporal dynamics into prediction mechanisms, to develop more adaptable and scalable solutions for BEM.
URL
https://arxiv.org/abs/2506.11250