Time series forecasting is essential for operational intelligence in the hospitality industry, and particularly challenging in large-scale, distributed systems. This study evaluates the performance of statistical, machine learning (ML), deep learning, and foundation models in forecasting hourly sales over a 14-day horizon using real-world data from a network of thousands of restaurants across Germany. The forecasting solution includes features such as weather conditions, calendar events, and time-of-day patterns. Results demonstrate the strong performance of ML-based meta-models and highlight the emerging potential of foundation models like Chronos and TimesFM, which deliver competitive performance with minimal feature engineering, leveraging only the pre-trained model (zero-shot inference). Additionally, a hybrid PySpark-Pandas approach proves to be a robust solution for achieving horizontal scalability in large-scale deployments.
https://arxiv.org/abs/2502.03395
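The abstract above mentions calendar and time-of-day features for an hourly, 14-day forecast horizon. A minimal sketch of how such features might be encoded follows. The function names, the feature set, and the holiday set are illustrative assumptions, not taken from the paper, and weather features are left out.

```python
import math
from datetime import datetime, timedelta


def time_features(ts, holidays=frozenset()):
    """Encode one hourly timestamp as calendar and time-of-day features.

    The feature names here are hypothetical; the paper does not list its
    exact feature set.
    """
    angle = 2 * math.pi * ts.hour / 24  # map the hour onto a circle
    return {
        "hour_sin": math.sin(angle),
        "hour_cos": math.cos(angle),
        "day_of_week": ts.weekday(),            # Monday = 0
        "is_weekend": int(ts.weekday() >= 5),
        "is_holiday": int(ts.date() in holidays),
    }


def horizon_features(start, days=14, holidays=frozenset()):
    """Build features for every hour of a forecast horizon starting at `start`."""
    return [
        time_features(start + timedelta(hours=h), holidays)
        for h in range(24 * days)
    ]


start = datetime(2025, 1, 1, 0)
rows = horizon_features(start, holidays={start.date()})
print(len(rows), rows[0])
```

The sine and cosine pair keeps 23:00 and 00:00 close together in feature space, which a raw hour index would not.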
We give a comprehensive analysis of transformers as time series foundation models, focusing on their approximation and generalization capabilities. First, we demonstrate that there exist transformers that fit an autoregressive model on input univariate time series via gradient descent. We then analyze MOIRAI, a multivariate time series foundation model capable of handling an arbitrary number of covariates. We prove that it is capable of automatically fitting autoregressive models with an arbitrary number of covariates, offering insights into its design and empirical success. For generalization, we establish bounds for pretraining when the data satisfies Dobrushin's condition. Experiments support our theoretical findings, highlighting the efficacy of transformers as time series foundation models.
https://arxiv.org/abs/2502.03383
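The abstract above is about transformers that fit an autoregressive model to a univariate series via gradient descent. The sketch below shows only that target computation: plain gradient descent on the squared one-step prediction error of an AR(p) model. It does not construct a transformer. The coefficients, learning rate, and step count are illustrative assumptions.

```python
import random


def simulate_ar(coeffs, n, seed=0):
    """Simulate x_t = sum_k coeffs[k] * x_{t-1-k} + standard Gaussian noise."""
    rng = random.Random(seed)
    p = len(coeffs)
    x = [0.0] * p  # zero initial history
    for _ in range(n):
        pred = sum(c * x[-1 - k] for k, c in enumerate(coeffs))
        x.append(pred + rng.gauss(0.0, 1.0))
    return x[p:]


def fit_ar_gd(x, p, lr=0.05, steps=200):
    """Fit AR(p) coefficients by gradient descent on the mean squared error."""
    w = [0.0] * p
    n = len(x) - p
    for _ in range(steps):
        grad = [0.0] * p
        for t in range(p, len(x)):
            err = sum(w[k] * x[t - 1 - k] for k in range(p)) - x[t]
            for k in range(p):
                grad[k] += 2.0 * err * x[t - 1 - k] / n
        w = [wk - lr * gk for wk, gk in zip(w, grad)]
    return w


series = simulate_ar([0.5, 0.3], n=3000)
estimate = fit_ar_gd(series, p=2)
print([round(c, 2) for c in estimate])
```

With enough data the estimate should land close to the generating coefficients, up to sampling noise.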
Universal time series representation learning is challenging but valuable in real-world applications such as classification, anomaly detection, and forecasting. Recently, contrastive learning (CL) has been actively explored to tackle time series representation. However, a key challenge is that the data augmentation process in CL can distort seasonal patterns or temporal dependencies, inevitably leading to a loss of semantic information. To address this challenge, we propose Topological Contrastive Learning for time series (TopoCL). TopoCL mitigates such information loss by incorporating persistent homology, which captures the topological characteristics of data that remain invariant under transformations. In this paper, we treat the temporal and topological properties of time series data as distinct modalities. Specifically, we compute persistent homology to construct topological features of time series data, representing them in persistence diagrams. We then design a neural network to encode these persistence diagrams. Our approach jointly optimizes CL within the time modality and time-topology correspondence, promoting a comprehensive understanding of both temporal semantics and topological properties of time series. We conduct extensive experiments on four downstream tasks: classification, anomaly detection, forecasting, and transfer learning. The results demonstrate that TopoCL achieves state-of-the-art performance.
https://arxiv.org/abs/2502.02924
Recently, learning effective representations of urban regions has gained significant attention as a key approach to understanding urban dynamics and advancing smarter cities. Existing approaches have demonstrated the potential of leveraging mobility data to generate latent representations, providing valuable insights into the intrinsic characteristics of urban areas. However, incorporating the temporal dynamics and detailed semantics inherent in human mobility patterns remains underexplored. To address this gap, we propose a novel urban region representation learning model, Mobility Time Series Contrastive Learning for Urban Region Representations (MobiCLR), designed to capture semantically meaningful embeddings from inflow and outflow mobility patterns. MobiCLR uses contrastive learning to enhance the discriminative power of its representations, applying an instance-wise contrastive loss to capture distinct flow-specific characteristics. Additionally, we develop a regularizer to align output features with these flow-specific representations, enabling a more comprehensive understanding of mobility dynamics. To validate our model, we conduct extensive experiments in Chicago, New York, and Washington, D.C. to predict income, educational attainment, and social vulnerability. The results demonstrate that our model outperforms state-of-the-art models.
https://arxiv.org/abs/2502.02912
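The abstract above relies on an instance-wise contrastive loss. As a rough illustration, the sketch below implements the widely used InfoNCE form of such a loss: each row of one view should match the same row of the other view and no other row. Whether MobiCLR uses exactly this form, and what its two views are, is not stated in the abstract. The variable names and toy embeddings are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss: anchors[i] should match positives[i] and no other row."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)


# Hypothetical embeddings of three regions under two views.
view_a = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
view_b = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]
view_b_shuffled = [view_b[1], view_b[2], view_b[0]]

aligned_loss = info_nce(view_a, view_b)
shuffled_loss = info_nce(view_a, view_b_shuffled)
print(aligned_loss, shuffled_loss)
```

The loss is low when matching rows are the most similar pair and high when the pairing is scrambled, which is the property that makes the learned representations discriminative.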
Natural language interaction with sensing systems is crucial for enabling all users to comprehend sensor data and its impact on their everyday lives. However, existing systems, which typically operate in a Question Answering (QA) manner, are significantly limited in terms of the duration and complexity of sensor data they can handle. In this work, we introduce SensorChat, the first end-to-end QA system designed for long-term sensor monitoring with multimodal and high-dimensional data including time series. SensorChat effectively answers both qualitative (requiring high-level reasoning) and quantitative (requiring accurate responses derived from sensor data) questions in real-world scenarios. To achieve this, SensorChat uses an innovative three-stage pipeline that includes question decomposition, sensor data query, and answer assembly. The first and third stages leverage Large Language Models (LLMs) for intuitive human interactions and to guide the sensor data query process. Unlike existing multimodal LLMs, SensorChat incorporates an explicit query stage to precisely extract factual information from long-duration sensor data. We implement SensorChat and demonstrate its capability for real-time interactions on a cloud server while also being able to run entirely on edge platforms after quantization. Comprehensive QA evaluations show that SensorChat achieves up to 26% higher answer accuracy than state-of-the-art systems on quantitative questions. Additionally, a user study with eight volunteers highlights SensorChat's effectiveness in handling qualitative and open-ended questions.
https://arxiv.org/abs/2502.02883
In the last decade, the rapid development of deep learning (DL) has made it possible to perform automatic, accurate, and robust Change Detection (CD) on large volumes of Remote Sensing Images (RSIs). However, despite advances in CD methods, their practical application in real-world contexts remains limited due to the diversity of input data and application contexts. For example, the collected RSIs can be time-series observations, and more informative results are required to indicate the time of change or the specific change category. Moreover, training a Deep Neural Network (DNN) requires a massive amount of training samples, whereas in many cases these samples are difficult to collect. To address these challenges, various specific CD methods have been developed considering different application scenarios and training resources. Additionally, recent advancements in image generation, self-supervision, and visual foundation models (VFMs) have opened up new approaches to address the 'data-hungry' issue of DL-based CD. The development of these methods in broader application scenarios requires further investigation and discussion. Therefore, this article summarizes methods from the literature for different CD tasks, along with the available strategies and techniques to train and deploy DL-based CD methods in sample-limited scenarios. We expect that this survey can provide new insights and inspiration for researchers in this field to develop more effective CD methods that can be applied in a wider range of contexts.
https://arxiv.org/abs/2502.02835
We present a novel prompt design for Large Language Models (LLMs) tailored to Asynchronous Time Series. Unlike regular time series, which assume values at evenly spaced time points, asynchronous time series consist of timestamped events occurring at irregular intervals, each described in natural language. Our approach effectively utilizes the rich natural language of event descriptions, allowing LLMs to benefit from their broad world knowledge for reasoning across different domains and tasks. This allows us to extend the scope of asynchronous time series analysis beyond forecasting to include tasks like anomaly detection and data imputation. We further introduce Stochastic Soft Prompting, a novel prompt-tuning mechanism that significantly improves model performance, outperforming existing fine-tuning methods such as QLoRA. Through extensive experiments on real-world datasets, we demonstrate that our approach achieves state-of-the-art performance across different tasks and datasets.
https://arxiv.org/abs/2502.01922
With climate change expected to exacerbate fire weather conditions, the accurate and timely anticipation of wildfires becomes increasingly crucial for disaster mitigation. In this study, we utilize SeasFire, a comprehensive global wildfire dataset with climate, vegetation, oceanic indices, and human-related variables, to enable seasonal wildfire forecasting with machine learning. For the predictive analysis, we present FireCastNet, a novel architecture which combines a 3D convolutional encoder with GraphCast, originally developed for global short-term weather forecasting using graph neural networks. FireCastNet is trained to capture the context leading to wildfires at different spatial and temporal scales. Our investigation focuses on assessing the effectiveness of our model in predicting the presence of burned areas at varying forecasting time horizons globally, extending up to six months into the future, and on how different spatial and/or temporal context affects the performance. Our findings demonstrate the potential of deep learning models in seasonal fire forecasting: longer input time series lead to more robust predictions, while integrating spatial information to capture wildfire spatio-temporal dynamics boosts performance. Finally, our results hint that in order to enhance performance at longer forecasting horizons, a larger spatial receptive field needs to be considered.
https://arxiv.org/abs/2502.01550
Understanding time series data is crucial for multiple real-world applications. While large language models (LLMs) show promise in time series tasks, current approaches often rely on numerical data alone, overlooking the multimodal nature of time-dependent information, such as textual descriptions, visual data, and audio signals. Moreover, these methods underutilize LLMs' reasoning capabilities, limiting the analysis to surface-level interpretations instead of deeper temporal and multimodal reasoning. In this position paper, we argue that multimodal LLMs (MLLMs) can enable more powerful and flexible reasoning for time series analysis, enhancing decision-making and real-world applications. We call on researchers and practitioners to leverage this potential by developing strategies that prioritize trust, interpretability, and robust reasoning in MLLMs. Lastly, we highlight key research directions, including novel reasoning paradigms, architectural innovations, and domain-specific applications, to advance time series reasoning with MLLMs.
https://arxiv.org/abs/2502.01477
Multimodal fusion leverages information across modalities to learn better feature representations with the goal of improving performance in fusion-based tasks. However, multimodal datasets, especially in medical settings, are typically smaller than their unimodal counterparts, which can impede the performance of multimodal models. Additionally, the increase in the number of modalities is often associated with an overall increase in the size of the multimodal network, which may be undesirable in medical use cases. Utilizing smaller unimodal encoders may lead to sub-optimal performance, particularly when dealing with high-dimensional clinical data. In this paper, we propose the Modality-INformed knowledge Distillation (MIND) framework, a multimodal model compression approach based on knowledge distillation that transfers knowledge from ensembles of pre-trained deep neural networks of varying sizes into a smaller multimodal student. The teacher models consist of unimodal networks, allowing the student to learn from diverse representations. MIND employs multi-head joint fusion models, as opposed to single-head models, enabling the use of unimodal encoders in the case of unimodal samples without requiring imputation or masking of absent modalities. As a result, MIND generates an optimized multimodal model, enhancing both multimodal and unimodal representations. It can also be leveraged to balance multimodal learning during training. We evaluate MIND on binary and multilabel clinical prediction tasks using time series data and chest X-ray images. Additionally, we assess the generalizability of the MIND framework on three non-medical multimodal multiclass datasets. Experimental results demonstrate that MIND enhances the performance of the smaller multimodal network across all five tasks, as well as various fusion methods and multimodal architectures, compared to state-of-the-art baselines.
https://arxiv.org/abs/2502.01158
This paper presents the use of Kolmogorov-Arnold Networks (KANs) for forecasting the CBOE Volatility Index (VIX). Unlike traditional MLP-based neural networks, which are often criticized for their black-box nature, KAN offers an interpretable approach via learnable spline-based activation functions and symbolification. Based on a parsimonious architecture with symbolic functions, KAN expresses a forecast of the VIX in closed form in terms of explanatory variables, and provides interpretable insights into key characteristics of the VIX, including mean reversion and the leverage effect. Through in-depth empirical analysis across multiple datasets and periods, we show that KANs achieve competitive forecasting performance while requiring significantly fewer parameters than MLP-based neural network models. Our findings demonstrate the capacity and potential of KAN as an interpretable financial time-series forecasting method.
https://arxiv.org/abs/2502.00980
The analysis of wearable sensor data has enabled many successes in several applications. To represent high-sampling-rate time series with sufficient detail, topological data analysis (TDA) has been considered, and TDA has been found to complement other time-series features. Nonetheless, because extracting topological features through TDA is slow and computationally expensive, it is difficult to deploy topological knowledge in various applications. To tackle this problem, knowledge distillation (KD) can be adopted: a technique for model compression and transfer learning that generates a smaller model by transferring knowledge from a larger network. By leveraging multiple teachers in KD, both time-series and topological features can be transferred, and finally a superior student using only time-series data is distilled. Separately, mixup has been widely used as a robust data augmentation technique to enhance model performance during training. Mixup and KD employ similar learning strategies: in KD, the student model learns from the smoothed distribution generated by the teacher model, while mixup creates smoothed labels by blending two labels. This common smoothness is the link between the two methods. In this paper, we analyze the role of mixup in KD with time-series as well as topological persistence, employing multiple teachers. We present a comprehensive analysis of various methods in KD and mixup on wearable sensor data.
https://arxiv.org/abs/2502.00779
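The abstract above links mixup and knowledge distillation through label smoothing. The sketch below shows the two smoothing mechanisms side by side in their standard textbook forms: mixup blends two samples and their one-hot labels, and KD softens teacher logits with a temperature. The toy inputs, the fixed mixing weight, and the temperatures are illustrative assumptions; in practice the mixup weight is usually drawn from a Beta distribution.

```python
import math


def mixup(x_a, y_a, x_b, y_b, lam):
    """Blend two samples and their one-hot labels with weight lam in [0, 1]."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y


def soft_targets(logits, temperature):
    """Temperature-scaled softmax, as commonly applied to teacher logits in KD."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Mixup: the blended label is no longer one-hot.
x_mix, y_mix = mixup([1.0, 2.0, 3.0], [1.0, 0.0], [3.0, 2.0, 1.0], [0.0, 1.0], lam=0.75)

# KD: a higher temperature flattens the teacher distribution.
sharp = soft_targets([4.0, 1.0], temperature=1.0)
smooth = soft_targets([4.0, 1.0], temperature=4.0)
print(y_mix, sharp, smooth)
```

Both routes give the student a target distribution with probability mass on more than one class, which is the shared smoothness the abstract refers to.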
Time Series Classification (TSC) is highly vulnerable to backdoor attacks, posing significant security threats. Existing methods primarily focus on data poisoning during the training phase, designing sophisticated triggers to improve stealthiness and attack success rate (ASR). However, in practical scenarios, attackers often face restrictions in accessing training data. Moreover, it is a challenge for the model to maintain generalization ability on clean test data while remaining vulnerable to poisoned inputs when data is inaccessible. To address these challenges, we propose TrojanTime, a novel two-step training algorithm. In the first stage, we generate a pseudo-dataset using an external arbitrary dataset through target adversarial attacks. The clean model is then continually trained on this pseudo-dataset and its poisoned version. To ensure generalization ability, the second stage employs a carefully designed training strategy, combining logits alignment and batch norm freezing. We evaluate TrojanTime using five types of triggers across four TSC architectures in UCR benchmark datasets from diverse domains. The results demonstrate the effectiveness of TrojanTime in executing backdoor attacks while maintaining clean accuracy. Finally, to mitigate this threat, we propose a defensive unlearning strategy that effectively reduces the ASR while preserving clean accuracy.
https://arxiv.org/abs/2502.00646
How can we identify groups of primate individuals which could be conjectured to drive social structure? To address this question, one of us has collected a time series of data for social interactions between chimpanzees. Here we use a network representation, leading to the task of combining these data into a time series of a single weighted network per time stamp, where different proximities should be given different weights reflecting their relative importance. We optimize these proximity-type weights in a principled way, using an innovative loss function which rewards structural consistency across time. The approach is empirically validated by carefully designed synthetic data. Using statistical tests, we provide a way of identifying groups of individuals that stay related for a significant length of time. Applying the approach to the chimpanzee data set, we detect cliques in the animal social network time series, which can be validated by real-world intuition from prior research and qualitative observations by chimpanzee experts.
https://arxiv.org/abs/2502.00302
Multivariate Time Series Imputation (MTSI) is crucial for many applications, such as healthcare monitoring and traffic management, where incomplete data can compromise decision-making. Existing state-of-the-art methods, like Denoising Diffusion Probabilistic Models (DDPMs), achieve high imputation accuracy; however, they suffer from significant computational costs and are notably time-consuming due to their iterative nature. In this work, we propose CoSTI, an innovative adaptation of Consistency Models (CMs) for the MTSI domain. CoSTI employs Consistency Training to achieve comparable imputation quality to DDPMs while drastically reducing inference times, making it more suitable for real-time applications. We evaluate CoSTI across multiple datasets and missing data scenarios, demonstrating up to a 98% reduction in imputation time with performance on par with diffusion-based models. This work bridges the gap between efficiency and accuracy in generative imputation tasks, providing a scalable solution for handling missing data in critical spatio-temporal systems.
https://arxiv.org/abs/2501.19364
With the rise in global greenhouse gas emissions, accurate large-scale tree canopy height maps are essential for understanding forest structure, estimating above-ground biomass, and monitoring ecological disruptions. To this end, we present a novel approach to generate large-scale, high-resolution canopy height maps over time. Our model accurately predicts canopy height over multiple years given Sentinel-2 time series satellite data. Using GEDI LiDAR data as the ground truth for training the model, we present the first 10m resolution temporal canopy height map of the European continent for the period 2019-2022. As part of this product, we also offer a detailed canopy height map for 2020, providing more precise estimates than previous studies. Our pipeline and the resulting temporal height map are publicly available, enabling comprehensive large-scale monitoring of forests and, hence, facilitating future research and ecological analyses. An interactive viewer is linked from the arXiv abstract page.
https://arxiv.org/abs/2501.19328
Time-series forecasting is crucial for numerous real-world applications including weather prediction and financial market modeling. While temporal-domain methods remain prevalent, frequency-domain approaches can effectively capture multi-scale periodic patterns, reduce sequence dependencies, and naturally denoise signals. However, existing approaches typically train model components for all frequencies under a unified training objective, often leading to mismatched learning speeds: high-frequency components converge faster and risk overfitting, while low-frequency components underfit due to insufficient training time. To deal with this challenge, we propose BEAT (Balanced frEquency Adaptive Tuning), a novel framework that dynamically monitors the training status for each frequency and adaptively adjusts their gradient updates. By recognizing convergence, overfitting, or underfitting for each frequency, BEAT dynamically reallocates learning priorities, moderating gradients for rapid learners and increasing those for slower ones, alleviating the tension between competing objectives across frequencies and synchronizing the overall learning process. Extensive experiments on seven real-world datasets demonstrate that BEAT consistently outperforms state-of-the-art approaches.
https://arxiv.org/abs/2501.19065
The development of image time series retrieval (ITSR) methods is a growing research interest in remote sensing (RS). Given a user-defined image time series (i.e., the query time series), ITSR methods search large archives and retrieve the image time series whose content is similar to the query time series. The existing ITSR methods in RS are designed for unimodal retrieval problems, limiting their usability and versatility. To overcome this issue, we introduce the task of cross-modal text-ITSR for the first time in RS. In particular, we present a self-supervised cross-modal text-image time series retrieval (text-ITSR) method that enables the retrieval of image time series using text sentences as queries, and vice versa. In detail, we focus our attention on text-ITSR in pairs of images (i.e., bitemporal images). The proposed text-ITSR method consists of two key components: 1) modality-specific encoders to model the semantic content of bitemporal images and text sentences with discriminative features; and 2) modality-specific projection heads to align textual and image representations in a shared embedding space. To effectively model the temporal information within the bitemporal images, we introduce two fusion strategies: i) a global feature fusion (GFF) strategy that combines global image features through simple yet effective operators; and ii) a transformer-based feature fusion (TFF) strategy that leverages transformers for fine-grained temporal integration. Extensive experiments conducted on two benchmark RS archives demonstrate the effectiveness of the proposed method in accurately retrieving semantically relevant bitemporal images (or text sentences) to a query text sentence (or bitemporal image). The code of this work is publicly available; the link is given on the arXiv abstract page.
https://arxiv.org/abs/2501.19043
In this paper, we develop a neural network-based approach for time-series prediction in unknown Hamiltonian dynamical systems. Our approach leverages a surrogate model and learns the system dynamics using generalized coordinates (positions) and their conjugate momenta while preserving a constant Hamiltonian. To further enhance long-term prediction accuracy, we introduce an Autoregressive Hamiltonian Neural Network, which incorporates autoregressive prediction errors into the training objective. Additionally, we employ Bayesian data assimilation to refine predictions in real-time using online measurement data. Numerical experiments on a spring-mass system and highly elliptic orbits under gravitational perturbations demonstrate the effectiveness of the proposed method, highlighting its potential for accurate and robust long-term predictions.
https://arxiv.org/abs/2501.18808
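The abstract above describes dynamics expressed in generalized coordinates and conjugate momenta with a constant Hamiltonian, tested on a spring-mass system. The sketch below is not the paper's neural network. It is a classical leapfrog integrator for that same system, included to show what "preserving the Hamiltonian" looks like numerically. The mass, stiffness, step size, and initial state are illustrative assumptions.

```python
def hamiltonian(q, p, m=1.0, k=1.0):
    """Total energy of a spring-mass system: kinetic plus potential."""
    return p * p / (2.0 * m) + 0.5 * k * q * q


def leapfrog(q, p, dt, steps, m=1.0, k=1.0):
    """Integrate Hamilton's equations dq/dt = p/m and dp/dt = -k*q."""
    trajectory = [(q, p)]
    for _ in range(steps):
        p -= 0.5 * dt * k * q   # half step in momentum
        q += dt * p / m         # full step in position
        p -= 0.5 * dt * k * q   # second half step in momentum
        trajectory.append((q, p))
    return trajectory


path = leapfrog(q=1.0, p=0.0, dt=0.01, steps=10_000)
energies = [hamiltonian(q, p) for q, p in path]
drift = max(abs(e - energies[0]) for e in energies)
print(drift)
```

The energy error of this symplectic scheme stays bounded over long horizons instead of accumulating, which is the behavior a Hamiltonian-preserving learned model aims to reproduce.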
Attribute manipulation deals with the problem of changing individual attributes of a data point or a time series, while leaving all other aspects unaffected. This work focuses on the domain of human motion, more precisely karate movement patterns. To the best of our knowledge, it presents the first success at manipulating attributes of human motion data. One of the key requirements for achieving attribute manipulation on human motion is a suitable pose representation. Therefore, we design a novel rotation-based pose representation that enables the disentanglement of the human skeleton and the motion trajectory, while still allowing an accurate reconstruction of the original anatomy. The core idea of the manipulation approach is to use a transformer encoder for discovering high-level semantics, and a diffusion probabilistic model for modeling the remaining stochastic variations. We show that the embedding space obtained from the transformer encoder is semantically meaningful and linear. This enables the manipulation of high-level attributes by discovering their linear direction of change in the semantic embedding space and moving the embedding along that direction. The code and data are publicly available; the link is given on the arXiv abstract page.
https://arxiv.org/abs/2501.18729
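The abstract above manipulates an attribute by finding its linear direction in an embedding space and moving an embedding along it. The sketch below illustrates that idea under a simple assumption: the direction is estimated as the difference between the mean embeddings of two groups. The abstract does not say how the authors find the direction, and the two-dimensional embeddings and group names are hypothetical.

```python
def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def attribute_direction(with_attr, without_attr):
    """Direction of change: difference between the two groups' mean embeddings."""
    mu_with = mean_vector(with_attr)
    mu_without = mean_vector(without_attr)
    return [a - b for a, b in zip(mu_with, mu_without)]


def manipulate(embedding, direction, strength):
    """Move an embedding along the attribute direction by `strength`."""
    return [e + strength * d for e, d in zip(embedding, direction)]


# Hypothetical embeddings for two groups that differ in one attribute.
group_without = [[0.0, 1.0], [0.5, 1.5]]
group_with = [[1.0, 1.0], [1.5, 1.5]]

direction = attribute_direction(group_with, group_without)
edited = manipulate(group_without[0], direction, strength=1.0)
print(direction, edited)
```

In a full pipeline the edited embedding would then be decoded back into motion. That decoding step is outside the scope of this sketch.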