Detecting critical transitions in complex, noisy time-series data is a fundamental challenge across science and engineering. Such transitions may be anticipated by the emergence of a low-dimensional order parameter, whose signature is often masked by high-amplitude stochastic variability. Standard contrastive learning approaches based on deep neural networks, while promising for detecting critical transitions, are often overparameterized and sensitive to irrelevant noise, leading to inaccurate identification of critical points. To address these limitations, we propose a neural network architecture, constructed using the singular value decomposition, together with a strictly semi-orthogonality-constrained training algorithm, to enhance the performance of traditional contrastive learning. Extensive experiments demonstrate that the proposed method matches the performance of traditional contrastive learning techniques in identifying critical transitions while being considerably more lightweight and markedly more resistant to noise.
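The semi-orthogonality constraint can be made concrete in a few lines. The sketch below parameterizes a layer's weight in SVD form, measures each factor's deviation from semi-orthogonality as a penalty, and shows the SVD-based projection onto the nearest semi-orthogonal factor. The penalty form and all names are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Factorize a weight matrix W (m x n) as U @ diag(s) @ Vt, as in an
# SVD-parameterized layer. U (m x r) and Vt (r x n) should stay
# semi-orthogonal: U.T @ U = I_r and Vt @ Vt.T = I_r.
m, n, r = 8, 6, 4
U = rng.normal(size=(m, r))
s = rng.uniform(0.5, 2.0, size=r)
Vt = rng.normal(size=(r, n))

def semi_orth_penalty(A, rows_orthonormal):
    """Frobenius-norm deviation of A from semi-orthogonality."""
    G = A @ A.T if rows_orthonormal else A.T @ A
    return np.sum((G - np.eye(G.shape[0])) ** 2)

penalty = (semi_orth_penalty(U, rows_orthonormal=False)
           + semi_orth_penalty(Vt, rows_orthonormal=True))

# A strictly constrained variant instead re-projects each factor onto the
# Stiefel manifold after every gradient step, e.g. via its own thin SVD.
P, _, Qt = np.linalg.svd(U, full_matrices=False)
U_proj = P @ Qt          # nearest matrix to U with orthonormal columns
```

Soft-penalty training keeps the constraint approximate, while the projection step enforces it exactly at some extra cost per update.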
https://arxiv.org/abs/2512.12523
Explainable Artificial Intelligence (XAI) is increasingly required in computational economics, where machine-learning forecasters can outperform classical econometric models but remain difficult to audit and use for policy. This survey reviews and organizes the growing literature on XAI for economic time series, where autocorrelation, non-stationarity, seasonality, mixed frequencies, and regime shifts can make standard explanation techniques unreliable or economically implausible. We propose a taxonomy that classifies methods by (i) explanation mechanism: propagation-based approaches (e.g., Integrated Gradients, Layer-wise Relevance Propagation), perturbation and game-theoretic attribution (e.g., permutation importance, LIME, SHAP), and function-based global tools (e.g., Accumulated Local Effects); (ii) time-series compatibility, including preservation of temporal dependence, stability over time, and respect for data-generating constraints. We synthesize time-series-specific adaptations such as vector- and window-based formulations (e.g., Vector SHAP, WindowSHAP) that reduce lag fragmentation and computational cost while improving interpretability. We also connect explainability to causal inference and policy analysis through interventional attributions (Causal Shapley values) and constrained counterfactual reasoning. Finally, we discuss intrinsically interpretable architectures (notably attention-based transformers) and provide guidance for decision-grade applications such as nowcasting, stress testing, and regime monitoring, emphasizing attribution uncertainty and explanation dynamics as indicators of structural change.
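The window-based attributions surveyed above (e.g., WindowSHAP) can be illustrated with permutation importance over blocks of adjacent lags, which keeps credit from fragmenting across strongly correlated single lags. A minimal sketch on a toy linear forecaster, with all details assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy setup: the outcome depends only on the four most recent lags.
T, L = 400, 12
X = rng.normal(size=(T, L))
y = X[:, :4].sum(axis=1) + 0.1 * rng.normal(size=T)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # stand-in forecaster

def mse(pred):
    return np.mean((pred - y) ** 2)

base = mse(X @ beta)

def window_importance(cols):
    """Permute a whole block of adjacent lags jointly, in the spirit of
    window-based attributions: importance is assigned per window, not
    per individual lag."""
    Xp = X.copy()
    Xp[:, cols] = rng.permutation(Xp[:, cols], axis=0)
    return mse(Xp @ beta) - base

windows = [np.arange(0, 4), np.arange(4, 8), np.arange(8, 12)]
scores = [window_importance(w) for w in windows]
```

Permuting the whole window jointly also preserves the within-window temporal dependence of the permuted block, one of the time-series compatibility concerns the taxonomy raises.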
https://arxiv.org/abs/2512.12506
Real-time decoding of target variables from multiple simultaneously recorded neural time-series modalities, such as discrete spiking activity and continuous field potentials, is important across various neuroscience applications. However, a major challenge for doing so is that different neural modalities can have different timescales (i.e., sampling rates) and different probabilistic distributions, or can even be missing at some time-steps. Existing nonlinear models of multimodal neural activity do not address different timescales or missing samples across modalities. Further, some of these models do not allow for real-time decoding. Here, we develop a learning framework that can enable real-time recursive decoding while nonlinearly aggregating information across multiple modalities with different timescales and distributions and with missing samples. This framework consists of 1) a multiscale encoder that nonlinearly aggregates information after learning within-modality dynamics to handle different timescales and missing samples in real time, 2) a multiscale dynamical backbone that extracts multimodal temporal dynamics and enables real-time recursive decoding, and 3) modality-specific decoders to account for different probabilistic distributions across modalities. In both simulations and three distinct multiscale brain datasets, we show that our model can aggregate information across modalities with different timescales and distributions and missing samples to improve real-time target decoding. Further, our method outperforms various linear and nonlinear multimodal benchmarks in doing so.
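A minimal sketch of the mixed-timescale and missing-sample problem: each modality's latent is carried forward (with decay) whenever its sample is missing, and the latents are aggregated causally at every step. The carry-forward rule and the sum aggregator are simple stand-ins for the paper's learned within-modality dynamics and nonlinear encoder:

```python
import numpy as np

rng = np.random.default_rng(9)

# Two modalities on different timescales: spikes every step, field
# potentials every 5 steps, plus occasional dropped spike samples.
T = 20
spikes = rng.poisson(2.0, size=T).astype(float)
lfp = np.full(T, np.nan)
lfp[::5] = rng.normal(size=4)        # sampled every 5 steps
spikes[[3, 11]] = np.nan             # missing samples

def causal_fuse(streams, decay=0.8):
    """Real-time fusion: carry each modality's last latent forward
    (with decay) whenever its sample is missing, then aggregate."""
    latents = np.zeros(len(streams))
    fused = []
    for i in range(len(streams[0])):
        for m, s in enumerate(streams):
            latents[m] = s[i] if np.isfinite(s[i]) else decay * latents[m]
        fused.append(latents.sum())  # the paper uses a learned nonlinear aggregator
    return np.array(fused)

z = causal_fuse([spikes, lfp])
```

Because the update is recursive and causal, a decoder reading `z` at step `i` never needs future samples, which is the property that makes real-time decoding possible.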
https://arxiv.org/abs/2512.12462
Accurate volatility forecasting is essential in banking, investment, and risk management, because expectations about future market movements directly influence current decisions. This study proposes a hybrid modelling framework that integrates a Stochastic Volatility (SV) model with a Long Short-Term Memory (LSTM) neural network. The SV model improves statistical precision and captures latent volatility dynamics, especially in response to unforeseen events, while the LSTM network enhances the model's ability to detect complex nonlinear patterns in financial time series. The forecasting is conducted using daily data from the S&P 500 index, covering the period from January 1, 1998 to December 31, 2024. A rolling-window approach is employed to train the model and generate one-step-ahead volatility forecasts. The performance of the hybrid SV-LSTM model is evaluated through both statistical testing and investment simulations. The results show that the hybrid approach outperforms both the standalone SV and LSTM models and contributes to the development of volatility modelling techniques, providing a foundation for improving risk assessment and strategic investment planning in the context of the S&P 500.
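The rolling-window, one-step-ahead protocol can be sketched as follows. Since the SV-LSTM hybrid cannot be rebuilt from the abstract alone, a RiskMetrics-style EWMA estimator stands in for the volatility model, and all data are synthetic:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic daily returns with slowly varying volatility, standing in
# for the S&P 500 sample used in the paper.
T = 600
true_vol = 0.01 * (1.0 + 0.5 * np.sin(np.arange(T) / 50.0))
returns = rng.normal(0.0, true_vol)

window = 250                 # roughly one trading year of daily data

def ewma_vol(r, lam=0.94):
    """RiskMetrics-style EWMA volatility: a simple, reproducible
    stand-in for the SV-LSTM hybrid."""
    v = np.var(r)
    for x in r:
        v = lam * v + (1 - lam) * x * x
    return np.sqrt(v)

# Rolling-window, one-step-ahead evaluation: refit on the trailing
# window, forecast the next day, then roll the window forward.
forecasts, targets = [], []
for t in range(window, T - 1):
    forecasts.append(ewma_vol(returns[t - window:t]))
    targets.append(abs(returns[t]))      # realized-volatility proxy

forecasts = np.array(forecasts)
targets = np.array(targets)
mae = np.mean(np.abs(forecasts - targets))
```

The same loop structure accommodates any volatility model: only the body of `ewma_vol` would be replaced by the hybrid's refit-and-predict step.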
https://arxiv.org/abs/2512.12250
Monitoring forecasting systems is critical for customer satisfaction, profitability, and operational efficiency in large-scale retail businesses. We propose The Forecast Critic, a system that leverages Large Language Models (LLMs) for automated forecast monitoring, taking advantage of their broad world knowledge and strong "reasoning" capabilities. As a prerequisite for this, we systematically evaluate the ability of LLMs to assess time series forecast quality, focusing on three key questions. (1) Can LLMs be deployed to perform forecast monitoring and identify obviously unreasonable forecasts? (2) Can LLMs effectively incorporate unstructured exogenous features to assess what a reasonable forecast looks like? (3) How does performance vary across model sizes and reasoning capabilities, measured across state-of-the-art LLMs? We present three experiments, covering both synthetic and real-world forecasting data. Our results show that LLMs can reliably detect and critique poor forecasts, such as those plagued by temporal misalignment, trend inconsistencies, and spike errors. The best-performing model we evaluated achieves an F1 score of 0.88, somewhat below human-level performance (F1 score: 0.97). We also demonstrate that multi-modal LLMs can effectively incorporate unstructured contextual signals to refine their assessment of the forecast. Models correctly identify missing or spurious promotional spikes when provided with historical context about past promotions (F1 score: 0.84). Lastly, we demonstrate that these techniques succeed in identifying inaccurate forecasts on the real-world M5 time series dataset, with unreasonable forecasts having an sCRPS at least 10% higher than that of reasonable forecasts. These findings suggest that LLMs, even without domain-specific fine-tuning, may provide a viable and scalable option for automated forecast monitoring and evaluation.
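A sample-based sCRPS check like the one behind the M5 comparison might look as follows. The scaling convention (dividing by the mean absolute target) is an assumption, since the abstract does not spell out its exact variant:

```python
import numpy as np

rng = np.random.default_rng(3)

def crps_samples(samples, y):
    """Sample-based CRPS estimator: E|X - y| - 0.5 * E|X - X'|."""
    samples = np.asarray(samples, dtype=float)
    term1 = np.mean(np.abs(samples - y))
    term2 = 0.5 * np.mean(np.abs(samples[:, None] - samples[None, :]))
    return term1 - term2

def scaled_crps(sample_paths, y_true):
    """CRPS averaged over the horizon, scaled by mean |y|: one common
    sCRPS convention (assumed here, not taken from the paper)."""
    num = np.mean([crps_samples(sample_paths[:, h], y_true[h])
                   for h in range(len(y_true))])
    return num / np.mean(np.abs(y_true))

horizon, n_samples = 8, 200
y = np.linspace(10.0, 12.0, horizon)
good = y[None, :] + rng.normal(0.0, 0.3, size=(n_samples, horizon))
bad = y[None, :] + 3.0 + rng.normal(0.0, 0.3, size=(n_samples, horizon))
```

On this toy example the biased forecast scores markedly worse, mirroring the paper's finding that LLM-flagged "unreasonable" forecasts carry a higher sCRPS.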
https://arxiv.org/abs/2512.12059
The development of foundation models for functional magnetic resonance imaging (fMRI) time series holds significant promise for predicting phenotypes related to disease and cognition. Current models, however, are often trained using a mask-and-reconstruct objective on small brain regions. This focus on low-level information leads to representations that are sensitive to noise and temporal fluctuations, necessitating extensive fine-tuning for downstream tasks. We introduce Brain-Semantoks, a self-supervised framework designed specifically to learn abstract representations of brain dynamics. Its architecture is built on two core innovations: a semantic tokenizer that aggregates noisy regional signals into robust tokens representing functional networks, and a self-distillation objective that enforces representational stability across time. We show that this objective is stabilized through a novel training curriculum, ensuring the model robustly learns meaningful features from low signal-to-noise time series. We demonstrate that the learned representations enable strong performance on a variety of downstream tasks even when only using a linear probe. Furthermore, we provide comprehensive scaling analyses indicating that more unlabeled data reliably yields out-of-distribution performance gains without domain adaptation.
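A self-distillation objective with stability across time can be sketched with a student encoder matched to an exponential-moving-average (EMA) teacher on temporally shifted crops of the same series. The encoder, loss, and update rule below are illustrative assumptions, not the paper's recipe:

```python
import numpy as np

rng = np.random.default_rng(11)

d_in, d_out = 32, 8
W_student = rng.normal(size=(d_out, d_in)) * 0.1
W_teacher = W_student.copy()

def encode(W, x):
    return np.tanh(W @ x)

def distill_step(x_a, x_b, lr=0.005, ema=0.99):
    """One update: the student sees crop A, the teacher (no gradient)
    sees crop B; the student is pulled toward the teacher's output."""
    global W_student, W_teacher
    target = encode(W_teacher, x_b)
    z = encode(W_student, x_a)
    err = z - target
    grad = (err * (1 - z**2))[:, None] @ x_a[None, :]  # grad of 0.5*||err||^2
    W_student = W_student - lr * grad
    W_teacher = ema * W_teacher + (1 - ema) * W_student  # EMA teacher update
    return 0.5 * float(np.sum(err**2))

x = rng.normal(size=d_in + 2)
crop_a, crop_b = x[:d_in], x[2:d_in + 2]   # two temporally shifted crops
losses = [distill_step(crop_a, crop_b) for _ in range(100)]
```

The slow EMA teacher is what keeps the target stable; the training curriculum the abstract mentions presumably plays a similar stabilizing role against representational collapse.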
https://arxiv.org/abs/2512.11582
The calving fronts of marine-terminating glaciers undergo constant changes. These changes significantly affect the glacier's mass and dynamics, demanding continuous monitoring. To address this need, deep learning models have been developed that can automatically delineate the calving front in Synthetic Aperture Radar imagery. However, these models often struggle to correctly classify areas affected by seasonal conditions such as ice melange or snow-covered surfaces. To address this issue, we propose to process multiple frames from a satellite image time series of the same glacier in parallel and exchange temporal information between the corresponding feature maps to stabilize each prediction. We integrate our approach into the current state-of-the-art architecture Tyrion and accomplish a new state-of-the-art performance on the CaFFe benchmark dataset. In particular, we achieve a Mean Distance Error of 184.4 m and a mean Intersection over Union of 83.6%.
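A Mean Distance Error between two delineated fronts can be computed as the symmetric average nearest-point distance. This is one plausible reading of the metric; the CaFFe benchmark's exact definition may differ:

```python
import numpy as np

def mean_distance_error(pred_pts, true_pts):
    """Symmetric mean distance between two delineated fronts, given as
    (N, 2) arrays of coordinates: average each front's nearest-point
    distance to the other, then average the two directions."""
    d = np.linalg.norm(pred_pts[:, None, :] - true_pts[None, :, :], axis=-1)
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())

# Two toy fronts a constant 5 units apart.
xs = np.linspace(0.0, 100.0, 50)
true_front = np.stack([xs, np.zeros_like(xs)], axis=1)
pred_front = np.stack([xs, 5.0 * np.ones_like(xs)], axis=1)
mde = mean_distance_error(pred_front, true_front)
```

Averaging both directions prevents a short predicted front that hugs part of the true front from scoring artificially well.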
https://arxiv.org/abs/2512.11560
The standard paradigm for training deep learning models on sensor data assumes that more data is always better. However, raw sensor streams are often imbalanced and contain significant redundancy, meaning that not all data points contribute equally to model generalization. In this paper, we show that, in some cases, "less is more" when considering datasets. We do this by reframing the data selection problem: rather than tuning model hyperparameters, we fix the model and optimize the composition of the training data itself. We introduce a framework for discovering the optimal "training diet" from a large, unlabeled time series corpus. Our framework first uses a large-scale encoder and k-means clustering to partition the dataset into distinct, behaviorally consistent clusters. These clusters represent the fundamental "ingredients" available for training. We then employ the Optuna optimization framework to search the high-dimensional space of possible data mixtures. For each trial, Optuna proposes a specific sampling ratio for each cluster, and a new training set is constructed based on this recipe. A smaller target model is then trained and evaluated. Our experiments reveal that this data-centric search consistently discovers data mixtures that yield models with significantly higher performance compared to baselines trained on the entire dataset. Specifically, evaluated on the PMSM dataset, our method improved performance from a baseline MSE of 1.70 to 1.37, a 19.41% improvement.
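The cluster-ratio search can be sketched end to end. Here plain random search stands in for Optuna's sampler and a least-squares model stands in for the target network, so the sketch runs self-contained:

```python
import numpy as np

rng = np.random.default_rng(4)

# Three behaviour clusters; the third is pure noise that hurts training.
def make_cluster(n, kind):
    x = rng.uniform(-1.0, 1.0, size=(n, 1))
    y = 2.0 * x[:, 0] if kind == "clean" else rng.normal(size=n)
    return x, y

clusters = [make_cluster(300, "clean"), make_cluster(300, "clean"),
            make_cluster(300, "noise")]
x_val, y_val = make_cluster(200, "clean")

def train_eval(ratios):
    """Build a training set from per-cluster sampling ratios (the
    'recipe'), fit a tiny model, return validation MSE. Deterministic
    subsampling keeps trials comparable."""
    local = np.random.default_rng(0)
    xs, ys = [], []
    for (x, y), r in zip(clusters, ratios):
        k = max(1, int(r * len(y)))
        idx = local.choice(len(y), size=k, replace=False)
        xs.append(x[idx])
        ys.append(y[idx])
    X, Y = np.vstack(xs), np.concatenate(ys)
    w, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return float(np.mean((x_val @ w - y_val) ** 2))

# Random search over mixture ratios stands in for Optuna's TPE sampler.
trials = [rng.uniform(0.05, 1.0, size=3) for _ in range(40)]
best = min(trials, key=train_eval)
```

In the paper's pipeline each trial would instead call `trial.suggest_float` per cluster and train the smaller target model; the structure of the objective is the same.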
https://arxiv.org/abs/2512.11546
Time series forecasting predicts future values from past data. In real-world settings, some anomalous events have lasting effects and influence the forecast, while others are short-lived and should be ignored. Standard forecasting models fail to make this distinction, often either overreacting to noise or missing persistent shifts. We propose Co-TSFA (Contrastive Time Series Forecasting with Anomalies), a regularization framework that learns when to ignore anomalies and when to respond. Co-TSFA generates input-only and input-output augmentations to model forecast-irrelevant and forecast-relevant anomalies, and introduces a latent-output alignment loss that ties representation changes to forecast changes. This encourages invariance to irrelevant perturbations while preserving sensitivity to meaningful distributional shifts. Experiments on the Traffic and Electricity benchmarks, as well as on a real-world cash-demand dataset, demonstrate that Co-TSFA improves performance under anomalous conditions while maintaining accuracy on normal data. An anonymized GitHub repository with the implementation of Co-TSFA is provided and will be made public upon acceptance.
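One way to write a latent-output alignment loss that ties representation changes to forecast changes; the exact form below is an assumption, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy linear encoder/forecaster pair: z = W_e x, yhat = W_d z.
d_in, d_z, d_out = 16, 4, 3
W_e = rng.normal(size=(d_z, d_in)) / np.sqrt(d_in)
W_d = rng.normal(size=(d_out, d_z)) / np.sqrt(d_z)

def encode(x):
    return W_e @ x

def forecast(z):
    return W_d @ z

def alignment_loss(x, x_aug):
    """Penalize mismatch between the size of the latent change and the
    size of the forecast change: forecast-irrelevant perturbations are
    pushed toward latent invariance, while forecast-relevant shifts
    keep a proportionally large latent response."""
    dz = np.linalg.norm(encode(x_aug) - encode(x))
    dy = np.linalg.norm(forecast(encode(x_aug)) - forecast(encode(x)))
    return (dz - dy) ** 2

x = rng.normal(size=d_in)
x_only = x + 0.01 * rng.normal(size=d_in)   # input-only augmentation
loss = alignment_loss(x, x_only)
```

For an input-only augmentation the forecast target is unchanged, so minimizing this term drives the latent change toward zero, which is the invariance behaviour the abstract describes.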
https://arxiv.org/abs/2512.11526
Fine-scale forest monitoring is essential for understanding canopy structure and its dynamics, which are key indicators of carbon stocks, biodiversity, and forest health. Deep learning is particularly effective for this task, as it integrates spectral, temporal, and spatial signals that jointly reflect the canopy structure. To address this need, we introduce THREASURE-Net, a novel end-to-end framework for Tree Height Regression And Super-Resolution. The model is trained on Sentinel-2 time series using reference height metrics derived from LiDAR HD data at multiple spatial resolutions over Metropolitan France to produce annual height maps. We evaluate three model variants, producing tree-height predictions at 2.5 m, 5 m, and 10 m resolution. THREASURE-Net relies neither on a pretrained model nor on very-high-resolution reference optical imagery to train its super-resolution module; instead, it learns solely from LiDAR-derived height information. Our approach outperforms existing state-of-the-art methods based on Sentinel data and is competitive with methods based on very high resolution imagery. It can be deployed to generate high-precision annual canopy-height maps, achieving mean absolute errors of 2.62 m, 2.72 m, and 2.88 m at 2.5 m, 5 m, and 10 m resolution, respectively. These results highlight the potential of THREASURE-Net for scalable and cost-effective structural monitoring of temperate forests using only freely available satellite data. The source code for THREASURE-Net is available at: this https URL.
https://arxiv.org/abs/2512.11524
We study long-horizon exogenous-only temperature forecasting, a challenging univariate setting in which only past values of the indoor temperature are used for prediction, using linear and Transformer-family models. We evaluate Linear, NLinear, DLinear, Transformer, Informer, and Autoformer under standardized train, validation, and test splits. Results show that the linear baselines (Linear, NLinear, DLinear) consistently outperform the more complex Transformer-family architectures, with DLinear achieving the best overall accuracy across all splits. These findings highlight that carefully designed linear models remain strong baselines for time series forecasting in challenging exogenous-only settings.
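DLinear, the best performer here, is simple enough to sketch in full: decompose each lookback window into a moving-average trend plus a seasonal remainder, and apply one linear head to each component. A least-squares fit stands in for gradient training:

```python
import numpy as np

def moving_average(x, k=5):
    """Trend via a centred, edge-padded moving average, mirroring
    DLinear's series decomposition."""
    pad = k // 2
    xp = np.concatenate([np.repeat(x[:1], pad), x, np.repeat(x[-1:], pad)])
    return np.convolve(xp, np.ones(k) / k, mode="valid")

def dlinear_forecast(x, W_trend, W_seas):
    """DLinear: one linear map per component, forecasts summed."""
    trend = moving_average(x)
    return W_trend @ trend + W_seas @ (x - trend)

lookback, horizon = 48, 12
t = np.arange(300, dtype=float)
series = 0.05 * t + np.sin(2 * np.pi * t / 24)   # trend + daily cycle

# Fit both linear heads jointly by least squares over sliding windows.
X, Y = [], []
for i in range(len(series) - lookback - horizon):
    win = series[i:i + lookback]
    tr = moving_average(win)
    X.append(np.concatenate([tr, win - tr]))
    Y.append(series[i + lookback:i + lookback + horizon])
X, Y = np.array(X), np.array(Y)
W, *_ = np.linalg.lstsq(X, Y, rcond=None)
W_trend, W_seas = W[:lookback].T, W[lookback:].T

win = series[-lookback - horizon:-horizon]
pred = dlinear_forecast(win, W_trend, W_seas)
mse = np.mean((pred - series[-horizon:]) ** 2)
```

Since trend and seasonal parts sum back to the window, the model is still linear in the input; the decomposition only re-parameterizes it in a way that suits trending series.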
https://arxiv.org/abs/2512.10866
Sensor-based human activity recognition (HAR) mines activity patterns from time-series sensory data. In realistic scenarios, variations across individuals, devices, environments, and time introduce significant distributional shifts for the same activities. Recent efforts attempt to solve this challenge by applying or adapting existing out-of-distribution (OOD) algorithms, but only in certain distribution-shift scenarios (e.g., cross-device or cross-position), lacking comprehensive insight into the effectiveness of these algorithms. For instance, are OOD algorithms necessary for HAR? Which OOD algorithm performs best? In this paper, we fill this gap by proposing HAROOD, a comprehensive benchmark for HAR in OOD settings. We define 4 OOD scenarios: cross-person, cross-position, cross-dataset, and cross-time, and build a testbed covering 6 datasets, 16 comparative methods (implemented with CNN-based and Transformer-based architectures), and two model selection protocols. We then conduct extensive experiments and present several findings for future research; for example, no single method consistently outperforms the others, highlighting substantial opportunity for advancement. Our codebase is highly modular and easy to extend with new datasets, algorithms, comparisons, and analyses, in the hope of facilitating research in OOD-based HAR. Our implementation is released and can be found at this https URL.
https://arxiv.org/abs/2512.10807
Attention improves representation learning over RNNs, but its discrete nature limits continuous-time (CT) modeling. We introduce the Neuronal Attention Circuit (NAC), a novel, biologically plausible CT-attention mechanism that reformulates attention-logit computation as the solution of a linear first-order ODE with nonlinear interlinked gates derived by repurposing the C. elegans Neuronal Circuit Policies (NCPs) wiring mechanism. NAC replaces dense projections with sparse sensory gates for key-query projections and a sparse backbone network with two heads for computing content-target and learnable time-constant gates, enabling efficient adaptive dynamics. NAC supports three attention-logit computation modes: (i) explicit Euler integration, (ii) an exact closed-form solution, and (iii) a steady-state approximation. To reduce memory intensity, we implement a sparse Top-K pairwise concatenation scheme that selectively curates key-query interactions. We provide rigorous theoretical guarantees, including state stability, bounded approximation errors, and universal approximation. Empirically, we implement NAC in diverse domains, including irregular time-series classification, lane-keeping for autonomous vehicles, and industrial prognostics. We observe that NAC matches or outperforms competing baselines in accuracy and occupies an intermediate position in runtime and memory efficiency compared with several CT baselines.
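The three attention-logit computation modes map directly onto the scalar linear ODE tau * dz/dt = -z + c, shown below with a fixed content target c and time constant tau (the learned gating that produces c and tau in NAC is omitted):

```python
import numpy as np

# Linear first-order ODE for one attention logit z(t):
#   tau * dz/dt = -z + c,  content target c, time constant tau.
tau, c, z0, t = 0.5, 2.0, 0.0, 1.0

# (ii) exact closed-form solution
z_exact = c + (z0 - c) * np.exp(-t / tau)

# (i) explicit Euler integration
steps = 1000
dt = t / steps
z = z0
for _ in range(steps):
    z += dt * (-(z - c) / tau)
z_euler = z

# (iii) steady-state approximation (t -> infinity): dz/dt = 0 gives z = c
z_steady = c
```

Euler trades accuracy for simplicity, the closed form is exact for the linear case, and the steady state drops the transient entirely, which is cheapest when the dynamics settle quickly relative to the sampling interval.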
https://arxiv.org/abs/2512.10282
Time series forecasting is a critical task for artificial intelligence with numerous real-world applications. Traditional approaches primarily rely on historical time series data to predict the future values. However, in practical scenarios, this is often insufficient for accurate predictions due to the limited information available. To address this challenge, multimodal time series forecasting methods which incorporate additional data modalities, mainly text data, alongside time series data have been explored. In this work, we introduce the Adaptive Information Routing (AIR) framework, a novel approach for multimodal time series forecasting. Unlike existing methods that treat text data on par with time series data as interchangeable auxiliary features for forecasting, AIR leverages text information to dynamically guide the time series model by controlling how and to what extent multivariate time series information should be combined. We also present a text-refinement pipeline that employs a large language model to convert raw text data into a form suitable for multimodal forecasting, and we introduce a benchmark that facilitates multimodal forecasting experiments based on this pipeline. Experiment results with the real world market data such as crude oil price and exchange rates demonstrate that AIR effectively modulates the behavior of the time series model using textual inputs, significantly enhancing forecasting accuracy in various time series forecasting tasks.
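The routing idea, text controlling how channels are mixed rather than entering as another input feature, can be sketched with a sigmoid gate over a channel-mixing matrix. Shapes and the gating form are assumptions, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(10)

# A text embedding gates how strongly the variables of a multivariate
# series inform one another, instead of being appended as extra inputs.
n_vars, d_text, lookback = 4, 8, 24
series = rng.normal(size=(n_vars, lookback))
text_emb = rng.normal(size=d_text)     # e.g. from an LLM refinement step

W_gate = rng.normal(size=(n_vars * n_vars, d_text)) / np.sqrt(d_text)
gate = 1.0 / (1.0 + np.exp(-(W_gate @ text_emb)))   # sigmoid, in (0, 1)
M = gate.reshape(n_vars, n_vars)       # text-conditioned mixing weights

mixed = M @ series                     # how much each series informs the others
```

The key property is that the text only modulates the mixing weights: with no informative text the gate stays near its prior, and the time-series model's behaviour is left largely unchanged.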
https://arxiv.org/abs/2512.10229
Time series forecasting requires models that can efficiently capture complex temporal dependencies, especially in large-scale and high-dimensional settings. While Transformer-based architectures excel at modeling long-range dependencies, their quadratic computational complexity limits scalability and adaptability. To overcome these challenges, we introduce DB2-TransF, a novel Transformer-inspired architecture that replaces the self-attention mechanism with a learnable Daubechies wavelet coefficient layer. This wavelet-based module efficiently captures multi-scale local and global patterns and enhances the modeling of correlations across multiple time series. Extensive experiments on 13 standard forecasting benchmarks demonstrate that DB2-TransF achieves comparable or superior predictive accuracy to conventional Transformers while substantially reducing memory usage. These results position DB2-TransF as a scalable and resource-efficient framework for advanced time series forecasting. Our code is available at this https URL
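The natural (non-learnable) starting point for such a layer is the db2 filter bank itself; a learnable variant would treat these coefficients as trainable parameters. One discrete wavelet transform level with the standard db2 coefficients, checking energy conservation of the orthonormal filter bank:

```python
import numpy as np

# Daubechies db2 analysis filters (4 taps).
s3 = np.sqrt(3.0)
h = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2))  # low-pass
g = np.array([h[3], -h[2], h[1], -h[0]])                           # high-pass

def dwt_step(x):
    """One DWT level: filter, then downsample by 2, with periodic
    (wrap-around) extension for simplicity."""
    n = len(x)
    xp = np.concatenate([x, x[:3]])
    approx = np.array([np.dot(h, xp[i:i + 4]) for i in range(0, n, 2)])
    detail = np.array([np.dot(g, xp[i:i + 4]) for i in range(0, n, 2)])
    return approx, detail

t = np.arange(64, dtype=float)
x = np.sin(2 * np.pi * t / 16) + 0.01 * t      # smooth test signal
approx, detail = dwt_step(x)

# Orthonormal filter bank: ||x||^2 == ||approx||^2 + ||detail||^2.
energy_in = np.sum(x**2)
energy_out = np.sum(approx**2) + np.sum(detail**2)
```

Stacking `dwt_step` on the approximation coefficients yields the multi-scale decomposition that stands in for attention, at linear rather than quadratic cost in sequence length.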
https://arxiv.org/abs/2512.10051
In-context learning with attention enables large neural networks to make context-specific predictions by selectively focusing on relevant examples. Here, we adapt this idea to supervised learning procedures such as lasso regression and gradient boosting, for tabular data. Our goals are to (1) flexibly fit personalized models for each prediction point and (2) retain model simplicity and interpretability. Our method fits a local model for each test observation by weighting the training data according to attention, a supervised similarity measure that emphasizes features and interactions that are predictive of the outcome. Attention weighting allows the method to adapt to heterogeneous data in a data-driven way, without requiring cluster or similarity pre-specification. Further, our approach is uniquely interpretable: for each test observation, we identify which features are most predictive and which training observations are most relevant. We then show how to use attention weighting for time series and spatial data, and we present a method for adapting pretrained tree-based models to distributional shift using attention-weighted residual corrections. Across real and simulated datasets, attention weighting improves predictive performance while preserving interpretability, and theory shows that attention-weighting linear models attain lower mean squared error than the standard linear model under mixture-of-models data-generating processes with known subgroup structure.
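The core procedure, a personalized weighted-least-squares fit per test point with supervised similarity weights, can be sketched as follows. Squared correlation stands in for the paper's attention measure of feature relevance, and all other details are illustrative:

```python
import numpy as np

rng = np.random.default_rng(7)

# Nonlinear outcome plus one irrelevant feature.
n = 400
x0 = rng.uniform(0.0, 2.0, size=n)
x1 = rng.normal(size=n)                # carries no signal
X = np.column_stack([x0, x1])
y = x0**2 + 0.05 * rng.normal(size=n)

# Supervised similarity: weight each feature's distance by how
# predictive it is of y, so irrelevant features barely count.
relevance = np.array([np.corrcoef(X[:, j], y)[0, 1] ** 2 for j in range(2)])

def local_fit_predict(x_test, bandwidth=0.5):
    """Personalized model: weighted least squares around one test
    point, with attention-style similarity weights on training rows."""
    d2 = ((X - x_test) ** 2 * relevance).sum(axis=1)
    w = np.exp(-d2 / (2 * bandwidth**2))
    Xb = np.column_stack([np.ones(n), X])
    beta = np.linalg.solve(Xb.T @ (Xb * w[:, None]) + 1e-6 * np.eye(3),
                           (Xb * w[:, None]).T @ y)
    return np.array([1.0, *x_test]) @ beta

# One global linear model for comparison.
Xb = np.column_stack([np.ones(n), X])
beta_g, *_ = np.linalg.lstsq(Xb, y, rcond=None)
x_test = np.array([2.0, 0.0])
err_local = abs(local_fit_predict(x_test) - 4.0)
err_global = abs(np.array([1.0, *x_test]) @ beta_g - 4.0)
```

The same weights also give the interpretability claimed in the abstract: sorting training rows by `w` shows which observations drove this particular prediction.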
https://arxiv.org/abs/2512.09912
Always-on sensors are increasingly expected to carry a variety of tiny neural networks and to continuously perform inference on time series of the data they sense. To fit lifetime and energy-consumption requirements when operating on battery, such hardware uses microcontrollers (MCUs) with a tiny memory budget, e.g., 128 kB of RAM. In this context, optimizing data flows across neural network layers becomes crucial. In this paper, we introduce TinyDéjà Vu, a new framework and novel algorithms we designed to drastically reduce the RAM footprint required by inference using various tiny ML models for sensor-data time series on typical microcontroller hardware. We publish the implementation of TinyDéjà Vu as open source, and we perform reproducible benchmarks on hardware. We show that TinyDéjà Vu can save more than 60% of RAM usage and eliminate up to 90% of redundant compute on overlapping sliding-window inputs.
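The redundancy on overlapping sliding windows comes from recomputing per-frame work for every window. A sketch of the caching idea (illustrative, not TinyDéjà Vu's actual algorithm): keep per-frame features in a ring buffer so each frame is processed exactly once:

```python
from collections import deque

# Windows of length W with hop H overlap by W - H samples, so per-frame
# features can be computed once and reused across windows.
W, H = 8, 2

calls = 0
def frame_feature(x):
    """Stand-in for an expensive per-sample computation (e.g. the first
    layers of a small CNN applied to one frame)."""
    global calls
    calls += 1
    return x * x

def windowed_inference(stream):
    cache = deque(maxlen=W)              # ring buffer of per-frame features
    outputs = []
    for i, x in enumerate(stream):
        cache.append(frame_feature(x))   # each frame processed exactly once
        if len(cache) == W and (i - W + 1) % H == 0:
            outputs.append(sum(cache))   # cheap head over cached features
    return outputs

stream = list(range(32))
outs = windowed_inference(stream)
naive_calls = len(outs) * W              # recomputing every window from scratch
```

On this stream the cached version makes 32 per-frame calls where the naive version would make `13 * 8 = 104`, and the ring buffer caps RAM at `W` cached features regardless of stream length.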
https://arxiv.org/abs/2512.09786
Cloud cover in multispectral imagery (MSI) significantly hinders early-season crop mapping by corrupting spectral information. Existing Vision Transformer (ViT)-based time-series reconstruction methods, like SMTS-ViT, often employ coarse temporal embeddings that aggregate entire sequences, causing substantial information loss and reducing reconstruction accuracy. To address these limitations, this study proposes a Video Vision Transformer (ViViT)-based framework with temporal-spatial fusion embedding for MSI reconstruction in cloud-covered regions. Non-overlapping tubelets are extracted via 3D convolution with a constrained temporal span (t = 2), ensuring local temporal coherence while reducing cross-day information degradation. Both MSI-only and SAR-MSI fusion scenarios are considered in the experiments. Comprehensive experiments on 2020 Traill County data demonstrate notable performance improvements: MTS-ViViT achieves a 2.23% reduction in MSE compared to the MTS-ViT baseline, while SMTS-ViViT achieves a 10.33% improvement with SAR integration over the SMTS-ViT baseline. The proposed framework effectively enhances spectral reconstruction quality for robust agricultural monitoring.
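Non-overlapping tubelet extraction with temporal span t = 2 is a blocked partition of the image time series followed by a learned projection; the partition itself can be shown as a pure reshape (the strided 3D convolution is this partition plus a linear map):

```python
import numpy as np

T, H, W, C = 6, 8, 8, 4          # time steps, height, width, channels
t, p = 2, 4                      # temporal span and spatial patch size
video = np.arange(T * H * W * C, dtype=float).reshape(T, H, W, C)

# Partition into non-overlapping (t x p x p x C) tubelets, one row each.
tubelets = (video
            .reshape(T // t, t, H // p, p, W // p, p, C)
            .transpose(0, 2, 4, 1, 3, 5, 6)
            .reshape(-1, t * p * p * C))
```

Each row is then mapped to an embedding by a learned projection. Constraining the span to t = 2 keeps each token local in time, which is what limits the cross-day information mixing the abstract warns about.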
https://arxiv.org/abs/2512.09471
Foundation models pretrained on large datasets have demonstrated remarkable zero-shot generalization across domains. Building on the success of TabPFN for tabular data and its recent extension to time series, we investigate whether graph node classification can be effectively reformulated as a tabular learning problem. We introduce TabPFN-GN, which transforms graph data into tabular features by extracting node attributes, structural properties, positional encodings, and optionally smoothed neighborhood features. This enables TabPFN to perform direct node classification without any graph-specific training or language-model dependencies. Our experiments on 12 benchmark datasets reveal that TabPFN-GN achieves competitive performance with GNNs on homophilous graphs and consistently outperforms them on heterophilous graphs. These results demonstrate that principled feature engineering can bridge the gap between the tabular and graph domains, providing a practical alternative to task-specific GNN training and to LLM-dependent graph foundation models.
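The core move is turning each node into one tabular row that a tabular classifier can consume. A minimal sketch of such a featurization, assuming a dict-of-lists adjacency structure; the function name is illustrative, and only degree and one-hop mean aggregation stand in for the paper's fuller set of structural/positional features:

```python
def node_feature_table(adj, attrs):
    """Build one tabular row per node: raw attributes, degree, and
    mean-aggregated (smoothed) one-hop neighbor attributes.

    adj:   dict mapping node -> list of neighbor nodes
    attrs: dict mapping node -> list of float attributes
    """
    rows = {}
    for v, x in attrs.items():
        nbrs = adj.get(v, [])
        deg = len(nbrs)
        if nbrs:
            # Mean of each attribute dimension over the neighborhood.
            smoothed = [sum(attrs[u][i] for u in nbrs) / deg
                        for i in range(len(x))]
        else:
            smoothed = [0.0] * len(x)  # isolated node: no signal to smooth
        rows[v] = x + [float(deg)] + smoothed
    return rows
```

Each resulting row is a fixed-length vector, so the graph task reduces to ordinary tabular classification, e.g. fitting a TabPFN-style classifier on the labeled rows and predicting the rest.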
https://arxiv.org/abs/2512.08798
Stock market prediction is a long-standing challenge in finance, as accurate forecasts support informed investment decisions. Traditional models rely mainly on historical prices, but recent work shows that financial news can provide useful external signals. This paper investigates a multimodal approach that integrates companies' news articles with their historical stock data to improve prediction performance. We compare a Graph Neural Network (GNN) model with a baseline LSTM model. Historical data for each company is encoded using an LSTM, while news titles are embedded with a language model. These embeddings form nodes in a heterogeneous graph, and GraphSAGE is used to capture interactions between articles, companies, and industries. We evaluate two targets: a binary direction-of-change label and a significance-based label. Experiments on the US equities and Bloomberg datasets show that the GNN outperforms the LSTM baseline, achieving 53% accuracy on the first target and a 4% precision gain on the second. Results also indicate that companies with more associated news yield higher prediction accuracy. Moreover, headlines contain stronger predictive signals than full articles, suggesting that concise news summaries play an important role in short-term market reactions.
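The paper evaluates two targets derived from the price series: a binary direction-of-change label and a significance-based label. A minimal sketch of how such labels can be computed from daily closes; the 1% threshold is an assumption for illustration, not a value taken from the paper:

```python
def label_returns(prices, threshold=0.01):
    """Derive both targets from a close-price series (illustrative sketch).

    direction:   1 if the next close is higher than the current one, else 0.
    significant: 1 only when the absolute one-step return exceeds `threshold`
                 (the threshold value here is an assumption).
    """
    direction, significant = [], []
    for p0, p1 in zip(prices, prices[1:]):
        r = (p1 - p0) / p0  # simple one-step return
        direction.append(1 if r > 0 else 0)
        significant.append(1 if abs(r) > threshold else 0)
    return direction, significant
```

The significance-based target filters out small fluctuations, so a model is only rewarded for calling moves large enough to matter, which is where the reported 4% precision gain is measured.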
https://arxiv.org/abs/2512.08567