Time series forecasting (TSF) has long been a crucial task in both industry and daily life. Most classical statistical models may have certain limitations when applied to practical scenarios in fields such as energy, healthcare, traffic, meteorology, and economics, especially when high accuracy is required. With the continuous development of deep learning, numerous new models have emerged in the field of time series forecasting in recent years. However, existing surveys have not provided a unified summary of the wide range of model architectures in this field, nor have they given detailed summaries of works in feature extraction and datasets. To address this gap, in this review, we comprehensively study the previous works and summarize the general paradigms of Deep Time Series Forecasting (DTSF) in terms of model architectures. Besides, we take an innovative approach by focusing on the composition of time series and systematically explain important feature extraction methods. Additionally, we provide an overall compilation of datasets from various domains in existing works. Finally, we systematically emphasize the significant challenges faced and future research directions in this field.
时间序列预测(TSF)长期以来一直是工业和日常生活中的一项重要任务。大多数经典统计模型在诸如能源、医疗保健、交通、气象学和经济学等领域的实际应用场景中应用时,特别是在需要高精度的情况下,可能会有一定的局限性。随着深度学习的不断发展,在时间序列预测领域近年来出现了许多新的模型。然而,现有的综述并没有为这一领域广泛多样的模型架构提供一个统一的总结,也没有对特征提取方法及其在不同数据集上的工作进行详细的总结。为了弥补这一空白,本文全面回顾了之前的相关研究,并从模型架构的角度总结了深度时间序列预测(DTSF)的一般范式。此外,我们采用了一种创新的方法,专注于时间序列的组成,并系统地解释了重要的特征提取方法。另外,我们还提供了现有工作中来自各种领域的数据集的整体编译。最后,本文系统地强调了该领域面临的重大挑战和未来的研究方向。
https://arxiv.org/abs/2503.10198
Introduction: Timely care in a specialised neuro-intensive therapy unit (ITU) reduces mortality and hospital stays, with planned admissions being safer than unplanned ones. However, post-operative care decisions remain subjective. This study used artificial intelligence (AI), specifically natural language processing (NLP) to analyse electronic health records (EHRs) and predict ITU admissions for elective surgery patients. Methods: This study analysed the EHRs of elective neurosurgery patients from University College London Hospital (UCLH) using NLP. Patients were categorised into planned high dependency unit (HDU) or ITU admission; unplanned HDU or ITU admission; or ward / overnight recovery (ONR). The Medical Concept Annotation Tool (MedCAT) was used to identify SNOMED-CT concepts within the clinical notes. We then explored the utility of these identified concepts for a range of AI algorithms trained to predict ITU admission. Results: The CogStack-MedCAT NLP model, initially trained on hospital-wide EHRs, underwent two refinements: first with data from patients with Normal Pressure Hydrocephalus (NPH) and then with data from Vestibular Schwannoma (VS) patients, achieving a concept detection F1-score of 0.93. This refined model was then used to extract concepts from EHR notes of 2,268 eligible neurosurgical patients. We integrated the extracted concepts into AI models, including a decision tree model and a neural time-series model. Using the simpler decision tree model, we achieved a recall of 0.87 (CI 0.82 - 0.91) for ITU admissions, reducing the proportion of unplanned ITU cases missed by human experts from 36% to 4%. Conclusion: The NLP model, refined for accuracy, has proven its efficiency in extracting relevant concepts, providing a reliable basis for predictive AI models to use in clinically valid applications.
引言:在专门的神经重症监护治疗单元(ITU)中及时护理可以减少死亡率和住院时间,计划内入院比非计划入院更安全。然而,术后护理决策仍然具有主观性。本研究利用人工智能(AI),特别是自然语言处理(NLP)技术来分析电子健康记录(EHRs)并预测择期手术患者入住ITU的情况。 方法:该研究使用NLP技术分析了来自伦敦大学学院医院(UCLH)的择期神经外科患者的EHR数据。患者被分类为计划内的高依赖单元(HDU)或ITU入院;非计划性的HDU或ITU入院;或者普通病房/过夜恢复室(ONR)。使用医疗概念注释工具(MedCAT),在临床记录中识别SNOMED-CT概念。然后,我们探索了这些已识别的概念对于一系列用于预测ITU入院的AI算法的实用性。 结果:经过医院范围内的EHR数据初步训练后的CogStack-MedCAT NLP模型,在使用正常压力脑积水(NPH)患者的数据进行了第一次优化后,再次通过使用听神经瘤(VS)患者的额外数据进行第二次优化,最终实现了概念检测F1分数为0.93。这个经过优化的模型随后被用来从2,268名符合条件的神经外科患者的EHR笔记中提取概念。我们将这些提取的概念整合到了包括决策树模型和神经时间序列模型在内的AI模型中。使用相对简单的决策树模型,我们达到了ITU入院召回率为0.87(CI 0.82 - 0.91),将非计划性ITU病例漏诊比例从专家人工判断的36%降至4%。 结论:经过优化精度后的NLP模型证明了其在提取相关概念方面的高效性,为预测性AI模型提供了可靠的基础,适用于临床应用。
https://arxiv.org/abs/2503.09927
Time series classification (TSC) is a cornerstone of modern web applications, powering tasks such as financial data analysis, network traffic monitoring, and user behavior analysis. In recent years, deep neural networks (DNNs) have greatly enhanced the performance of TSC models in these critical domains. However, DNNs are vulnerable to backdoor attacks, where attackers can covertly implant triggers into models to induce malicious outcomes. Existing backdoor attacks targeting DNN-based TSC models remain elementary. In particular, early methods borrow trigger designs from computer vision, which are ineffective for time series data. More recent approaches utilize generative models for trigger generation, but at the cost of significant computational complexity. In this work, we analyze the limitations of existing attacks and introduce an enhanced method, FreqBack. Drawing inspiration from the fact that DNN models inherently capture frequency domain features in time series data, we identify that improper perturbations in the frequency domain are the root cause of ineffective attacks. To address this, we propose to generate triggers both effectively and efficiently, guided by frequency analysis. FreqBack exhibits substantial performance across five models and eight datasets, achieving an impressive attack success rate of over 90%, while maintaining less than a 3% drop in model accuracy on clean data.
时间序列分类(TSC)是现代网页应用的基石,支持诸如金融数据分析、网络流量监控和用户行为分析等任务。近年来,深度神经网络(DNNs)大大提升了这些关键领域中TSC模型的表现。然而,DNN容易受到后门攻击的影响,即攻击者可以秘密地在模型中植入触发器以诱导恶意结果。目前针对基于DNN的时间序列分类模型的后门攻击仍处于初级阶段。特别是早期的方法借鉴了计算机视觉领域的触发设计,但这些设计对于时间序列数据来说效果不佳。近期的一些方法利用生成模型来产生触发器,但却牺牲了大量的计算复杂度。 在这项工作中,我们分析了现有攻击方法的局限性,并引入了一种增强型方法FreqBack。受到DNN模型在处理时间序列数据时会固有地捕捉频域特征这一事实的启发,我们发现频域中的不当扰动是导致无效攻击的根本原因。为此,我们提出利用频率分析来有效地和高效地生成触发器。 实验结果表明,在五个模型和八个数据集上,FreqBack展现了显著的表现力,达到了超过90%的成功率,并且在未受污染的数据上的准确度下降不超过3%,从而实现了较高的攻击效果与模型精度之间的良好平衡。
https://arxiv.org/abs/2503.09712
Time Series Forecasting (TSF) is critical in many real-world domains like financial planning and health monitoring. Recent studies have revealed that Large Language Models (LLMs), with their powerful in-context modeling capabilities, hold significant potential for TSF. However, existing LLM-based methods usually perform suboptimally because they neglect the inherent characteristics of time series data. Unlike the textual data used in LLM pre-training, time series data is semantically sparse and comprises distinctive temporal patterns. To address this problem, we propose LLM-PS to empower the LLM for TSF by learning the fundamental Patterns and meaningful Semantics from time series data. Our LLM-PS incorporates a new multi-scale convolutional neural network adept at capturing both short-term fluctuations and long-term trends within the time series. Meanwhile, we introduce a time-to-text module for extracting valuable semantics across continuous time intervals rather than isolated time points. By integrating these patterns and semantics, LLM-PS effectively models temporal dependencies, enabling a deep comprehension of time series and delivering accurate forecasts. Extensive experimental results demonstrate that LLM-PS achieves state-of-the-art performance in both short- and long-term forecasting tasks, as well as in few- and zero-shot settings.
时间序列预测(TSF)在金融规划和健康监测等许多实际领域中至关重要。近期研究表明,大型语言模型(LLMs)凭借其强大的上下文建模能力,在TSF方面具有巨大潜力。然而,现有的基于LLM的方法通常表现不佳,因为它们忽视了时间序列数据的固有特性。与用于训练LLM的文本数据不同,时间序列数据语义稀疏,并包含独特的时序模式。为了解决这一问题,我们提出了LLM-PS,通过从时间序列数据中学习基础的模式(Patterns)和有意义的语义(Semantics)来增强LLM进行TSF的能力。 我们的LLM-PS融合了一个新颖的多尺度卷积神经网络,该网络擅长捕捉时间序列中的短期波动和长期趋势。同时,我们引入了一种时序到文本模块,用于提取连续时间段而非孤立时间点上的有价值语义信息。通过整合这些模式与语义,LLM-PS能够有效建模时间依赖关系,从而深入理解时间序列并提供准确的预测。 大量实验结果表明,LLM-PS在短期和长期预测任务中均达到了最先进的性能,并且在少量样本甚至零样本设置下也表现出色。
https://arxiv.org/abs/2503.09656
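The multi-scale idea in the LLM-PS abstract above can be illustrated with a minimal sketch. It is not the paper's network: fixed uniform (moving-average) kernels stand in for learned convolution filters, and the names `moving_average`, `multi_scale_features`, and the kernel sizes are hypothetical. A small kernel follows short-term fluctuations, while a large kernel smooths them into a long-term trend.

```python
def moving_average(series, kernel):
    """1-D convolution with a uniform kernel (assumed odd), same output length.

    Edge values are repeated as padding so every position has a full window.
    """
    if kernel % 2 == 0:
        raise ValueError("kernel size is assumed to be odd")
    half = kernel // 2
    padded = [series[0]] * half + list(series) + [series[-1]] * half
    return [sum(padded[i:i + kernel]) / kernel for i in range(len(series))]


def multi_scale_features(series, kernels=(3, 7)):
    """Return one smoothed view of the series per kernel size.

    Learned filters would replace the uniform kernels in a real model;
    this only shows how different receptive fields see the same signal.
    """
    return {k: moving_average(series, k) for k in kernels}


def fluctuation(series, trend):
    """Residual after removing a smoothed trend: the short-term component."""
    return [x - t for x, t in zip(series, trend)]


if __name__ == "__main__":
    signal = [0, 1, 2, 3, 4, 3, 2, 1, 0, 1, 2, 3]
    views = multi_scale_features(signal)
    print(views[3])
    print(fluctuation(signal, views[7]))
```

The larger the kernel, the more the output reflects the trend; subtracting that trend from the input leaves the short-term component.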
This paper presents Ev-Layout, a novel large-scale event-based multi-modal dataset designed for indoor layout estimation and tracking. Ev-Layout makes two key contributions to the community: (1) it uses a hybrid data collection platform (with a head-mounted display and VR interface) that integrates both RGB and bio-inspired event cameras to capture indoor layouts in motion; and (2) it incorporates time-series data from inertial measurement units (IMUs) and ambient lighting conditions recorded during data collection to highlight the potential impact of motion speed and lighting on layout estimation accuracy. The dataset consists of 2.5K sequences, including over 771.3K RGB images and 10 billion event data points. Of these, 39K images are annotated with indoor layouts, enabling research in both event-based and video-based indoor layout estimation. Based on the dataset, we propose an event-based layout estimation pipeline with a novel event-temporal distribution feature module to effectively aggregate the spatio-temporal information from events. Additionally, we introduce a spatio-temporal feature fusion module that can be easily integrated into a transformer module for fusion purposes. Finally, we conduct benchmarking and extensive experiments on the Ev-Layout dataset, demonstrating that our approach significantly improves the accuracy of dynamic indoor layout estimation compared to existing event-based methods.
本文介绍了Ev-Layout,这是一个新颖的大型事件驱动多模态数据集,旨在用于室内布局估计和跟踪。Ev-Layout 对社区的主要贡献包括: 1. **利用混合数据采集平台**:该平台结合了头戴式显示器和虚拟现实界面,并集成了RGB相机与生物启发式的事件相机来捕捉运动中的室内布局。 2. **融合时间序列数据**:从惯性测量单元(IMU)以及在数据采集过程中记录的环境照明条件的时间序列数据,以突出运动速度和光照对布局估计准确性的影响。 该数据集包含2500个序列,包括超过771300张RGB图像和100亿个事件数据点。其中,有39000张图像被标注了室内布局信息,这使得基于事件驱动的和视频基础的室内布局估计的研究成为可能。 基于该数据集,我们提出了一种基于事件的布局估计流水线,并引入了一个新颖的事件时间分布特征模块以有效聚合来自事件的空间-时间信息。此外,还介绍了一个可以轻松集成到变换器(transformer)模块中的时空特征融合模块,以便于进行特征融合操作。 最后,在Ev-Layout数据集上进行了基准测试和广泛实验,证明我们的方法相比现有的基于事件的方法显著提高了动态室内布局估计的准确性。
https://arxiv.org/abs/2503.08370
Egocentric video-based models capture rich semantic information and have demonstrated strong performance in human activity recognition (HAR). However, their high power consumption, privacy concerns, and dependence on lighting conditions limit their feasibility for continuous on-device recognition. In contrast, inertial measurement unit (IMU) sensors offer an energy-efficient and privacy-preserving alternative, yet they suffer from limited large-scale annotated datasets, leading to weaker generalization in downstream tasks. To bridge this gap, we propose COMODO, a cross-modal self-supervised distillation framework that transfers rich semantic knowledge from the video modality to the IMU modality without requiring labeled annotations. COMODO leverages a pretrained and frozen video encoder to construct a dynamic instance queue, aligning the feature distributions of video and IMU embeddings. By distilling knowledge from video representations, our approach enables the IMU encoder to inherit rich semantic information from video while preserving its efficiency for real-world applications. Experiments on multiple egocentric HAR datasets demonstrate that COMODO consistently improves downstream classification performance, achieving results comparable to or exceeding fully supervised fine-tuned models. Moreover, COMODO exhibits strong cross-dataset generalization. Benefiting from its simplicity, our method is also generally applicable to various video and time-series pre-trained models, offering the potential to leverage more powerful teacher and student foundation models in future research. The code is publicly available; see the arXiv page for the link.
基于第一人称视角(egocentric)视频的模型能够捕捉到丰富的语义信息,并在人类活动识别(HAR)方面表现出强大的性能。然而,它们的高能耗、隐私问题以及对光照条件的依赖限制了其在设备上持续运行的可能性。相比之下,惯性测量单元(IMU)传感器提供了一种节能且保护隐私的替代方案,但它们由于缺乏大规模标注数据集而面临着下游任务中泛化能力较弱的问题。为了解决这个问题,我们提出了COMODO,这是一个跨模态自监督蒸馏框架,它能够在没有标签注释的情况下将视频模态中的丰富语义知识转移到IMU模态。COMODO利用预先训练且冻结的视频编码器来构建动态实例队列,并对齐视频和IMU嵌入特征分布。通过从视频表示中蒸馏知识,我们的方法使IMU编码器能够继承来自视频的丰富的语义信息,同时保持其高效性以适应现实世界的应用场景。 在多个基于第一人称视角的人类活动识别数据集上的实验表明,COMODO在下游分类性能上持续改进,并且其结果可以与完全监督微调模型相媲美甚至超出。此外,COMODO还表现出强大的跨数据集泛化能力。得益于简单的设计,我们的方法适用于各种视频和时间序列预训练模型,在未来的研究中有可能利用更加强大的教师和学生基础模型。 代码已公开,链接见 arXiv 页面。
https://arxiv.org/abs/2503.07259
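The queue-based distillation described in the COMODO abstract above can be sketched roughly as follows. This is an assumption-laden illustration, not the authors' code: the loss form (cross-entropy between similarity distributions over the queue), the shared temperature, and the names `make_queue` and `distillation_loss` are all hypothetical. The frozen video encoder plays the teacher, the IMU encoder the student, and both are compared against the same queue of stored embeddings.

```python
import math
from collections import deque


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def softmax(scores, temperature):
    """Numerically stable softmax over temperature-scaled scores."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def make_queue(size):
    """FIFO instance queue: the oldest embedding is dropped when full."""
    return deque(maxlen=size)


def distillation_loss(video_emb, imu_emb, queue, temperature=0.1):
    """Cross-entropy between teacher and student similarity distributions.

    Both embeddings are compared against the same queued embeddings, so
    the student is pushed to rank the queue entries like the teacher does.
    """
    teacher = softmax([cosine(video_emb, q) for q in queue], temperature)
    student = softmax([cosine(imu_emb, q) for q in queue], temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher, student))


if __name__ == "__main__":
    queue = make_queue(3)
    for emb in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0]):
        queue.append(emb)
    print(distillation_loss([1.0, 0.2], [0.1, 1.0], queue))
```

The loss is smallest when the student's similarity distribution matches the teacher's, which is the alignment the abstract describes; no labels are involved at any point.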
Time series analysis is crucial in fields like finance, transportation, and industry. However, traditional models often focus solely on temporal features, limiting their ability to capture underlying information. This paper proposes a novel time series multitask framework, called LTM, which integrates temporal features with textual descriptions to enhance analytical and predictive capabilities. LTM combines a pre-trained time series model, a large language model (LLM), and a knowledge graph to tackle time series tasks, including forecasting, imputation, and anomaly detection. LTM achieves improved performance with only a few trainable parameters, which makes it efficient and practical. LTM encodes time series data into patches and enriches user-provided prompts using knowledge graphs to generate enhanced prompts. A novel feature fusion method embeds prompts into each patch encoding, which is processed by a frozen LLM, followed by a feature enhancement module and a time decoder module. During the fine-tuning stage, the cosine similarity between prompts and temporal patches is integrated into the loss function to boost performance. Experiments on benchmark datasets show that LTM significantly outperforms existing methods, providing a robust and versatile solution for time series tasks.
时间序列分析在金融、交通运输和工业等领域中至关重要。然而,传统的模型往往仅关注时间特征,限制了它们捕捉潜在信息的能力。本文提出了一种新的时间序列多任务框架,称为LTM,该框架将时间特征与文本描述相结合,以增强其分析和预测能力。LTM结合预训练的时间序列模型、大型语言模型(LLM)以及知识图谱来处理时间序列任务,包括预测、填充缺失值及异常检测等。 LTM仅需少量可训练参数即可实现性能的提升,并且非常高效实用。具体而言,LTM将时间序列数据编码为补丁(patch),并通过知识图谱增强用户提供的提示(prompt)生成强化后的提示。一种新颖的特征融合方法将这些提示嵌入每个补丁编码中,接着由冻结的LLM处理,再通过特征增强模块和时间解码器模块进行进一步处理。在微调阶段,提示与时间序列补丁之间的余弦相似度被整合到损失函数(loss function)中以提升性能。 在基准数据集上的实验结果表明,LTM显著优于现有方法,并且为时间序列任务提供了一种稳健而多用途的解决方案。
https://arxiv.org/abs/2503.07682
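Two steps named in the LTM abstract above, patch encoding and a cosine-similarity term in the loss, can be sketched as follows. The abstract does not state the patch length, the stride, or how the similarity enters the loss, so the additive form `task_loss + weight * penalty` and the names `patchify`, `alignment_penalty`, and `total_loss` are assumptions for illustration only.

```python
import math


def patchify(series, patch_len, stride):
    """Split a series into (possibly overlapping) fixed-length patches."""
    last_start = len(series) - patch_len
    return [series[i:i + patch_len] for i in range(0, last_start + 1, stride)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def alignment_penalty(prompt_emb, patch_embs):
    """One minus the mean prompt-to-patch cosine similarity.

    Zero when every patch embedding points the same way as the prompt.
    """
    mean_sim = sum(cosine(prompt_emb, p) for p in patch_embs) / len(patch_embs)
    return 1.0 - mean_sim


def total_loss(task_loss, prompt_emb, patch_embs, weight=0.1):
    """Hypothetical combined objective: task loss plus weighted alignment."""
    return task_loss + weight * alignment_penalty(prompt_emb, patch_embs)


if __name__ == "__main__":
    print(patchify(list(range(8)), patch_len=4, stride=2))
    print(total_loss(0.5, [1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]))
```

In a real model the patch and prompt embeddings would come from the encoders; here they are plain lists so the arithmetic is easy to follow.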
Class-incremental learning (CIL) for time series data faces critical challenges in balancing stability against catastrophic forgetting and plasticity for new knowledge acquisition, particularly under real-world constraints where historical data access is restricted. While pre-trained models (PTMs) have shown promise in CIL for vision and NLP domains, their potential in time series class-incremental learning (TSCIL) remains underexplored due to the scarcity of large-scale time series pre-trained models. Prompted by the recent emergence of large-scale pre-trained models (PTMs) for time series data, we present the first exploration of PTM-based Time Series Class-Incremental Learning (TSCIL). Our approach leverages frozen PTM backbones coupled with incrementally tuning the shared adapter, preserving generalization capabilities while mitigating feature drift through knowledge distillation. Furthermore, we introduce a Feature Drift Compensation Network (DCN), designed with a novel two-stage training strategy to precisely model feature space transformations across incremental tasks. This allows for accurate projection of old class prototypes into the new feature space. By employing DCN-corrected prototypes, we effectively enhance the unified classifier retraining, mitigating model feature drift and alleviating catastrophic forgetting. Extensive experiments on five real-world datasets demonstrate state-of-the-art performance, with our method yielding final accuracy gains of 1.4%-6.1% across all datasets compared to existing PTM-based approaches. Our work establishes a new paradigm for TSCIL, providing insights into stability-plasticity optimization for continual learning systems.
针对时间序列数据的类增量学习(Class-incremental Learning,CIL)面临关键挑战,在保持模型稳定性以防止灾难性遗忘的同时,还需具备获取新知识的能力。尤其在现实世界的约束下,如历史数据访问受限的情况下,这些挑战变得更加严峻。尽管预训练模型(Pre-trained Models, PTMs)已经在视觉和自然语言处理领域展现出其在类增量学习中的潜力,但由于缺乏大规模的时间序列预训练模型,在时间序列类增量学习(Time Series Class-Incremental Learning,TSCIL)方面的应用尚处于探索阶段。 鉴于近期出现了针对时间序列数据的大规模预训练模型,我们首次提出了基于PTM的TSCIL方法。我们的方法利用冻结的PTM骨干网络,并结合逐渐调整共享适配器来保持泛化能力的同时通过知识蒸馏减轻特征漂移。此外,我们引入了特征漂移补偿网络(Drift Compensation Network, DCN),该网络采用新颖的两阶段训练策略精确建模增量任务之间的特征空间变化。这使得旧类原型能够被准确地投影到新的特征空间中。 通过使用DCN校正后的原型进行统一分类器重新训练,我们有效减轻了模型特征漂移,并缓解了灾难性遗忘问题。在五个现实世界数据集上的广泛实验表明,与现有的基于PTM的方法相比,我们的方法实现了1.4%-6.1%的最终精度提升。这项工作为TSCIL建立了新的范式,为持续学习系统的稳定性-可塑性优化提供了见解。
https://arxiv.org/abs/2503.07153
With the recent development and advancement of Transformer and MLP architectures, significant strides have been made in time series analysis. In contrast, the performance of Convolutional Neural Networks (CNNs) in time series analysis has fallen short of expectations, diminishing their potential for future applications. Our research aims to enhance the representational capacity of CNNs in time series analysis by introducing novel perspectives and design innovations. Specifically, we introduce a novel time series reshaping technique that considers the inter-patch, intra-patch, and cross-variable dimensions. Building on this, we propose TVNet, a dynamic convolutional network that leverages a 3D perspective to perform time series analysis. TVNet retains the computational efficiency of CNNs and achieves state-of-the-art results in five key time series analysis tasks, offering a superior balance of efficiency and performance over the state-of-the-art Transformer-based and MLP-based models. Additionally, our findings suggest that TVNet exhibits enhanced transferability and robustness. Therefore, it provides a new perspective for applying CNNs in advanced time series analysis tasks.
随着Transformer和多层感知机(MLP)架构的发展,时间序列分析取得了显著进展。然而,卷积神经网络(CNNs)在时间序列分析中的表现未能达到预期水平,这限制了它们未来的应用潜力。我们的研究旨在通过引入新颖的视角和设计创新来增强卷积神经网络(CNNs)在时间序列分析中的表征能力。具体来说,我们提出了一种新的时间序列重塑技术,该技术考虑到了不同补丁之间、同一补丁内部以及跨变量维度之间的关系。 基于此,我们提出了TVNet,这是一种动态的卷积网络,采用三维视角进行时间序列分析。TVNet保持了CNNs的计算效率,并在五个关键的时间序列分析任务中实现了最先进的结果,在性能与效率上超过了现有的Transformer和MLP模型。此外,我们的研究发现表明,TVNet展现出了增强的迁移能力和鲁棒性。因此,它为将卷积神经网络应用于高级时间序列分析任务提供了新的视角。
https://arxiv.org/abs/2503.07674
Time series forecasting (TSF) plays a crucial role in many applications. Transformer-based methods are one of the mainstream techniques for TSF. Existing methods treat all token dependencies equally. However, we find that the effectiveness of token dependencies varies across different forecasting scenarios, and existing methods ignore these differences, which affects their performance. This raises two issues: (1) What are effective token dependencies? (2) How can we learn effective dependencies? From a logical perspective, we align Transformer-based TSF methods with the logical framework and define effective token dependencies as those that ensure the tokens as atomic formulas (Issue 1). We then align the learning process of Transformer methods with the process of obtaining atomic formulas in logic, which inspires us to design a method for learning these effective dependencies (Issue 2). Specifically, we propose Attention Logic Regularization (Attn-L-Reg), a plug-and-play method that guides the model to use fewer but more effective dependencies by making the attention map sparse, thereby ensuring the tokens as atomic formulas and improving prediction performance. Extensive experiments and theoretical analysis confirm the effectiveness of Attn-L-Reg.
时间序列预测(TSF)在许多应用中扮演着关键角色。基于Transformer的方法是当前主流的TSF技术之一。现有方法通常认为所有令牌依赖关系都是同等重要的,然而我们发现不同的预测场景下,这些令牌之间的依赖关系的有效性会有所不同,而现有的方法忽视了这一点,这影响了它们的整体性能。这就引发两个问题:(1)什么样的令牌依赖关系是有效的?(2)如何学习这些有效的依赖关系? 从逻辑学的角度出发,我们将基于Transformer的时间序列预测方法与逻辑框架进行了对齐,并定义了确保令牌作为原子公式的有效令牌依赖关系为第一个问题的答案。随后,我们借鉴获取逻辑中原子公式的流程来重新设计Transformer模型的学习过程,从而激发了构建有效依赖关系的方法(回答第二个问题)。具体来说,我们提出了Attention Logic Regularization(Attn-L-Reg),一种即插即用的方法,通过使注意力图变稀疏来指导模型使用更少但更有效的依赖关系,进而确保令牌作为原子公式,并提升预测性能。 广泛的实验和理论分析验证了Attn-L-Reg的有效性。
https://arxiv.org/abs/2503.06867
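The Attn-L-Reg abstract above says the method makes the attention map sparse but does not give the penalty. The sketch below uses row entropy as an assumed sparsity measure; the names `attention_entropy` and `regularized_loss` are hypothetical. A plain L1 norm would not work on softmax outputs: each row is non-negative and sums to 1, so its L1 norm is always 1 regardless of how concentrated the attention is.

```python
import math


def attention_entropy(attn):
    """Mean Shannon entropy of the rows of an attention map.

    Each row is assumed to be a probability distribution (softmax output).
    Low entropy means a token attends to few other tokens, i.e. sparse.
    """
    total = 0.0
    for row in attn:
        total -= sum(p * math.log(p) for p in row if p > 0)
    return total / len(attn)


def regularized_loss(forecast_loss, attn, weight=0.01):
    """Forecasting loss plus a weighted sparsity penalty (assumed form)."""
    return forecast_loss + weight * attention_entropy(attn)


if __name__ == "__main__":
    dense = [[0.25, 0.25, 0.25, 0.25]]
    sparse = [[0.97, 0.01, 0.01, 0.01]]
    print(regularized_loss(1.0, dense), regularized_loss(1.0, sparse))
```

With the same forecasting loss, the sparse map is penalized less than the uniform one, which nudges training toward fewer, stronger dependencies.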
Traditional artificial neural networks take inspiration from biological networks, using layers of neuron-like nodes to pass information for processing. More realistic models include spiking in the neural network, capturing the electrical characteristics more closely. However, a large proportion of brain cells are of the glial cell type, in particular astrocytes, which have been suggested to play a role in performing computations. Here, we introduce a modified spiking neural network model with added astrocyte-like units and assess their impact on learning. We implement the network as a liquid state machine and task it with performing chaotic time-series prediction. We varied the number and ratio of neuron-like and astrocyte-like units in the network to examine the latter units' effect on learning. We show that the combination of neurons and astrocytes, as opposed to neuron-only and astrocyte-only networks, is critical for driving learning. Interestingly, we found that the highest learning rate was achieved when the ratio between astrocyte-like and neuron-like units was roughly 2 to 1, mirroring some estimates of the ratio of biological astrocytes to neurons. Our results demonstrate that incorporating astrocyte-like units, which represent information across longer timescales, can alter the learning rates of neural networks, and that the proportion of astrocytes to neurons should be tuned appropriately to a given task.
传统的人工神经网络从生物网络中汲取灵感,使用类似神经元的节点层层传递信息进行处理。更逼真的模型则会在神经网络中加入脉冲机制,以更好地捕捉其电学特性。然而,大脑细胞中很大一部分是星形胶质细胞等胶质细胞类型,这些细胞被认为在执行计算过程中扮演重要角色。在此,我们引入了一种改进的脉冲神经网络模型,在其中加入了类似星形胶质细胞的单元,并评估了它们对学习的影响。我们将该网络实现为液态机器(Liquid State Machine),并将其任务设定为进行混沌时间序列预测。通过改变网络中类似神经元和类似星形胶质细胞的比例,我们研究了后者单位在学习过程中所起的作用。 我们的结果显示,将神经元与星形胶质细胞结合使用,相较于只含神经元或仅含星形胶质细胞的网络,对于促进学习至关重要。有趣的是,我们发现当类似星形胶质细胞和类似神经元的比例大约为2:1时,学习速率最高,这与生物体中星形胶质细胞到神经元的实际比例估算相符。 我们的结果表明,通过在神经网络中引入可以跨较长时间尺度传递信息的类似星形胶质细胞单元,可以改变其学习速率,并且应当根据特定任务调整星形胶质细胞和神经元的比例。
https://arxiv.org/abs/2503.06798
As the quantities of data recorded by embedded edge sensors grow, so too does the need for intelligent local processing. Such data often comes in the form of time-series signals, based on which real-time predictions can be made locally using an AI model. However, a hardware-software approach capable of making low-latency predictions with low power consumption is required. In this paper, we present a hardware implementation of an event-graph neural network for time-series classification. We leverage an artificial cochlea model to convert the input time-series signals into a sparse event-data format that allows the event-graph to drastically reduce the number of calculations relative to other AI methods. We implemented the design on a SoC FPGA and applied it to the real-time processing of the Spiking Heidelberg Digits (SHD) dataset to benchmark our approach against competitive solutions. Our method achieves a floating-point accuracy of 92.7% on the SHD dataset for the base model, which is only 2.4% and 2% less than the state-of-the-art models with over 10% and 67% fewer model parameters, respectively. It also outperforms FPGA-based spiking neural network implementations by 19.3% and 4.5%, achieving 92.3% accuracy for the quantised model while using fewer computational resources and reducing latency.
随着嵌入式边缘传感器记录的数据量的增加,对智能本地处理的需求也在增长。这些数据通常以时间序列信号的形式出现,可以使用AI模型在本地进行实时预测。然而,需要一种既能实现低延迟预测又能减少功耗的软硬件方法。本文中,我们提出了一种针对时间序列分类的事件图神经网络的硬件实现方案。我们利用人工耳蜗模型将输入的时间序列信号转换为稀疏事件数据格式,使得事件图相比其他AI方法可以大幅减少计算量。我们在SoC FPGA上实现了该设计,并将其应用于Spiking Heidelberg Digits (SHD) 数据集的实时处理以与竞争解决方案进行基准测试。我们的方法在基本模型中达到了92.7% 的浮点精度,仅比最先进的模型低2.4%,而参数数量减少了10%以上;对于另一个先进的模型,尽管参数数量减少67%,但精度仅差2%。此外,在量化模型中实现了92.3%的准确性的同时,我们的方法还通过使用更少的计算资源和降低延迟超过了基于FPGA的脉冲神经网络实现方案19.3% 和4.5%。
https://arxiv.org/abs/2503.06629
Patient-ventilator asynchrony (PVA) is a common and critical issue during mechanical ventilation, affecting up to 85% of patients. PVA can result in clinical complications such as discomfort, sleep disruption, and potentially more severe conditions like ventilator-induced lung injury and diaphragm dysfunction. Traditional PVA management, which relies on manual adjustments by healthcare providers, is often inadequate due to delays and errors. While various computational methods, including rule-based, statistical, and deep learning approaches, have been developed to detect PVA events, they face challenges related to dataset imbalances and lack of interpretability. In this work, we propose SHIP, a shapelet-based approach for PVA detection that uses shapelets (discriminative subsequences in time-series data) to enhance detection accuracy and interpretability. Our method addresses dataset imbalances through shapelet-based data augmentation and constructs a shapelet pool to transform the dataset for more effective classification. The combined shapelet and statistical features are then used in a classifier to identify PVA events. Experimental results on medical datasets show that SHIP significantly improves PVA detection while providing interpretable insights into model decisions.
患者-呼吸机不同步(PVA)是在机械通气过程中常见的且重要的问题,影响高达85%的患者。PVA可能导致临床并发症,如不适、睡眠中断以及更严重的状况,例如呼吸机引起的肺损伤和膈肌功能障碍。传统的PVA管理依赖医护人员手动调整,但由于延迟和错误,这种方法往往效果不足。尽管已经开发了多种计算方法(包括基于规则的方法、统计方法和深度学习方法)来检测PVA事件,但这些方法仍然面临着数据集不平衡和缺乏可解释性等挑战。 在这项工作中,我们提出了一种基于形变序列的SHIP方法用于PVA检测,该方法利用时间序列数据中的鉴别子序列(即形变序列),以提高检测准确性和可解释性。我们的方法通过基于形变序列的数据增强来解决数据集不平衡问题,并构建一个形变池将原始数据集转换为更适合分类的形式。然后,在分类器中使用结合了形变和统计特征的方法来识别PVA事件。 在医疗数据集上的实验结果显示,SHIP显著提高了对PVA的检测准确性,并提供了有关模型决策的可解释性见解。
https://arxiv.org/abs/2503.06571
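The shapelet-pool transform mentioned in the SHIP abstract above follows a standard pattern in shapelet-based classification, sketched below. This is the generic technique, not the SHIP implementation; the function names and the toy waveform are hypothetical. Each series is represented by its minimum Euclidean distance to each shapelet over all sliding windows, and the resulting vector is what a classifier would consume.

```python
import math


def shapelet_distance(series, shapelet):
    """Minimum Euclidean distance between a shapelet and any window of the series."""
    m = len(shapelet)
    return min(
        math.sqrt(sum((series[i + j] - shapelet[j]) ** 2 for j in range(m)))
        for i in range(len(series) - m + 1)
    )


def shapelet_transform(series, pool):
    """Map a series to one distance feature per shapelet in the pool."""
    return [shapelet_distance(series, s) for s in pool]


if __name__ == "__main__":
    waveform = [0, 0, 1, 2, 1, 0]      # toy signal, not real ventilator data
    pool = [[1, 2, 1], [5, 5]]         # hypothetical shapelet pool
    print(shapelet_transform(waveform, pool))
```

A distance near zero means the series contains a segment closely matching that shapelet, which is what makes the features interpretable: each one points to a concrete waveform pattern.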
Recent advances in clinical AI have enabled remarkable progress across many clinical domains. However, existing benchmarks and models are primarily limited to a small set of modalities and tasks, which hinders the development of large-scale multimodal methods that can make holistic assessments of patient health and well-being. To bridge this gap, we introduce Clinical Large-Scale Integrative Multimodal Benchmark (CLIMB), a comprehensive clinical benchmark unifying diverse clinical data across imaging, language, temporal, and graph modalities. CLIMB comprises 4.51 million patient samples totaling 19.01 terabytes distributed across 2D imaging, 3D video, time series, graphs, and multimodal data. Through extensive empirical evaluation, we demonstrate that multitask pretraining significantly improves performance on understudied domains, achieving up to 29% improvement in ultrasound and 23% in ECG analysis over single-task learning. Pretraining on CLIMB also effectively improves models' generalization capability to new tasks, and strong unimodal encoder performance translates well to multimodal performance when paired with task-appropriate fusion strategies. Our findings provide a foundation for new architecture designs and pretraining strategies to advance clinical AI research. The code is publicly available; see the arXiv page for the link.
最近在临床人工智能方面的进展已经在许多临床领域取得了显著的进步。然而,现有的基准和模型主要局限于有限的模态和任务集上,这阻碍了大规模多模态方法的发展,这些方法能够全面评估患者的健康状况和福祉。为了弥合这一差距,我们引入了一个名为Clinical Large-Scale Integrative Multimodal Benchmark (CLIMB) 的综合性临床基准测试平台,该平台统一了跨成像、语言、时间序列以及图谱等多样化的临床数据。CLIMB 包含总计19.01太字节的451万份患者样本,并分布在二维影像、三维视频、时间序列、图形和多模态数据中。 通过广泛的实证评估,我们证明了多任务预训练能显著提升在较少研究的领域上的性能,与单任务学习相比,在超声波分析上最多可提高29%,心电图(ECG) 分析方面则可达23%。此外,CLIMB 上的预训练还可以有效地增强模型对新任务的泛化能力,并且强大的单一模态编码器表现可以很好地转化为多模态表现,尤其是在与适当的融合策略结合使用时。我们的研究结果为新的架构设计和预训练策略奠定了基础,以推动临床人工智能的研究进展。 代码已公开,链接见 arXiv 页面。
https://arxiv.org/abs/2503.07667
This paper gives an overview of how to develop a dense and deep neural network for time series prediction. First, the history and cornerstones of Artificial Intelligence and Machine Learning are presented. After a short introduction to the theory of Artificial Intelligence and Machine Learning, the paper goes deeper into the techniques for conducting time series prediction with different neural network models. The project uses the Jupyter development environment for Python, together with the TensorFlow package and the Keras deep-learning library. The system setup and project framework are explained in more detail before the time series prediction is discussed. The main part shows an applied example of time series prediction with weather data, for which a deep recurrent neural network with Long Short-Term Memory (LSTM) cells is used. The results and evaluation show that weather prediction with deep neural networks can be successful over a short time period. However, time series prediction has some drawbacks and limitations, which are discussed towards the end of the paper.
本文介绍了如何开发密集型和深层神经网络以进行时间序列预测的方法。首先,将回顾人工智能(AI)和机器学习的历史及其基石。在简要介绍人工智能和机器学习的理论之后,文章将进一步深入探讨使用不同模型的神经网络进行时间序列预测的技术。该项目使用了通过TensorFlow包和深度学习应用Keras扩展的Python开发环境Jupyter。系统设置和项目框架会在讨论时间序列预测之前详细解释。主要内容展示了一个利用天气数据的时间序列预测的实际案例,在这个工作中,使用具有长短期记忆(LSTM)单元的深层循环神经网络进行时间序列预测。结果与工作评价表明,对于较短的时间段来说,使用深度神经网络进行天气预报是可行的。然而,时间序列预测也有一些局限性和缺点,这些将在论文的最后部分加以讨论。
https://arxiv.org/abs/2503.06278
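Before a recurrent model such as the LSTM described in the weather-prediction abstract above can be trained, the series is usually cut into supervised (input window, target) pairs. The sketch below shows that preprocessing step only, in plain Python; the name `make_windows`, the lookback length, and the toy temperatures are hypothetical and are not taken from the paper.

```python
def make_windows(series, lookback, horizon=1):
    """Turn a 1-D series into (input window, target) training pairs.

    Each input holds `lookback` consecutive observations; the target is
    the value `horizon` steps after the end of that window.
    """
    inputs, targets = [], []
    for start in range(len(series) - lookback - horizon + 1):
        inputs.append(series[start:start + lookback])
        targets.append(series[start + lookback + horizon - 1])
    return inputs, targets


if __name__ == "__main__":
    temperatures = [10, 11, 12, 13, 14]   # toy values, not real weather data
    X, y = make_windows(temperatures, lookback=3)
    for window, target in zip(X, y):
        print(window, "->", target)
```

For a Keras LSTM layer, the windows would then be converted to an array of shape (samples, timesteps, features), here with `lookback` timesteps and one feature. A longer `horizon` makes the task harder, which is consistent with the abstract's observation that forecasts work best over short periods.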
While recent multimodal large language models (MLLMs) have advanced automated ECG interpretation, they still face two key limitations: (1) insufficient multimodal synergy between time series signals and visual ECG representations, and (2) limited explainability in linking diagnoses to granular waveform evidence. We introduce GEM, the first MLLM unifying ECG time series, 12-lead ECG images and text for grounded and clinician-aligned ECG interpretation. GEM enables feature-grounded analysis, evidence-driven reasoning, and a clinician-like diagnostic process through three core innovations: a dual-encoder framework extracting complementary time series and image features, cross-modal alignment for effective multimodal understanding, and knowledge-guided instruction generation for generating high-granularity grounding data (ECG-Grounding) linking diagnoses to measurable parameters (e.g., QRS/PR intervals). Additionally, we propose the Grounded ECG Understanding task, a clinically motivated benchmark designed to comprehensively assess the MLLM's capability in grounded ECG understanding. Experimental results on both existing and our proposed benchmarks show GEM significantly improves predictive performance (CSN: +7.4%), explainability (+22.7%), and grounding (+24.8%), making it more suitable for real-world clinical applications. The GitHub repository is linked from the arXiv page.
虽然最近的多模态大型语言模型(MLLMs)在自动心电图解释方面取得了进展,但它们仍然面临两个关键限制:(1) 时间序列信号和视觉心电图表示之间的多模态协同作用不足;(2) 将诊断与具体波形证据相联系时可解释性不足。我们引入了GEM,这是第一个统一心电图时间序列、12导联心电图图像和文本的MLLM模型,以实现基于事实且符合临床医生标准的心电图解释。通过三大创新:双编码器框架提取互补的时间序列和图像特征;跨模态对齐实现有效的多模态理解;以及知识引导指令生成用于产生高粒度定位数据(ECG-Grounding),将诊断与可测量参数(如QRS/PR间隔)联系起来,GEM能够进行基于特征的分析、证据驱动的推理,并支持类似临床医生的诊断过程。此外,我们提出了“基于事实的心电图理解任务”,这是一个由临床需求推动的基准测试,旨在全面评估MLLM在基于事实的心电图理解方面的能力。我们在现有和新提出的基准上进行了实验,结果表明GEM显著提高了预测性能(CSN提高7.4%)、可解释性(提高22.7%)以及定位(提高24.8%),使其更适合实际临床应用。 GitHub 仓库链接见 arXiv 页面。
https://arxiv.org/abs/2503.06073
Financial decision-making often relies on in-depth data analysis across various data sources, such as financial tables, news articles, and stock prices. In this work, we introduce FinTMMBench, the first comprehensive benchmark for evaluating temporal-aware multi-modal Retrieval-Augmented Generation (RAG) systems in finance. Built from heterogeneous data of NASDAQ 100 companies, FinTMMBench offers three significant advantages. 1) Multi-modal Corpus: It encompasses a hybrid of financial tables, news articles, daily stock prices, and visual technical charts as the corpus. 2) Temporal-aware Questions: Each question requires the retrieval and interpretation of its relevant data over a specific time period, including daily, weekly, monthly, quarterly, and annual periods. 3) Diverse Financial Analysis Tasks: The questions involve 10 different tasks, such as information extraction, trend analysis, sentiment analysis, and event detection. We further propose a novel TMMHybridRAG method, which first leverages LLMs to convert data from other modalities (e.g., tabular, visual and time-series data) into textual format and then incorporates temporal information in each node when constructing graphs and dense indexes. Its effectiveness has been validated in extensive experiments, but notable gaps remain, highlighting the challenges presented by our FinTMMBench.
财务决策通常依赖于跨多种数据源的深入数据分析,包括财务表格、新闻文章、股票价格等。在此研究中,我们介绍了FinTMMBench,这是首个用于评估具有时间感知能力的多模态检索增强生成(RAG)系统在金融领域的全面基准测试工具。该基准基于纳斯达克100指数公司的异构数据构建,并提供了三个显著优势: 1) 多模态语料库:它涵盖了财务表格、新闻文章、每日股票价格以及技术图表的混合体作为其语料库。 2) 具有时间感知的问题:每个问题都要求检索并解读特定时间段内的相关数据,包括日度、周度、月度、季度和年度的数据。 3) 多样化的金融分析任务:这些问题涵盖了10种不同的任务类型,包括信息提取、趋势分析、情感分析以及事件检测等。 我们进一步提出了一种新颖的TMMHybridRAG方法。首先利用大语言模型(LLM)将其他模态数据(例如表格形式的数据、视觉数据和时间序列数据)转换为文本格式,然后在构建图谱和密集索引时,在每个节点中整合时间信息。该方法的有效性已在广泛的实验中得到验证,但仍有明显不足之处,这突显了我们FinTMMBench所提出的挑战。
https://arxiv.org/abs/2503.05185
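The idea of attaching temporal information to each retrievable node can be illustrated with a minimal sketch. This is not the TMMHybridRAG implementation: the abstract describes graphs and dense indexes, whereas the sketch below swaps in plain keyword overlap for ranking. The node layout (`date`, `text`) and the function name `temporal_retrieve` are hypothetical.

```python
from datetime import date

def temporal_retrieve(nodes, keywords, start, end, k=3):
    """Keep nodes dated within [start, end], then rank them by keyword overlap.

    nodes: list of dicts with a 'date' (datetime.date) and a 'text' field.
    This layout is an assumption made for illustration only.
    """
    wanted = {w.lower() for w in keywords}
    hits = []
    for node in nodes:
        # Temporal filter: discard anything outside the question's period.
        if not (start <= node["date"] <= end):
            continue
        score = len(set(node["text"].lower().split()) & wanted)
        if score > 0:
            hits.append((score, node))
    # Highest overlap first; the sort is stable, so ties keep input order.
    hits.sort(key=lambda h: h[0], reverse=True)
    return [node for _, node in hits[:k]]
```

Filtering on the period before ranking means a question about one month cannot be answered from a textually similar report dated in another quarter, which seems to be the failure mode that temporal-aware questions are meant to expose.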
Sea surface temperature (SST) is a fundamental physical parameter characterising the thermal state of the sea surface. The Thermal Infrared Sensor (TIRS) onboard Landsat-8, with its 100-meter spatial resolution, offers a unique opportunity to uncover fine-scale coastal SST patterns that would otherwise be overlooked by coarser-resolution thermal sensors. In this study, we first develop an operational approach for SST retrieval from the TIRS sensor, and subsequently propose a novel algorithm for establishing daily SST climatology which serves as the baseline to detect anomalous SST events. We applied the proposed methods to temperate coastal waters in South Australia for the ten-year period from 2014 to 2023. For ground validation purposes, a buoy was deployed off the coast of Port Lincoln, South Australia, to record in-situ time-series SST. The spatiotemporal patterns of SST in the study area were analysed based on the ten years of satellite-derived SST imagery. The daily baseline climatology of SST with 100 m resolution was constructed, which allowed for the detection and analysis of anomalous SST events during the study period of 2014-2023. Our results suggest the following: (1) the satellite-derived SST data, generated with the proposed algorithm, aligned well with the in-situ measured SST values; (2) the semi-enclosed, shallow regions of Upper Spencer Gulf and Upper St Vincent Gulf showed higher temperatures during summer and cooler temperatures during winter than waters closer to the open ocean, resulting in a higher seasonal variation in SST; (3) the near-shore shallow areas in Spencer Gulf and St Vincent Gulf, and regions surrounding Kangaroo Island, were identified to have a higher probability of SST anomalies compared to the rest of the study area; and (4) anomalous SST events were more likely to happen during the warm months than the cool months.
海表温度(SST)是描述海洋表面热状态的基本物理参数。Landsat-8卫星上的热红外传感器(TIRS),以其100米的空间分辨率,为揭示通常被较低分辨率热传感器忽视的近岸精细尺度SST模式提供了独特的机会。在这项研究中,我们首先开发了一种从TIRS传感器获取海表温度的操作方法,并随后提出了一种新颖的算法来建立每日的SST气候学基线,以此作为检测异常SST事件的基础。我们将这些方法应用于澳大利亚南部沿海地区2014年至2023年的十年期间的研究。 为了地面验证目的,在南澳大利亚林肯港(Port Lincoln)海岸外部署了一个浮标,记录了现场海表温度时间序列。根据卫星获取的十年SST影像,我们分析了研究区域内的时空变化模式。构建了每日基线气候学(分辨率为100米),从而允许在2014年至2023年期间识别和分析异常SST事件。 我们的结果表明: (1) 用所提出的算法生成的卫星获取的海表温度数据,与实地测量的数据高度一致; (2) 半封闭、浅水区域(如斯宾塞海湾上部和圣文森特湾上部),夏季时比靠近开放海洋的地方更暖,冬季则相反,因此该区域表现出更高的季节性SST变化; (3) 斯宾塞海湾和圣文森特海湾的近岸浅区以及袋鼠岛周围地区,在研究区域内显示出发生海表温度异常事件的可能性更高; (4) 异常SST事件更可能在温暖月份而不是寒冷月份出现。
https://arxiv.org/abs/2503.05843
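The role of a daily climatology as a baseline for anomaly detection can be sketched for a single pixel as follows. The abstract does not give the paper's algorithm, so this is a generic stand-in: observations are pooled over a circular window of days around each day of year, and an observation is flagged when it departs from that baseline by more than a fixed threshold. The window length, the threshold, and the function names are all assumptions.

```python
from statistics import mean

def daily_climatology(records, half_window=5):
    """Build a day-of-year baseline from (day_of_year, sst) pairs.

    Observations within +/- half_window days are pooled (wrapping around
    the year end), since cloud-free scenes are sparse in time.
    Days with no observation in their window are left out of the result.
    """
    by_day = {}
    for doy, sst in records:
        by_day.setdefault(doy, []).append(sst)

    clim = {}
    for day in range(1, 366):
        pooled = []
        for offset in range(-half_window, half_window + 1):
            neighbour = (day - 1 + offset) % 365 + 1  # wrap to 1..365
            pooled.extend(by_day.get(neighbour, []))
        if pooled:
            clim[day] = mean(pooled)
    return clim

def flag_anomalies(records, clim, threshold=2.0):
    """Return the records deviating from the baseline by more than threshold (deg C)."""
    return [(doy, sst) for doy, sst in records
            if doy in clim and abs(sst - clim[doy]) > threshold]
```

A fixed threshold is the simplest choice; a percentile-based threshold per day of year would be a natural alternative, but nothing in the abstract indicates which the authors use.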
Spiking Neural Networks (SNNs) offer a promising, biologically inspired approach for processing spatiotemporal data, particularly for time series forecasting. However, conventional neuron models like the Leaky Integrate-and-Fire (LIF) struggle to capture long-term dependencies and effectively process multi-scale temporal dynamics. To overcome these limitations, we introduce the Temporal Segment Leaky Integrate-and-Fire (TS-LIF) model, featuring a novel dual-compartment architecture. The dendritic and somatic compartments specialize in capturing distinct frequency components, providing functional heterogeneity that enhances the neuron's ability to process both low- and high-frequency information. Furthermore, the newly introduced direct somatic current injection reduces information loss during intra-neuronal transmission, while dendritic spike generation improves multi-scale information extraction. We provide a theoretical stability analysis of the TS-LIF model and explain how each compartment contributes to distinct frequency response characteristics. Experimental results show that TS-LIF outperforms traditional SNNs in time series forecasting, demonstrating better accuracy and robustness, even with missing data. TS-LIF advances the application of SNNs in time-series forecasting, providing a biologically inspired approach that captures complex temporal dynamics and offers potential for practical implementation in diverse forecasting scenarios. The source code is available at this https URL.
脉冲神经网络(SNN)提供了一种有前景的、受生物启发的方法,用于处理时空数据,特别是在时间序列预测方面。然而,传统的神经元模型如漏积分发放(LIF)难以捕捉长期依赖关系,并且无法有效处理多尺度的时间动态变化。为克服这些局限性,我们引入了时序段漏积分发放(TS-LIF)模型,该模型具有新颖的双隔室架构。树突和胞体隔室分别专注于捕获不同的频率成分,提供了功能异质性,从而增强了神经元处理低频和高频信息的能力。此外,新引入的直接胞体电流注入减少了细胞内传递过程中的信息损失,而树突发放则提高了多尺度信息提取能力。我们对TS-LIF模型进行了理论稳定性分析,并解释了每个隔室如何贡献不同的频率响应特性。实验结果表明,TS-LIF在时间序列预测中优于传统SNN,即使在数据缺失的情况下也表现出更高的准确性和鲁棒性。TS-LIF推进了SNN在时间序列预测中的应用,提供了一种能够捕捉复杂时间动态的生物启发式方法,并为各种预测场景的实际实施提供了潜在可能。源代码可在该网址获取:this https URL。
https://arxiv.org/abs/2503.05108
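To make the dual-compartment idea concrete, here is a toy discrete-time neuron with a slow dendritic compartment and a fast somatic compartment. It is not the TS-LIF update rule from the paper, which the abstract does not state: the time constants, the coupling and injection weights, and the hard reset are hypothetical, and dendritic spike generation is left out.

```python
def two_compartment_lif(inputs, tau_d=20.0, tau_s=2.0,
                        coupling=0.5, inject=0.5, v_th=1.0):
    """Simulate a toy two-compartment leaky integrate-and-fire neuron.

    inputs: sequence of input currents, one per time step.
    Returns a list of 0/1 spikes of the same length.
    All parameter values are illustrative assumptions.
    """
    v_d = 0.0  # dendritic potential: slow leak, tracks low frequencies
    v_s = 0.0  # somatic potential: fast leak, tracks high frequencies
    spikes = []
    for x in inputs:
        # Dendrite integrates the input with a long time constant.
        v_d += (-v_d + x) / tau_d
        # Soma receives the dendritic potential plus a direct copy of the input.
        v_s += (-v_s + coupling * v_d + inject * x) / tau_s
        if v_s >= v_th:
            spikes.append(1)
            v_s = 0.0  # hard reset after a spike
        else:
            spikes.append(0)
    return spikes
```

The direct `inject * x` term lets fast input changes reach the soma without first passing through the slow dendritic filter, which is one plausible reading of how direct somatic current injection could reduce information loss.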
Recently, Large Language Models (LLMs) and Foundation Models (FMs) have become prevalent for time series forecasting tasks. However, fine-tuning LLMs for forecasting enables adaptation to specific domains but may not generalize well across diverse, unseen datasets. Meanwhile, existing time series foundation models (TSFMs) lack inherent mechanisms for domain adaptation and suffer from limited interpretability, making them suboptimal for zero-shot forecasting. To this end, we present TS-RAG, a retrieval-augmented generation based time series forecasting framework that enhances the generalization capability and interpretability of TSFMs. Specifically, TS-RAG leverages pre-trained time series encoders to retrieve semantically relevant time series segments from a dedicated knowledge database, incorporating contextual patterns for the given time series query. Next, we develop a learnable Mixture-of-Experts (MoE)-based augmentation module, which dynamically fuses retrieved time series patterns with the TSFM's representation of the input query, improving forecasting accuracy without requiring task-specific fine-tuning. Thorough empirical studies on seven public benchmark datasets demonstrate that TS-RAG achieves state-of-the-art zero-shot forecasting performance, outperforming TSFMs by up to 6.51% across diverse domains and showcasing desired interpretability.
最近,大型语言模型(LLM)和基础模型(FM)在时间序列预测任务中变得越来越流行。然而,针对特定领域的精细调整虽然能够使大语言模型适应具体场景,但可能无法很好地泛化到多样且未见过的数据集上。同时,现有的时间序列基础模型(TSFM)缺乏内在的领域自适应机制,并且具有有限的可解释性,这使得它们在零样本预测任务中表现不佳。为此,我们提出了 TS-RAG,这是一种基于检索增强生成的时间序列预测框架,旨在提升 TSFM 的泛化能力和可解释性。 具体而言,TS-RAG 利用预训练的时间序列编码器从专门的知识库中检索与给定时间序列查询语义相关的片段,并结合上下文模式。接下来,我们开发了一个可学习的专家混合(MoE)增强模块,该模块能够动态地将检索到的时间序列模式与 TSFM 对输入查询的表示相结合,从而提高预测准确性,而无需针对特定任务进行微调。 在七个公共基准数据集上的全面实证研究表明,TS-RAG 达到了最先进的零样本预测性能,在不同领域中优于现有的时间序列基础模型(TSFMs)高达 6.51%,并展示了期望的可解释性。
https://arxiv.org/abs/2503.07649
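The retrieve-then-fuse pattern described for TS-RAG can be sketched in a few lines. This is an illustration under strong assumptions, not the paper's architecture: embeddings are plain lists of floats, retrieval uses cosine similarity, and the learnable MoE augmentation module is replaced by a fixed softmax over similarity scores. The names `retrieve`, `fuse`, and `base_score` are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_emb, knowledge_base, k=2):
    """Return the k most similar entries as (score, future_values).

    knowledge_base: list of (embedding, future_values) pairs, where
    future_values is the continuation observed after the stored segment.
    """
    scored = [(cosine(query_emb, emb), future) for emb, future in knowledge_base]
    scored.sort(key=lambda t: t[0], reverse=True)
    return scored[:k]

def fuse(base_forecast, retrieved, base_score=1.0):
    """Blend the base forecast with retrieved continuations.

    Weights come from a softmax over scores; this fixed gate stands in
    for a learned mixture and is an assumption of this sketch.
    """
    scores = [base_score] + [s for s, _ in retrieved]
    candidates = [base_forecast] + [f for _, f in retrieved]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]  # shifted for numerical stability
    total = sum(exps)
    weights = [e / total for e in exps]
    return [sum(w * c[t] for w, c in zip(weights, candidates))
            for t in range(len(base_forecast))]
```

Because the weights are explicit, one can inspect which retrieved segments influenced a given forecast, which gives a rough sense of the interpretability benefit the abstract claims for retrieval-based augmentation.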