ReCycle: Fast and Efficient Long Time Series Forecasting with Residual Cyclic Transformers

2024-05-06 12:48:34
Arvid Weyrauch, Thomas Steens, Oskar Taubert, Benedikt Hanke, Aslan Eqbal, Ewa Götz, Achim Streit, Markus Götz, Charlotte Debus

Abstract

Transformers have recently gained prominence in long time series forecasting by elevating accuracies in a variety of use cases. Regrettably, in the race for better predictive performance the overhead of model architectures has grown onerous, leading to models with computational demand infeasible for most practical applications. To bridge the gap between high method complexity and realistic computational resources, we introduce the Residual Cyclic Transformer, ReCycle. ReCycle utilizes primary cycle compression to address the computational complexity of the attention mechanism in long time series. By learning residuals from refined smoothing average techniques, ReCycle surpasses state-of-the-art accuracy in a variety of application use cases. The reliable and explainable fallback behavior ensured by simple, yet robust, smoothing average techniques additionally lowers the barrier for user acceptance. At the same time, our approach reduces the run time and energy consumption by more than an order of magnitude, making both training and inference feasible on low-performance, low-power and edge computing devices. Code is available at this https URL
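The two ideas named in the abstract, cycle compression and learning residuals from a smoothed average, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the fixed cycle length, the choice of exponential smoothing, and the `alpha` parameter are all assumptions made for illustration. The sketch only shows why grouping a series into whole cycles shortens the sequence a model has to attend over (by a factor of the cycle length), and how a smoothed profile of past cycles gives a simple baseline whose residuals a model could learn.

```python
# Illustrative sketch only; names and parameters are hypothetical,
# not taken from the ReCycle code base.

def compress_cycles(series, cycle_len):
    """Split a flat series into full cycles; a trailing partial cycle is dropped."""
    n_cycles = len(series) // cycle_len
    return [series[i * cycle_len:(i + 1) * cycle_len] for i in range(n_cycles)]


def smoothed_profile(cycles, alpha=0.5):
    """Exponentially smoothed average over past cycles, position by position.

    Assumption: exponential smoothing stands in for the paper's
    'refined smoothing average techniques'.
    """
    profile = [float(v) for v in cycles[0]]
    for cycle in cycles[1:]:
        profile = [alpha * c + (1 - alpha) * p for c, p in zip(cycle, profile)]
    return profile


def residuals(target_cycle, profile):
    """Difference between the observed cycle and the smoothed baseline."""
    return [t - p for t, p in zip(target_cycle, profile)]


if __name__ == "__main__":
    # Three cycles of length 4; the last one is shifted up by 1.
    series = [1, 2, 3, 4, 1, 2, 3, 4, 2, 3, 4, 5]
    cycles = compress_cycles(series, cycle_len=4)
    baseline = smoothed_profile(cycles[:-1])   # profile from the history only
    print(len(series), "steps ->", len(cycles), "cycle tokens")
    print("baseline:", baseline)
    print("residual target:", residuals(cycles[-1], baseline))
```

In this sketch, a model trained on the residual target only has to explain the deviation from the baseline. If it predicted zero residuals, the forecast would reduce to the smoothed profile itself, which is one way to read the "fallback behavior" the abstract mentions.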

URL

https://arxiv.org/abs/2405.03429

PDF

https://arxiv.org/pdf/2405.03429.pdf

