History Is Not Enough: An Adaptive Dataflow System for Financial Time-Series Synthesis

2026-01-15 07:38:59
Haochong Xia, Yao Long Teng, Regan Tan, Molei Qin, Xinrun Wang, Bo An

Abstract

In quantitative finance, the gap between training and real-world performance, driven by concept drift and distributional non-stationarity, remains a critical obstacle to building reliable data-driven systems. Models trained on static historical data often overfit, resulting in poor generalization in dynamic markets. The mantra "History Is Not Enough" underscores the need for adaptive data generation that learns to evolve with the market rather than relying solely on past observations. We present a drift-aware dataflow system that integrates machine learning-based adaptive control into the data curation process. The system couples a parameterized data manipulation module (comprising single-stock transformations, multi-stock mix-ups, and curation operations) with an adaptive planner-scheduler that employs gradient-based bi-level optimization to control the system. This design unifies data augmentation, curriculum learning, and data workflow management under a single differentiable framework, enabling provenance-aware replay and continuous data quality monitoring. Extensive experiments on forecasting and reinforcement learning trading tasks demonstrate that our framework enhances model robustness and improves risk-adjusted returns. The system provides a generalizable approach to adaptive data management and learning-guided workflow automation for financial data.
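To make the data manipulation module concrete, here is a minimal sketch of what a multi-stock mix-up followed by a single-stock transformation might look like. It is an illustration only, not the authors' implementation: the function names `mixup_returns` and `jitter`, the Gaussian-noise transformation, the sample return values, and the fixed mixing weight `lam` are all assumptions. In the system the abstract describes, such parameters would be set by the planner-scheduler rather than hard-coded.

```python
import numpy as np

def mixup_returns(returns_a, returns_b, lam):
    """Hypothetical multi-stock mix-up: convex combination of two aligned return series."""
    a = np.asarray(returns_a, dtype=float)
    b = np.asarray(returns_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("series must be aligned and of equal length")
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * a + (1.0 - lam) * b

def jitter(returns, scale, rng):
    """Hypothetical single-stock transformation: add zero-mean Gaussian noise."""
    r = np.asarray(returns, dtype=float)
    return r + rng.normal(0.0, scale, size=r.shape)

# Made-up daily returns for two stocks (illustrative values only).
rng = np.random.default_rng(0)
stock_a = np.array([0.01, -0.02, 0.015, 0.0])
stock_b = np.array([-0.005, 0.01, 0.02, -0.01])

mixed = mixup_returns(stock_a, stock_b, lam=0.7)   # lam is fixed here by assumption
augmented = jitter(mixed, scale=0.001, rng=rng)    # noise scale is also an assumption
```

Here `lam` and `scale` play the role of the manipulation parameters; a learned controller could adjust them over time instead of leaving them constant.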

URL

https://arxiv.org/abs/2601.10143

PDF

https://arxiv.org/pdf/2601.10143.pdf

