LLMs have demonstrated significant potential in quantitative finance by processing vast unstructured data to emulate human-like analytical workflows. However, current LLM-based methods primarily follow either an Asset-Centric paradigm focused on individual stock prediction or a Market-Centric approach for portfolio allocation, often remaining agnostic to the underlying reasoning that drives market movements. In this paper, we propose a Logic-Oriented perspective, modeling the financial market as a dynamic, evolutionary ecosystem of competing investment narratives, termed Modes of Thought. To operationalize this view, we introduce MEME (Modeling the Evolutionary Modes of Financial Markets), designed to reconstruct market dynamics through the lens of evolving logics. MEME employs a multi-agent extraction module to transform noisy data into high-fidelity Investment Arguments and utilizes Gaussian Mixture Modeling to uncover latent consensus within a semantic space. To model semantic drift among different market conditions, we also implement a temporal evaluation and alignment mechanism to track the lifecycle and historical profitability of these modes. By prioritizing enduring market wisdom over transient anomalies, MEME ensures that portfolio construction is guided by robust reasoning. Extensive experiments on three heterogeneous Chinese stock pools from 2023 to 2025 demonstrate that MEME consistently outperforms seven SOTA baselines. Further ablation studies, sensitivity analysis, lifecycle case study and cost analysis validate MEME's capacity to identify and adapt to the evolving consensus of financial markets. Our implementation can be found at this https URL.
https://arxiv.org/abs/2602.11918
Extracting signals through alpha factor mining is a fundamental challenge in quantitative finance. Existing automated methods primarily follow two paradigms: Decoupled Factor Generation, which treats factor discovery as isolated events, and Iterative Factor Evolution, which focuses on local parent-child refinements. However, both paradigms lack a global structural view, often treating factor pools as unstructured collections or fragmented chains, which leads to redundant search and limited diversity. To address these limitations, we introduce AlphaPROBE (Alpha Mining via Principled Retrieval and On-graph Biased Evolution), a framework that reframes alpha mining as the strategic navigation of a Directed Acyclic Graph (DAG). By modeling factors as nodes and evolutionary links as edges, AlphaPROBE treats the factor pool as a dynamic, interconnected ecosystem. The framework consists of two core components: a Bayesian Factor Retriever that identifies high-potential seeds by balancing exploitation and exploration through a posterior probability model, and a DAG-aware Factor Generator that leverages the full ancestral trace of factors to produce context-aware, nonredundant optimizations. Extensive experiments on three major Chinese stock market datasets against eight competitive baselines demonstrate that AlphaPROBE achieves significantly better predictive accuracy, return stability, and training efficiency. Our results confirm that leveraging global evolutionary topology is essential for efficient and robust automated alpha discovery. We have open-sourced our implementation at this https URL.
https://arxiv.org/abs/2602.11917
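The AlphaPROBE abstract above describes factors as nodes, evolutionary links as edges, and a generator that reads a factor's full ancestral trace. A minimal sketch of that data structure follows. The factor names, the `PARENTS` mapping, and the `ancestral_trace` helper are hypothetical and are not taken from the paper's released code.

```python
from collections import deque

# Hypothetical factor pool: each factor maps to the parents it was evolved from.
# Edges run parent -> child, so the pool forms a directed acyclic graph.
PARENTS = {
    "f0": [],            # seed factor
    "f1": ["f0"],
    "f2": ["f0"],
    "f3": ["f1", "f2"],  # evolved from two lineages
}


def ancestral_trace(factor, parents):
    """Return every ancestor of `factor`, nearest first (breadth-first order)."""
    seen = []
    queue = deque(parents[factor])
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # a shared ancestor is reported only once
        seen.append(node)
        queue.extend(parents[node])
    return seen


trace = ancestral_trace("f3", PARENTS)
print(trace)
```

A chain-only view would hand the generator a single parent, whereas the trace here exposes both lineages and their shared seed, which is the kind of global context the abstract argues for.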
In quantitative finance, the gap between training and real-world performance, driven by concept drift and distributional non-stationarity, remains a critical obstacle for building reliable data-driven systems. Models trained on static historical data often overfit, resulting in poor generalization in dynamic markets. The mantra "History Is Not Enough" underscores the need for adaptive data generation that learns to evolve with the market rather than relying solely on past observations. We present a drift-aware dataflow system that integrates machine learning-based adaptive control into the data curation process. The system couples a parameterized data manipulation module, comprising single-stock transformations, multi-stock mix-ups, and curation operations, with an adaptive planner-scheduler that employs gradient-based bi-level optimization to control the system. This design unifies data augmentation, curriculum learning, and data workflow management under a single differentiable framework, enabling provenance-aware replay and continuous data quality monitoring. Extensive experiments on forecasting and reinforcement learning trading tasks demonstrate that our framework enhances model robustness and improves risk-adjusted returns. The system provides a generalizable approach to adaptive data management and learning-guided workflow automation for financial data.
https://arxiv.org/abs/2601.10143
Large Language Models (LLMs) have shown strong capabilities across many domains, yet their evaluation in financial quantitative tasks remains fragmented and mostly limited to knowledge-centric question answering. We introduce QuantEval, a benchmark that evaluates LLMs across three essential dimensions of quantitative finance: knowledge-based QA, quantitative mathematical reasoning, and quantitative strategy coding. Unlike prior financial benchmarks, QuantEval integrates a CTA-style backtesting framework that executes model-generated strategies and evaluates them using financial performance metrics, enabling a more realistic assessment of quantitative coding ability. We evaluate some state-of-the-art open-source and proprietary LLMs and observe substantial gaps to human experts, particularly in reasoning and strategy coding. Finally, we conduct large-scale supervised fine-tuning and reinforcement learning experiments on domain-aligned data, demonstrating consistent improvements. We hope QuantEval will facilitate research on LLMs' quantitative finance capabilities and accelerate their practical adoption in real-world trading workflows. We additionally release the full deterministic backtesting configuration (asset universe, cost model, and metric definitions) to ensure strict reproducibility.
https://arxiv.org/abs/2601.08689
This paper investigates how Large Language Models (LLMs) from leading providers (OpenAI, Google, Anthropic, DeepSeek, and xAI) can be applied to quantitative sector-based portfolio construction. We use LLMs to identify investable universes of stocks within S&P 500 sector indices and evaluate how their selections perform when combined with classical portfolio optimization methods. Each model was prompted to select and weight 20 stocks per sector, and the resulting portfolios were compared with their respective sector indices across two distinct out-of-sample periods: a stable market phase (January-March 2025) and a volatile phase (April-June 2025). Our results reveal a strong temporal dependence in LLM portfolio performance. During stable market conditions, LLM-weighted portfolios frequently outperformed sector indices on both cumulative return and risk-adjusted (Sharpe ratio) measures. However, during the volatile period, many LLM portfolios underperformed, suggesting that current models may struggle to adapt to regime shifts or high-volatility environments underrepresented in their training data. Importantly, when LLM-based stock selection is combined with traditional optimization techniques, portfolio outcomes improve in both performance and consistency. This study contributes one of the first multi-model, cross-provider evaluations of generative AI algorithms in investment management. It highlights that while LLMs can effectively complement quantitative finance by enhancing stock selection and interpretability, their reliability remains market-dependent. The findings underscore the potential of hybrid AI-quantitative frameworks, integrating LLM reasoning with established optimization techniques, to produce more robust and adaptive investment strategies.
https://arxiv.org/abs/2512.24526
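The sector-portfolio study above compares LLM portfolios with sector indices on a risk-adjusted (Sharpe ratio) basis. A minimal sketch of that metric follows. The daily-return inputs, the zero risk-free rate, and the 252-period annualization are assumptions for illustration, and the abstract does not state the exact convention the authors used.

```python
import math
import statistics


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from periodic simple returns."""
    excess = [r - risk_free for r in returns]
    vol = statistics.stdev(excess)  # sample standard deviation
    if vol == 0:
        raise ValueError("Sharpe ratio is undefined for zero volatility")
    return statistics.mean(excess) / vol * math.sqrt(periods_per_year)


# Hypothetical daily returns, not data from the paper.
portfolio = [0.010, -0.004, 0.006, 0.002, -0.001]
sector_index = [0.006, -0.006, 0.004, 0.001, -0.002]
print(sharpe_ratio(portfolio), sharpe_ratio(sector_index))
```

Because the ratio divides by volatility, a portfolio can lead on cumulative return and still trail on Sharpe ratio, which is why the abstract reports both measures.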
Recent advances in large language models (LLMs) are transforming data-intensive domains, with finance representing a high-stakes environment where transparent and reproducible analysis of heterogeneous signals is essential. Traditional quantitative methods remain vulnerable to survivorship bias, while many AI-driven approaches struggle with signal integration, reproducibility, and computational efficiency. We introduce MASFIN, a modular multi-agent framework that integrates LLMs with structured financial metrics and unstructured news, while embedding explicit bias-mitigation protocols. The system leverages GPT-4.1-nano for reproducibility and cost-efficient inference and generates weekly portfolios of 15-30 equities with allocation weights optimized for short-term performance. In an eight-week evaluation, MASFIN delivered a 7.33% cumulative return, outperforming the S&P 500, NASDAQ-100, and Dow Jones benchmarks in six of eight weeks, albeit with higher volatility. These findings demonstrate the promise of bias-aware, generative AI frameworks for financial forecasting and highlight opportunities for modular multi-agent design to advance practical, transparent, and reproducible approaches in quantitative finance.
https://arxiv.org/abs/2512.21878
Synthetic financial data offers a practical way to address the privacy and accessibility challenges that limit research in quantitative finance. This paper examines the use of generative models, in particular TimeGAN and Variational Autoencoders (VAEs), for creating synthetic return series that support portfolio construction, trading analysis, and risk modeling. Using historical daily returns from the S&P 500 as a benchmark, we generate synthetic datasets under comparable market conditions and evaluate them using statistical similarity metrics, temporal structure tests, and downstream financial tasks. The study shows that TimeGAN produces synthetic data with distributional shapes, volatility patterns, and autocorrelation behaviour that are close to those observed in real returns. When applied to mean-variance portfolio optimization, the resulting synthetic datasets lead to portfolio weights, Sharpe ratios, and risk levels that remain close to those obtained from real data. The VAE provides more stable training but tends to smooth extreme market movements, which affects risk estimation. Finally, the analysis supports the use of synthetic datasets as substitutes for real financial data in portfolio analysis and risk simulation, particularly when models are able to capture temporal dynamics. Synthetic data therefore provides a privacy-preserving, cost-effective, and reproducible tool for financial experimentation and model development.
https://arxiv.org/abs/2512.21798
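The synthetic-data study above evaluates generated series partly through temporal structure tests such as autocorrelation behaviour. One such check might look like the sketch below. The `autocorrelation` helper and the toy series are illustrative assumptions, not the paper's evaluation code.

```python
def autocorrelation(series, lag=1):
    """Sample autocorrelation of `series` at the given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    if var == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    cov = sum((series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag))
    return cov / var


# Toy return series, not data from the paper.
real = [0.01, -0.02, 0.015, -0.01, 0.02, -0.015]
synthetic = [0.008, -0.018, 0.012, -0.012, 0.018, -0.01]

# A small gap suggests the generator preserved this aspect of temporal structure.
gap = abs(autocorrelation(real) - autocorrelation(synthetic))
print(gap)
```

Running the same function on squared returns gives a rough check for volatility clustering, the kind of pattern the abstract says the VAE tends to smooth.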
Robust asset allocation is a key challenge in quantitative finance, where deep-learning forecasters often fail due to objective mismatch and error amplification. We introduce the Signature-Informed Transformer (SIT), a novel framework that learns end-to-end allocation policies by directly optimizing a risk-aware financial objective. SIT's core innovations include path signatures for a rich geometric representation of asset dynamics and a signature-augmented attention mechanism embedding financial inductive biases, like lead-lag effects, into the model. Evaluated on daily S&P 100 equity data, SIT decisively outperforms traditional and deep-learning baselines, especially when compared to predict-then-optimize models. These results indicate that portfolio-aware objectives and geometry-aware inductive biases are essential for risk-aware capital allocation in machine-learning systems. The code is available at: this https URL
https://arxiv.org/abs/2510.03129
Generative modeling of high-frequency limit order book (LOB) dynamics is a critical yet unsolved challenge in quantitative finance, essential for robust market simulation and strategy backtesting. Existing approaches are often constrained by simplifying stochastic assumptions or, in the case of modern deep learning models like Transformers, rely on tokenization schemes that affect the high-precision, numerical nature of financial data through discretization and binning. To address these limitations, we introduce ByteGen, a novel generative model that operates directly on the raw byte streams of LOB events. Our approach treats the problem as an autoregressive next-byte prediction task, for which we design a compact and efficient 32-byte packed binary format to represent market messages without information loss. The core novelty of our work is the complete elimination of feature engineering and tokenization, enabling the model to learn market dynamics from its most fundamental representation. We achieve this by adapting the H-Net architecture, a hybrid Mamba-Transformer model that uses a dynamic chunking mechanism to discover the inherent structure of market messages without predefined rules. Our primary contributions are: 1) the first end-to-end, byte-level framework for LOB modeling; 2) an efficient packed data representation; and 3) a comprehensive evaluation on high-frequency data. Trained on over 34 million events from CME Bitcoin futures, ByteGen successfully reproduces key stylized facts of financial markets, generating realistic price distributions, heavy-tailed returns, and bursty event timing. Our findings demonstrate that learning directly from byte space is a promising and highly flexible paradigm for modeling complex financial systems, achieving competitive performance on standard market quality metrics without the biases of tokenization.
https://arxiv.org/abs/2508.02247
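The ByteGen abstract above mentions a 32-byte packed binary format for market messages but does not give the field layout. The sketch below shows one way such a record could be packed with Python's `struct` module. Every field name, width, and ordering here is an assumption for illustration and is not the paper's actual format.

```python
import struct

# Hypothetical 32-byte layout (little-endian, no implicit padding):
#   Q timestamp_ns | Q order_id | q price_ticks | I size | B event_type | B side | 2 pad bytes
EVENT = struct.Struct("<QQqIBB2x")
assert EVENT.size == 32


def pack_event(timestamp_ns, order_id, price_ticks, size, event_type, side):
    """Encode one LOB event as exactly 32 bytes; prices are integer ticks, so no float rounding occurs."""
    return EVENT.pack(timestamp_ns, order_id, price_ticks, size, event_type, side)


def unpack_event(raw):
    """Decode 32 bytes back into the original integer fields."""
    return EVENT.unpack(raw)


event = (1_700_000_000_123_456_789, 42, 6_512_350, 3, 1, 0)
raw = pack_event(*event)
byte_sequence = list(raw)  # values in 0..255, the input to a next-byte model
```

Storing prices as integer ticks rather than floats is what makes the round trip lossless in this sketch, and a fixed record width keeps event boundaries at known byte offsets.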
Synthetic time series are essential tools for data augmentation, stress testing, and algorithmic prototyping in quantitative finance. However, in cryptocurrency markets, characterized by 24/7 trading, extreme volatility, and rapid regime shifts, existing Time Series Generation (TSG) methods and benchmarks often fall short, jeopardizing practical utility. Most prior work (1) targets non-financial or traditional financial domains, (2) focuses narrowly on classification and forecasting while neglecting crypto-specific complexities, and (3) lacks critical financial evaluations, particularly for trading applications. To address these gaps, we introduce CTBench, the first comprehensive TSG benchmark tailored for the cryptocurrency domain. CTBench curates an open-source dataset from 452 tokens and evaluates TSG models across 13 metrics spanning 5 key dimensions: forecasting accuracy, rank fidelity, trading performance, risk assessment, and computational efficiency. A key innovation is a dual-task evaluation framework: (1) the Predictive Utility task measures how well synthetic data preserves temporal and cross-sectional patterns for forecasting, while (2) the Statistical Arbitrage task assesses whether reconstructed series support mean-reverting signals for trading. We benchmark eight representative models from five methodological families over four distinct market regimes, uncovering trade-offs between statistical fidelity and real-world profitability. Notably, CTBench offers model ranking analysis and actionable guidance for selecting and deploying TSG models in crypto analytics and strategy development.
https://arxiv.org/abs/2508.02758
Financial markets pose fundamental challenges for asset return prediction due to their high dimensionality, non-stationarity, and persistent volatility. Despite advances in large language models and multi-agent systems, current quantitative research pipelines suffer from limited automation, weak interpretability, and fragmented coordination across key components such as factor mining and model innovation. In this paper, we propose R&D-Agent for Quantitative Finance, in short RD-Agent(Q), the first data-centric multi-agent framework designed to automate the full-stack research and development of quantitative strategies via coordinated factor-model co-optimization. RD-Agent(Q) decomposes the quant process into two iterative stages: a Research stage that dynamically sets goal-aligned prompts, formulates hypotheses based on domain priors, and maps them to concrete tasks, and a Development stage that employs a code-generation agent, Co-STEER, to implement task-specific code, which is then executed in real-market backtests. The two stages are connected through a feedback stage that thoroughly evaluates experimental outcomes and informs subsequent iterations, with a multi-armed bandit scheduler for adaptive direction selection. Empirically, RD-Agent(Q) achieves up to 2X higher annualized returns than classical factor libraries using 70% fewer factors, and outperforms state-of-the-art deep time-series models on real markets. Its joint factor-model optimization delivers a strong balance between predictive accuracy and strategy robustness. Our code is available at: this https URL.
https://arxiv.org/abs/2505.15155
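The RD-Agent(Q) abstract above mentions a multi-armed bandit scheduler for adaptive direction selection without saying which bandit algorithm is used. The sketch below substitutes the standard UCB1 rule to show the general idea. The `UCB1Scheduler` class, the two arm names, and the fixed rewards are all hypothetical.

```python
import math


class UCB1Scheduler:
    """Choose the next research direction with the UCB1 rule (an assumed stand-in)."""

    def __init__(self, arms):
        self.counts = {arm: 0 for arm in arms}
        self.totals = {arm: 0.0 for arm in arms}

    def select(self):
        # Try every arm once before relying on the confidence bound.
        for arm, n in self.counts.items():
            if n == 0:
                return arm
        t = sum(self.counts.values())
        # Mean reward plus an exploration bonus that shrinks as an arm is pulled more.
        return max(
            self.counts,
            key=lambda a: self.totals[a] / self.counts[a]
            + math.sqrt(2.0 * math.log(t) / self.counts[a]),
        )

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.totals[arm] += reward


# Hypothetical fixed rewards standing in for the improvement seen after each iteration.
REWARD = {"factor": 0.8, "model": 0.2}
scheduler = UCB1Scheduler(["factor", "model"])
history = []
for _ in range(20):
    arm = scheduler.select()
    scheduler.update(arm, REWARD[arm])
    history.append(arm)
print(scheduler.counts)
```

The scheduler spends most iterations on the direction that has paid off so far while still revisiting the other one occasionally, which is the exploitation and exploration balance a bandit scheduler is meant to provide.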
Because the stock market is a cornerstone of the financial markets, forecasting stock price movements is one of the foremost challenges in quantitative finance. Emerging learning-based approaches have made significant progress in capturing the intricate and ever-evolving data patterns of modern markets. As the stock market rapidly expands, it exhibits two characteristics, stock exogeneity and volatility heterogeneity, that heighten the complexity of price forecasting. Specifically, stock exogeneity reflects the influence of external market factors on price movements, while volatility heterogeneity reflects how the difficulty of movement forecasting varies with price fluctuations. In this work, we introduce the framework of Cross-market Synergy with Pseudo-volatility Optimization (CSPO). CSPO implements an effective deep neural architecture to leverage external futures knowledge. This enriches stock embeddings with cross-market insights and thus enhances CSPO's predictive capability. Furthermore, CSPO incorporates pseudo-volatility to model stock-specific forecasting confidence, enabling a dynamic adaptation of its optimization process to improve accuracy and robustness. Our extensive experiments, encompassing industrial evaluation and public benchmarking, highlight CSPO's superior performance over existing methods and the effectiveness of all of its proposed modules.
https://arxiv.org/abs/2503.22740
Reinforcement Learning (RL) has experienced significant advancement over the past decade, prompting a growing interest in applications within finance. This survey critically evaluates 167 publications, exploring diverse RL applications and frameworks in finance. Financial markets, marked by their complexity, multi-agent nature, information asymmetry, and inherent randomness, serve as an intriguing test-bed for RL. Traditional finance offers certain solutions, and RL advances these with a more dynamic approach, incorporating machine learning methods, including transfer learning, meta-learning, and multi-agent solutions. This survey dissects key RL components through the lens of Quantitative Finance. We uncover emerging themes, propose areas for future research, and critique the strengths and weaknesses of existing methods.
https://arxiv.org/abs/2408.10932
Exploring complex adaptive financial trading environments through multi-agent-based simulation methods presents an innovative approach within the realm of quantitative finance. Despite the dominance of multi-agent reinforcement learning approaches in financial markets with observable data, there exists a set of systematically significant financial markets that pose challenges due to their partial or obscured data availability. We therefore devise a multi-agent simulation approach employing small-scale meta-heuristic methods. This approach aims to represent the opaque bilateral market for Australian government bond trading, capturing the bilateral nature of bank-to-bank trading, also referred to as "over-the-counter" (OTC) trading, which commonly occurs between "market makers". The uniqueness of the bilateral market, characterized by negotiated transactions and a limited number of agents, yields valuable insights for agent-based modelling and quantitative finance. The inherent rigidity of this market structure, which is at odds with the global proliferation of multilateral platforms and the decentralization of finance, underscores the unique insights offered by our agent-based model. We explore the implications of market rigidity on market structure and consider the element of stability in market design. This extends the ongoing discourse on complex financial trading environments, providing an enhanced understanding of their dynamics and implications.
https://arxiv.org/abs/2405.02849
This research paper delves into the application of Deep Reinforcement Learning (DRL) in asset-class agnostic portfolio optimization, integrating industry-grade methodologies with quantitative finance. At the heart of this integration is our robust framework that not only merges advanced DRL algorithms with modern computational techniques but also emphasizes stringent statistical analysis, software engineering and regulatory compliance. To the best of our knowledge, this is the first study integrating financial Reinforcement Learning with sim-to-real methodologies from robotics and mathematical physics, thus enriching our frameworks and arguments with this unique perspective. Our research culminates with the introduction of AlphaOptimizerNet, a proprietary Reinforcement Learning agent (and corresponding library). Developed from a synthesis of state-of-the-art (SOTA) literature and our unique interdisciplinary methodology, AlphaOptimizerNet demonstrates encouraging risk-return optimization across various asset classes with realistic constraints. These preliminary results underscore the practical efficacy of our frameworks. As the finance sector increasingly gravitates towards advanced algorithmic solutions, our study bridges theoretical advancements with real-world applicability, offering a template for ensuring safety and robust standards in this technologically driven future.
https://arxiv.org/abs/2403.07916
Recent advancements in large language models (LLMs) have opened new pathways for many domains. However, the full potential of LLMs in financial investments remains largely untapped. There are two main challenges for typical deep learning-based methods for quantitative finance. First, they struggle to fuse textual and numerical information flexibly for stock movement prediction. Second, traditional methods lack clarity and interpretability, which impedes their application in scenarios where the justification for predictions is essential. To solve the above challenges, we propose Ploutos, a novel financial LLM framework that consists of PloutosGen and PloutosGPT. PloutosGen contains multiple primary experts that can analyze data of different modalities, such as text and numbers, and provide quantitative strategies from different perspectives. PloutosGPT then combines their insights and predictions and generates interpretable rationales. To generate accurate and faithful rationales, the training strategy of PloutosGPT leverages a rearview-mirror prompting mechanism to guide GPT-4 to generate rationales, and a dynamic token weighting mechanism to fine-tune the LLM by increasing the weight of key tokens. Extensive experiments show that our framework outperforms the state-of-the-art methods on both prediction accuracy and interpretability.
https://arxiv.org/abs/2403.00782
Deep reinforcement learning (DRL) has revolutionized quantitative finance by achieving excellent performance without significant manual effort. However, we observe that DRL models behave unstably in a dynamic stock market due to the low signal-to-noise ratio of financial data. In this paper, we propose a novel logic-guided trading framework, termed SYENS (Program Synthesis-based Ensemble Strategy). Unlike the previous state-of-the-art ensemble reinforcement learning strategy, which arbitrarily selects the best-performing agent for testing based on a single measurement, our framework regularizes the model's behavior in a hierarchical manner using the program-synthesis-by-sketching paradigm. First, we propose a high-level, domain-specific language (DSL) that is used to describe the market environment and actions. Then, based on the DSL, a novel program sketch is introduced, which embeds human expert knowledge in a logical manner. Finally, based on the program sketch, we adopt the program-synthesis-by-sketching paradigm and synthesize a logical, hierarchical trading strategy. We evaluate SYENS on the 30 Dow Jones stocks under the cash trading and the margin trading settings. Experimental results demonstrate that our proposed framework significantly outperforms the baselines, with much higher cumulative return and lower maximum drawdown under both settings.
https://arxiv.org/abs/2310.05551
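The two evaluation metrics named in the abstract above, cumulative return and maximum drawdown, have standard definitions. The sketch below computes both from a series of portfolio values. The function names and the sample series are illustrative assumptions and are not taken from the paper or its code.

```python
from typing import Sequence


def cumulative_return(values: Sequence[float]) -> float:
    """Total return over the whole series, as a fraction of the starting value."""
    if len(values) < 2 or values[0] <= 0:
        raise ValueError("need at least two values and a positive starting value")
    return values[-1] / values[0] - 1.0


def max_drawdown(values: Sequence[float]) -> float:
    """Largest peak-to-trough decline, as a positive fraction of the running peak."""
    if not values or values[0] <= 0:
        raise ValueError("need a non-empty series with a positive starting value")
    peak = values[0]
    worst = 0.0
    for value in values:
        peak = max(peak, value)                       # highest value seen so far
        worst = max(worst, (peak - value) / peak)     # decline from that peak
    return worst


# Hypothetical portfolio values: rises to 120, falls to 90, recovers to 130.
series = [100.0, 120.0, 90.0, 110.0, 130.0]
total = cumulative_return(series)
drawdown = max_drawdown(series)
```

For this series the drawdown is measured from the peak of 120 to the trough of 90, so a strategy can end with a positive cumulative return and still have had a sizable drawdown along the way.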
One of the problems in quantitative finance that has received the most attention is the portfolio optimization problem. This problem has been approached using different techniques, with those related to quantum computing being especially prolific in recent years. In this study, we present a system called Quantum Computing-based System for Portfolio Optimization with Future Asset Values and Automatic Universe Reduction (Q4FuturePOP), which addresses the portfolio optimization problem with the following innovations: i) the tool is designed to work with predicted future asset values instead of historical values; and ii) Q4FuturePOP includes an automatic universe reduction module, which is conceived to intelligently reduce the complexity of the problem. We also briefly discuss the preliminary performance of the different modules that compose the prototype version of Q4FuturePOP.
https://arxiv.org/abs/2309.12627
We present a new financial-domain large language model, InvestLM, tuned on LLaMA-65B (Touvron et al., 2023) using a carefully curated instruction dataset related to financial investment. Inspired by less-is-more-for-alignment (Zhou et al., 2023), we manually curate a small yet diverse instruction dataset covering a wide range of finance-related topics, from Chartered Financial Analyst (CFA) exam questions to SEC filings to Stackexchange quantitative finance discussions. InvestLM shows strong capabilities in understanding financial text and provides helpful responses to investment-related questions. Financial experts, including hedge fund managers and research analysts, rate InvestLM's responses as comparable to those of state-of-the-art commercial models (GPT-3.5, GPT-4 and Claude-2). Zero-shot evaluation on a set of financial NLP benchmarks demonstrates strong generalizability. From a research perspective, this work suggests that a high-quality domain-specific LLM can be tuned using a small set of carefully curated instructions on a well-trained foundation model, which is consistent with the Superficial Alignment Hypothesis (Zhou et al., 2023). From a practical perspective, this work develops a state-of-the-art financial-domain LLM with superior capability in understanding financial texts and providing helpful investment advice, potentially enhancing the work efficiency of financial professionals. We release the model parameters to the research community.
https://arxiv.org/abs/2309.13064
Order execution is a fundamental task in quantitative finance, aiming to complete the acquisition or liquidation of a number of trading orders for specific assets. Recent advances in model-free reinforcement learning (RL) provide a data-driven solution to the order execution problem. However, existing works always optimize execution for an individual order, overlooking the practice that multiple orders are specified to execute simultaneously, which results in suboptimality and bias. In this paper, we first present a multi-agent RL (MARL) method for multi-order execution that considers practical constraints. Specifically, we treat every agent as an individual operator that trades one specific order, while the agents continue to communicate and collaborate with each other to maximize the overall profits. Nevertheless, existing MARL algorithms often incorporate communication among agents by exchanging only the information of their partial observations, which is inefficient in complicated financial markets. To improve collaboration, we then propose a learnable multi-round communication protocol that lets the agents communicate their intended actions to each other and refine them accordingly. It is optimized through a novel action value attribution method that is provably consistent with the original learning objective yet more efficient. Experiments on data from two real-world markets show superior performance, with significantly better collaboration effectiveness achieved by our method.
https://arxiv.org/abs/2307.03119