The portfolio optimization problem is one of the most extensively studied problems in quantitative finance. It has been approached with a wide range of techniques, among which those related to quantum computing have been especially prolific in recent years. In this study, we present a system called Quantum Computing-based System for Portfolio Optimization with Future Asset Values and Automatic Universe Reduction (Q4FuturePOP), which tackles the portfolio optimization problem with the following innovations: i) the tool is designed to work with predictions of future asset values rather than historical values; and ii) Q4FuturePOP includes an automatic universe reduction module, conceived to intelligently reduce the complexity of the problem. We also briefly discuss the preliminary performance of the different modules that compose the prototypical version of Q4FuturePOP.
https://arxiv.org/abs/2309.12627
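The abstract does not spell out the underlying formulation, but quantum annealers and QAOA-style solvers for portfolio selection typically consume a QUBO. Below is a minimal sketch of such a formulation, assuming predicted (rather than historical) returns as inputs; the penalty weight, risk-aversion parameter, and budget constraint are illustrative assumptions, not Q4FuturePOP's actual model.

```python
import numpy as np

def build_portfolio_qubo(predicted_returns, covariance, budget,
                         risk_aversion=1.0, penalty=10.0):
    """Binary asset-selection QUBO: minimize x^T Q x over x in {0,1}^n,
    trading predicted return against risk, with a quadratic penalty
    enforcing that exactly `budget` assets are selected."""
    n = len(predicted_returns)
    Q = risk_aversion * np.asarray(covariance, dtype=float)
    # Linear terms sit on the diagonal of a QUBO matrix (x_i^2 = x_i).
    Q[np.diag_indices(n)] -= predicted_returns
    # penalty * (sum_i x_i - budget)^2, with the constant term dropped.
    Q += penalty * np.ones((n, n))
    Q[np.diag_indices(n)] -= 2.0 * penalty * budget
    return Q

# Toy usage: brute-force a 4-asset universe with synthetic predictions.
rng = np.random.default_rng(0)
mu = rng.normal(0.05, 0.02, 4)        # predicted future returns
A = rng.normal(size=(4, 4))
cov = A @ A.T * 1e-3                  # synthetic risk model
Q = build_portfolio_qubo(mu, cov, budget=2)
candidates = (np.array([(i >> k) & 1 for k in range(4)]) for i in range(16))
best = min(candidates, key=lambda x: x @ Q @ x)
print("selected assets:", best)
```

On real hardware, the same matrix Q would be handed to a sampler instead of the brute-force loop; the enumeration here is only to keep the toy self-contained.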
Order execution is a fundamental task in quantitative finance, aiming to complete the acquisition or liquidation of a number of trading orders on specific assets. Recent advances in model-free reinforcement learning (RL) provide a data-driven solution to the order execution problem. However, existing work typically optimizes execution for an individual order, overlooking the practice that multiple orders are specified to execute simultaneously, which results in suboptimality and bias. In this paper, we first present a multi-agent RL (MARL) method for multi-order execution that considers practical constraints. Specifically, we treat every agent as an individual operator trading one specific order, while communicating with the other agents and collaborating to maximize overall profit. Nevertheless, existing MARL algorithms often implement communication among agents by exchanging only information about their partial observations, which is inefficient in complex financial markets. To improve collaboration, we then propose a learnable multi-round communication protocol through which agents communicate their intended actions to one another and refine them accordingly. It is optimized through a novel action value attribution method which is provably consistent with the original learning objective yet more efficient. Experiments on data from two real-world markets demonstrate superior performance, with significantly better collaboration effectiveness achieved by our method.
https://arxiv.org/abs/2307.03119
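As a rough illustration of the multi-round intended-action exchange described above, each agent could condition its action on the intents every agent announced in the previous round. This is a sketch under assumed shapes, not the paper's architecture: the network sizes, number of rounds, and scalar action dimension are all illustrative.

```python
import torch
import torch.nn as nn

class OrderAgent(nn.Module):
    """One operator per order; conditions on all agents' previous intents."""
    def __init__(self, obs_dim, act_dim, n_agents, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + n_agents * act_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, act_dim),
        )

    def forward(self, obs, intents):
        # intents: (n_agents, act_dim) actions announced last round.
        return self.net(torch.cat([obs, intents.flatten()], dim=-1))

def multi_round_actions(agents, observations, act_dim, rounds=3):
    """Each round, every agent refines its intended action after seeing
    what all agents (itself included) announced in the previous round."""
    intents = torch.zeros(len(agents), act_dim)
    for _ in range(rounds):
        intents = torch.stack(
            [agent(obs, intents) for agent, obs in zip(agents, observations)]
        )
    return intents  # the final round's intents are the executed actions

# Toy usage: three orders, 8-dim observations, scalar trading intensity.
agents = [OrderAgent(obs_dim=8, act_dim=1, n_agents=3) for _ in range(3)]
actions = multi_round_actions(agents, [torch.randn(8) for _ in range(3)], act_dim=1)
```

The paper's action value attribution method, which makes training this loop efficient, is not reproduced here.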
One of the most fundamental questions in quantitative finance is the existence of continuous-time diffusion models that fit market prices of a given set of options. Traditionally, one employs a mix of intuition and theoretical and empirical analysis to find models that achieve exact or approximate fits. Our contribution is to show how a suitable game-theoretical formulation of this problem can help solve this question by leveraging existing developments in modern deep multi-agent reinforcement learning to search in the space of stochastic processes. More importantly, we hope that our techniques can be leveraged and extended by the community to solve important problems in that field, such as the joint SPX-VIX calibration problem. Our experiments show that we are able to learn local volatility, as well as the path-dependence required in the volatility process to minimize the price of a Bermudan option. In one sentence, our algorithm can be seen as a particle method à la Guyon and Henry-Labordère where particles, instead of being designed to ensure $\sigma_{loc}(t,S_t)^2 = \mathbb{E}[\sigma_t^2|S_t]$, are RL-driven learning agents cooperating towards more general calibration targets. This is the first work bridging reinforcement learning with the derivative calibration problem.
https://arxiv.org/abs/2203.06865
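The condition quoted in the abstract is the fixed point of the Guyon-Henry-Labordère particle method: at each time step, the conditional expectation $\mathbb{E}[\sigma_t^2|S_t]$ is estimated across simulated particles, typically by kernel regression. A minimal sketch of that estimation step follows; the Gaussian kernel and bandwidth rule are illustrative choices, not the paper's.

```python
import numpy as np

def conditional_sigma2(S, sigma2, eval_points, bandwidth=None):
    """Nadaraya-Watson estimate of E[sigma_t^2 | S_t = s] from particles."""
    if bandwidth is None:
        bandwidth = 1.06 * S.std() * len(S) ** (-0.2)  # Silverman's rule
    # Gaussian kernel weights: one row per evaluation point s.
    w = np.exp(-0.5 * ((eval_points[:, None] - S[None, :]) / bandwidth) ** 2)
    return (w * sigma2).sum(axis=1) / w.sum(axis=1)

# Toy usage: with particle states (S_i, sigma2_i) at time t, the ratio
# sigma_loc(t, s)^2 / E[sigma_t^2 | S_t = s] is the leverage that rescales
# each particle's instantaneous variance so vanillas are repriced exactly.
S = np.random.default_rng(1).lognormal(0.0, 0.2, 10_000) * 100.0
sigma2 = 0.04 * (S / 100.0) ** -0.5   # toy stochastic variance
print(conditional_sigma2(S, sigma2, np.array([90.0, 100.0, 110.0])))
```

In the paper's framing, this hand-designed rescaling is what the RL agents replace when the calibration target is more general than the vanilla surface.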
Market regimes are a popular topic in quantitative finance even though there is little consensus on the details of how they should be defined. They arise as a feature both in financial market prediction problems and in financial market task-performing problems. In this work we use discrete event time multi-agent market simulation to experiment freely in a reproducible and understandable environment where regimes can be explicitly switched and enforced. We introduce a novel stochastic process to model the fundamental value perceived by market participants: the Continuous-Time Markov Switching Trending Ornstein-Uhlenbeck (CTMSTOU) process, which facilitates the study of trading policies in regime-switching markets. We also define the notion of regime-awareness for a trading agent and illustrate its importance through the study of different order placement strategies in the context of order execution problems.
https://arxiv.org/abs/2202.00941
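The abstract names the CTMSTOU process but not its dynamics. One plausible reading, assumed here purely for illustration, is an Ornstein-Uhlenbeck process mean-reverting to a level whose trend slope switches under a two-state continuous-time Markov chain; the rates, slopes, and OU parameters below are illustrative, and the paper's exact SDE may differ.

```python
import numpy as np

def simulate_ctmstou(T=1.0, dt=1e-3, x0=100.0, kappa=5.0, sigma=1.0,
                     trends=(-20.0, 20.0), switch_rate=2.0, seed=0):
    """Euler scheme: OU reversion toward a trending mean level m_t whose
    slope is set by the current regime of a continuous-time Markov chain."""
    rng = np.random.default_rng(seed)
    n = int(T / dt)
    x, m, regime = x0, x0, 0
    path = np.empty(n)
    for i in range(n):
        # CTMC with exponential holding times: switch w.p. ~ rate * dt.
        if rng.random() < switch_rate * dt:
            regime = 1 - regime
        m += trends[regime] * dt                    # trending mean level
        x += kappa * (m - x) * dt + sigma * np.sqrt(dt) * rng.standard_normal()
        path[i] = x
    return path

# Toy usage: one fundamental-value path with enforced regime switching.
fundamental = simulate_ctmstou()
```

Explicitly flipping `regime` (rather than sampling it) is how a simulation of this kind would enforce a regime for controlled experiments.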