Abstract
Recent advances in large language models (LLMs) are transforming data-intensive domains, with finance representing a high-stakes environment where transparent and reproducible analysis of heterogeneous signals is essential. Traditional quantitative methods remain vulnerable to survivorship bias, while many AI-driven approaches struggle with signal integration, reproducibility, and computational efficiency. We introduce MASFIN, a modular multi-agent framework that integrates LLMs with structured financial metrics and unstructured news, while embedding explicit bias-mitigation protocols. The system leverages GPT-4.1-nano for reproducible, cost-efficient inference and generates weekly portfolios of 15-30 equities with allocation weights optimized for short-term performance. In an eight-week evaluation, MASFIN delivered a 7.33% cumulative return, outperforming the S&P 500, NASDAQ-100, and Dow Jones benchmarks in six of eight weeks, albeit with higher volatility. These findings demonstrate the promise of bias-aware, generative AI frameworks for financial forecasting and highlight opportunities for modular multi-agent design to advance practical, transparent, and reproducible approaches in quantitative finance.
URL
https://arxiv.org/abs/2512.21878