Abstract
We present a new financial-domain large language model, InvestLM, tuned on LLaMA-65B (Touvron et al., 2023) using a carefully curated instruction dataset related to financial investment. Inspired by less-is-more-for-alignment (Zhou et al., 2023), we manually curate a small yet diverse instruction dataset covering a wide range of finance-related topics, from Chartered Financial Analyst (CFA) exam questions to SEC filings to Stackexchange quantitative finance discussions. InvestLM shows strong capabilities in understanding financial text and provides helpful responses to investment-related questions. Financial experts, including hedge fund managers and research analysts, rate InvestLM's responses as comparable to those of state-of-the-art commercial models (GPT-3.5, GPT-4 and Claude-2). Zero-shot evaluation on a set of financial NLP benchmarks demonstrates strong generalizability. From a research perspective, this work suggests that a high-quality domain-specific LLM can be tuned using a small set of carefully curated instructions on a well-trained foundation model, which is consistent with the Superficial Alignment Hypothesis (Zhou et al., 2023). From a practical perspective, this work develops a state-of-the-art financial-domain LLM with superior capability in understanding financial texts and providing helpful investment advice, potentially enhancing the work efficiency of financial professionals. We release the model parameters to the research community.
URL
https://arxiv.org/abs/2309.13064