Abstract
Using Large Language Models for complex mathematical reasoning is difficult, primarily because of the complexity of multi-step reasoning. The main challenges are (1) selecting the critical intermediate results that advance the solution, and (2) the limited exploration of potential solutions. To address these issues, we introduce a novel algorithm, Stepwise Self-Consistent Chain-of-Thought (SSC-CoT). SSC-CoT selects intermediate steps based on the intersection of multiple reasoning chains. In addition, SSC-CoT enables the model to discover critical intermediate steps by querying a knowledge graph of relevant domain knowledge. To validate SSC-CoT, we present a new dataset, TriMaster100, tailored to complex trigonometry problems. The dataset contains 100 questions, with each solution broken down into scored intermediate steps, which supports a comprehensive evaluation of the mathematical reasoning process. On TriMaster100, SSC-CoT triples the effectiveness of the state-of-the-art methods. Furthermore, we benchmark SSC-CoT on the widely recognized complex mathematical question dataset, MATH level 5, where it surpasses the second-best method by 7.2% in accuracy. Code and the TriMaster100 dataset are publicly available.
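The idea of selecting intermediate steps from the intersection of reasoning chains can be illustrated with a minimal sketch. This is not the SSC-CoT implementation: the function name `overlapping_steps`, the `min_chains` threshold, and the use of exact string matching to decide that two intermediate results agree are all assumptions made for illustration, and the abstract does not specify how overlap is actually detected.

```python
from collections import Counter


def overlapping_steps(chains, min_chains=2):
    """Return intermediate results that appear in at least `min_chains` chains.

    `chains` is a list of reasoning chains, each a list of intermediate
    results written as strings. Exact string equality stands in for
    whatever equivalence check a real system would use (an assumption).
    """
    counts = Counter()
    for chain in chains:
        # Count each result once per chain, so repetition inside a single
        # chain does not inflate its support.
        counts.update(set(chain))
    shared = [step for step, n in counts.items() if n >= min_chains]
    # Most widely supported results first; ties broken alphabetically.
    return sorted(shared, key=lambda step: (-counts[step], step))


# Three hypothetical reasoning chains for the same trigonometry question.
chains = [
    ["sin(2x) = 2 sin(x) cos(x)", "cos(x) = 1/2", "x = pi/3"],
    ["cos(x) = 1/2", "x = pi/3"],
    ["sin(x) = 1/2", "x = pi/6"],
]

print(overlapping_steps(chains))
# ['cos(x) = 1/2', 'x = pi/3']
```

In this toy example the two results shared by the first two chains are kept, while results reached by only one chain are set aside. A practical system would need a notion of mathematical equivalence rather than string equality.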
URL
https://arxiv.org/abs/2402.17786