Abstract
Citation Text Generation (CTG) is a natural language processing (NLP) task that aims to produce text that accurately cites or references a cited document within a source document. The generated text draws on contextual cues from both the source document and the cited paper, so that the citation information is accurate and relevant. Previous work on citation generation is mainly based on text summarization of documents. Building on this, this paper presents a framework and a comparative study that demonstrate the use of Large Language Models (LLMs) for citation generation. We also show that incorporating the knowledge graph relations of the papers into the LLM prompt improves citation generation, since it helps the model better learn the relationship between the papers. To assess model performance, we use a subset of the standard S2ORC dataset consisting only of English-language computer science research papers. Vicuna performs best for this task, with 14.15 Meteor, 12.88 Rouge-1, 1.52 Rouge-2, and 10.94 Rouge-L. When knowledge graphs are included, Alpaca performs best, improving by 36.98% in Rouge-1 and 33.14% in Meteor.
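To make the idea of placing knowledge graph relations in the prompt concrete, here is a minimal sketch of how such a prompt might be assembled. It is not the authors' template: the function names, the prompt wording, the triple format, and the example triples are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Assumed representation of one knowledge-graph relation:
# (head entity, relation, tail entity).
Triple = Tuple[str, str, str]


def format_triples(triples: List[Triple]) -> str:
    """Render each triple on its own line as '(head, relation, tail)'."""
    return "\n".join(f"({h}, {r}, {t})" for h, r, t in triples)


def build_prompt(
    source_abstract: str,
    cited_abstract: str,
    source_triples: List[Triple],
    cited_triples: List[Triple],
) -> str:
    """Assemble a hypothetical citation-generation prompt.

    The relation sections are added only when triples are supplied, so the
    same function yields the plain prompt and the knowledge-graph prompt.
    """
    parts = [
        "Write one citation sentence for the cited paper, "
        "to appear in the source paper.",
        "Source paper abstract:\n" + source_abstract,
        "Cited paper abstract:\n" + cited_abstract,
    ]
    if source_triples:
        parts.append("Source paper relations:\n" + format_triples(source_triples))
    if cited_triples:
        parts.append("Cited paper relations:\n" + format_triples(cited_triples))
    parts.append("Citation sentence:")
    return "\n\n".join(parts)


# Made-up inputs, for illustration only.
example_prompt = build_prompt(
    source_abstract="We study citation text generation with LLMs.",
    cited_abstract="We release a large corpus of research papers.",
    source_triples=[("LLM", "used-for", "citation generation")],
    cited_triples=[("corpus", "contains", "research papers")],
)
print(example_prompt)
```

Passing empty triple lists gives a prompt without relations, which is one simple way to set up the with and without knowledge graph comparison the abstract describes.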
Abstract (translated)
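Citation Text Generation (CTG) is a task in natural language processing (NLP) that aims to generate text that accurately cites or references a cited document within a source document. In CTG, the generated text draws on contextual cues from both the source document and the cited paper, ensuring that accurate and relevant citation information is provided. Previous work in this field is mainly based on text summarization of documents. This paper presents a framework and a comparative study to demonstrate the use of Large Language Models (LLMs) for the citation generation task. We also show that citation generation results improve when the knowledge graph relations of the papers are incorporated into the LLM prompt. To evaluate our model, we use a subset of the standard S2ORC dataset that contains only English-language computer science research papers. Vicuna performs best on this task, reaching 14.15 Meteor, 12.88 Rouge-1, 1.52 Rouge-2, and 10.94 Rouge-L. In addition, when knowledge graphs are included, Alpaca performs best, with improved Rouge-1 and Meteor scores.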
URL
https://arxiv.org/abs/2404.09763