Abstract
We explore a generative relation extraction (RE) pipeline tailored to the study of interactions in the intestinal microbiome, a complex and low-resource biomedical domain. Our method uses large language models (LLMs) to summarize the context before extracting relations via instruction-tuned generation. Preliminary results on a dedicated corpus show that summarization improves generative RE performance by reducing noise and guiding the model. However, BERT-based RE approaches still outperform generative models. This ongoing work demonstrates the potential of generative methods to support the study of specialized domains in low-resource settings.
Abstract (translated)
我们探索了一种针对肠道微生物组相互作用研究的生成式关系抽取(RE)管道,这是一个复杂且资源匮乏的生物医学领域。我们的方法利用大型语言模型(LLMs)进行摘要提炼,在此基础上通过指令调优生成的方式提取关系。在专门构建的数据集上的初步结果显示,摘要提炼能够减少噪音并指导模型,从而提高生成式RE的性能。然而,基于BERT的关系抽取方法仍然优于生成式模型。这项正在进行的工作展示了生成式方法在资源匮乏环境中支持特定领域研究的巨大潜力。
URL
https://arxiv.org/abs/2506.08647