Abstract
State-of-the-art models for relation extraction (RE) in the biomedical domain typically fine-tune BioBERT with a classification objective, but they may suffer from the anisotropy problem. Contrastive learning methods can reduce this anisotropy and also help to avoid class collapse in classification problems. This paper introduces a new training method, biological non-contrastive relation extraction (BioNCERE), which performs relation extraction without using any named entity labels for training, thereby reducing annotation costs. BioNCERE combines transfer learning and non-contrastive learning to avoid full or dimensional collapse and to mitigate overfitting. It addresses RE in three stages, applying transfer learning twice. By freezing the weights learned in earlier stages of the pipeline and applying non-contrastive learning in the second stage, the model predicts relations without any knowledge of named entities. Experiments on SemMedDB show performance close to the state of the art on RE, without using named entity information.
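As a rough illustration of what "full" and "dimensional" collapse mean for a batch of embeddings, the sketch below computes two regularization terms of the kind used in non-contrastive objectives. The abstract does not say which non-contrastive loss BioNCERE uses, so the VICReg-style variance and covariance terms here are an assumption, and the function names `variance_term` and `covariance_term` are hypothetical.

```python
import math


def _centered_columns(batch):
    """Transpose a batch of embedding vectors and mean-center each dimension."""
    columns = []
    for col in zip(*batch):
        mean = sum(col) / len(col)
        columns.append([x - mean for x in col])
    return columns


def variance_term(batch, gamma=1.0, eps=1e-4):
    """Hinge penalty on the per-dimension standard deviation.

    Assumed VICReg-style term: it is large when every embedding in the
    batch is (nearly) identical, i.e. under full collapse.
    """
    n = len(batch)
    penalties = []
    for col in _centered_columns(batch):
        var = sum(x * x for x in col) / (n - 1)
        penalties.append(max(0.0, gamma - math.sqrt(var + eps)))
    return sum(penalties) / len(penalties)


def covariance_term(batch):
    """Sum of squared off-diagonal covariances, divided by the dimension.

    Assumed VICReg-style term: it is large when dimensions carry redundant
    information, i.e. under dimensional collapse.
    """
    n = len(batch)
    cols = _centered_columns(batch)
    d = len(cols)
    total = 0.0
    for i in range(d):
        for j in range(d):
            if i != j:
                cov = sum(a * b for a, b in zip(cols[i], cols[j])) / (n - 1)
                total += cov * cov
    return total / d


# Toy batches of 2-dimensional embeddings (illustrative values only).
fully_collapsed = [[1.0, 2.0]] * 4                       # all points identical
dim_collapsed = [[-1.0, -1.0], [1.0, 1.0]] * 2           # both dimensions move together
healthy = [[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]]
```

In this toy setup the variance term flags `fully_collapsed`, the covariance term flags `dim_collapsed`, and both terms vanish for `healthy`. In a staged pipeline like the one described, such terms would be minimized while the weights from the earlier stage stay frozen.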
URL
https://arxiv.org/abs/2410.23583