Abstract
We present BTransformer18, a deep learning architecture for multi-label relation extraction in French texts. Our approach combines the contextual representations of pre-trained language models from the BERT family (BERT, RoBERTa, and their French counterparts CamemBERT and FlauBERT) with Transformer encoders that capture long-range dependencies between tokens. Experiments on the dataset from the TextMine'25 challenge show that our model performs best with CamemBERT-Large, reaching a macro F1 score of 0.654 and surpassing the results obtained with FlauBERT-Large. These results demonstrate the effectiveness of our approach for the automatic extraction of complex relations in intelligence reports.
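For readers unfamiliar with the reported metric, the following is a minimal sketch of how a macro F1 score can be computed in a multi-label setting: F1 is computed per label and then averaged with equal weight. The function name `macro_f1`, the label names `rel_a` and `rel_b`, and the convention of scoring a label with no gold or predicted occurrences as 0 are assumptions made for illustration; the abstract does not specify the challenge's exact evaluation protocol.

```python
from typing import List, Set


def macro_f1(gold: List[Set[str]], pred: List[Set[str]], labels: List[str]) -> float:
    """Unweighted mean of per-label F1 scores over multi-label predictions."""
    scores = []
    for label in labels:
        # Count true positives, false positives, and false negatives for this label.
        tp = sum(1 for g, p in zip(gold, pred) if label in g and label in p)
        fp = sum(1 for g, p in zip(gold, pred) if label not in g and label in p)
        fn = sum(1 for g, p in zip(gold, pred) if label in g and label not in p)
        denom = 2 * tp + fp + fn
        # Assumption: a label that never occurs in gold or pred scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data: each example carries a set of relation labels.
gold = [{"rel_a"}, {"rel_a", "rel_b"}, set()]
pred = [{"rel_a"}, {"rel_b"}, {"rel_b"}]
score = macro_f1(gold, pred, ["rel_a", "rel_b"])
print(f"macro F1 = {score:.3f}")
```

Because every label contributes equally to the average, rare relation types affect a macro score as much as frequent ones.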
URL
https://arxiv.org/abs/2502.15619