Paper Reading AI Learner

EIoU-EMC: A Novel Loss for Domain-specific Nested Entity Recognition

2025-04-19 06:31:54
Jian Zhang, Tianqing Zhang, Qi Li, Hongwei Wang

Abstract

In recent years, research has focused mainly on the general NER task. Nested NER in specific domains still poses challenges. In particular, low-resource and class-imbalanced scenarios impede wide application in the biomedical and industrial domains. In this study, we design a novel loss, EIoU-EMC, by enhancing the implementations of the Intersection over Union loss and the Multiclass loss. Our method specifically leverages entity boundary and entity classification information, thereby enhancing the model's capacity to learn from a limited number of data samples. To validate the method, we conducted experiments on three distinct biomedical NER datasets and one dataset that we constructed from industrial complex equipment maintenance documents. Compared to strong baselines, our method demonstrates competitive performance across all datasets. In the experimental analysis, our proposed method exhibits significant advancements in entity boundary recognition and entity classification. Our code is available here.
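For readers unfamiliar with Intersection over Union applied to entity boundaries, the following is a minimal sketch of plain IoU between a predicted and a gold token span. It is not the EIoU-EMC loss, whose enhanced formulation is not given in the abstract. The function names `span_iou` and `span_iou_loss` and the inclusive `(start, end)` span convention are assumptions made for illustration.

```python
def span_iou(pred, gold):
    """Plain IoU of two token spans given as inclusive (start, end) pairs.

    Illustrative only: this is standard 1-D IoU, not the paper's EIoU variant.
    """
    pred_start, pred_end = pred
    gold_start, gold_end = gold
    # Number of tokens shared by both spans (zero if they do not overlap).
    inter = max(0, min(pred_end, gold_end) - max(pred_start, gold_start) + 1)
    # Tokens covered by either span.
    union = (pred_end - pred_start + 1) + (gold_end - gold_start + 1) - inter
    return inter / union if union > 0 else 0.0


def span_iou_loss(pred, gold):
    """A simple boundary loss: 0 for a perfect match, 1 for disjoint spans."""
    return 1.0 - span_iou(pred, gold)


if __name__ == "__main__":
    print(span_iou((2, 5), (4, 7)))       # partial overlap
    print(span_iou_loss((0, 3), (0, 3)))  # exact boundary match
```

A loss of this shape rewards predictions whose boundaries overlap the gold entity even when they are not exact, which is one way boundary information can supply a training signal when labeled data is scarce.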

URL

https://arxiv.org/abs/2504.14203

PDF

https://arxiv.org/pdf/2504.14203.pdf

