Abstract
Safe and reliable natural language inference is critical for extracting insights from clinical trial reports but poses challenges due to biases in large pre-trained language models. This paper presents a novel data augmentation technique to improve model robustness for biomedical natural language inference in clinical trials. By generating synthetic examples through semantic perturbation and domain-specific vocabulary replacement, and by adding a new task for numerical and quantitative reasoning, we introduce greater diversity and reduce shortcut learning. Our approach, combined with multi-task learning and the DeBERTa architecture, achieved significant performance gains on the NLI4CT 2024 benchmark compared to the original language models. Ablation studies validate the contribution of each augmentation method to improved robustness. Our best-performing model ranked 12th in faithfulness and 8th in consistency out of the 32 participants.
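To make the vocabulary-replacement idea concrete, here is a minimal, illustrative sketch of label-preserving augmentation for NLI statements. The mini-lexicon (`CLINICAL_SYNONYMS`) and the function name are hypothetical; the paper's actual perturbation pipeline and domain vocabulary are not specified in this abstract.

```python
import random

# Hypothetical mini-lexicon of clinical synonyms; the paper's actual
# domain-specific vocabulary is not given in the abstract.
CLINICAL_SYNONYMS = {
    "tumor": ["neoplasm", "mass"],
    "drug": ["agent", "medication"],
    "patients": ["subjects", "participants"],
}

def augment_statement(statement: str, rng: random.Random) -> str:
    """Return a perturbed copy of `statement` in which known domain
    terms are swapped for synonyms, preserving the entailment label."""
    out = []
    for tok in statement.split():
        key = tok.lower().strip(".,")
        if key in CLINICAL_SYNONYMS:
            out.append(rng.choice(CLINICAL_SYNONYMS[key]))
        else:
            out.append(tok)
    return " ".join(out)

rng = random.Random(0)
print(augment_statement("The drug reduced tumor size in patients", rng))
```

Because only surface vocabulary changes while the proposition stays the same, the original entailment/contradiction label can be reused for the synthetic example, which is what makes such perturbations cheap sources of training diversity.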
URL
https://arxiv.org/abs/2404.09206