Abstract
Learning effective representations for multi-label text classification (MLTC) is a significant challenge in NLP. The challenge arises from the inherent complexity of the task, which is shaped by two key factors: the intricate dependencies between labels and the pervasive long-tailed distribution of the data. One potential way to address it is to combine supervised contrastive learning with classical supervised loss functions. Although contrastive learning has shown remarkable performance in multi-class classification, its impact in the multi-label setting has not been thoroughly investigated. In this paper, we conduct an in-depth study of supervised contrastive learning and its influence on representations in the MLTC context. We emphasize the importance of accounting for long-tailed data distributions when building a robust representation space, which effectively addresses two critical challenges of contrastive learning that we identify: the "lack of positives" and the "attraction-repulsion imbalance". Building on this insight, we introduce a novel contrastive loss function for MLTC. It attains Micro-F1 scores that match or surpass those obtained with other frequently used loss functions, and yields a significant improvement in Macro-F1 scores across three multi-label datasets.
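To make the setting concrete, below is a minimal sketch of a supervised contrastive loss extended to multi-label data, where two samples count as positives if they share at least one label. This is a common baseline extension assumed for illustration, not the loss proposed in the paper; it also makes the "lack of positives" problem visible, since anchors whose labels appear nowhere else in the batch contribute no positive pairs.

```python
import numpy as np

def multilabel_supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss where two samples are positives if they
    share at least one label (illustrative baseline, not the paper's loss).

    embeddings: (n, d) float array of encoder outputs
    labels:     (n, k) multi-hot float array
    """
    # Project onto the unit sphere and compute temperature-scaled similarities.
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    n = len(z)

    self_mask = np.eye(n, dtype=bool)
    # Positive pairs: samples sharing >= 1 label, excluding self-pairs.
    pos_mask = (labels @ labels.T > 0) & ~self_mask

    # Exclude self-similarity from the softmax denominator.
    sim = np.where(self_mask, -np.inf, sim)
    # Numerically stable row-wise log-softmax.
    m = sim.max(axis=1, keepdims=True)
    log_prob = sim - (m + np.log(np.exp(sim - m).sum(axis=1, keepdims=True)))

    # Average log-probability over each anchor's positives.
    pos_counts = np.maximum(pos_mask.sum(axis=1), 1)
    per_anchor = -np.where(pos_mask, log_prob, 0.0).sum(axis=1) / pos_counts

    # "Lack of positives": anchors with no positive in the batch are skipped.
    has_pos = pos_mask.any(axis=1)
    return per_anchor[has_pos].mean()
```

With this formulation, a batch whose same-label samples already have identical embeddings incurs a near-zero loss, while a batch where positives are spread apart is penalized heavily, which is the attraction-repulsion behavior the abstract refers to.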
URL
https://arxiv.org/abs/2404.08720