Abstract
Since acquiring perfect supervision is usually difficult, real-world machine learning tasks often confront inaccurate, incomplete, or inexact supervision, collectively referred to as weak supervision. In this work, we present WSAUC, a unified framework for weakly supervised AUC optimization problems, which covers noisy label learning, positive-unlabeled learning, multi-instance learning, and semi-supervised learning scenarios. Within the WSAUC framework, we first frame the AUC optimization problems in various weakly supervised scenarios as a common formulation of minimizing the AUC risk on contaminated sets, and demonstrate that the empirical risk minimization problems are consistent with the true AUC. Then, we introduce a new type of partial AUC, specifically, the reversed partial AUC (rpAUC), which serves as a robust training objective for AUC maximization in the presence of contaminated labels. WSAUC offers a universal solution for AUC optimization in various weakly supervised scenarios by maximizing the empirical rpAUC. Theoretical and experimental results under multiple settings support the effectiveness of WSAUC on a range of weakly supervised AUC optimization tasks.
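The consistency claim above can be illustrated with a small simulation. This is a minimal sketch, not the paper's derivation. It assumes each contaminated set is a mixture of the clean class-conditional distributions, with positive fractions theta_p and theta_n where theta_p > theta_n. Under that assumption, the AUC measured between the two contaminated sets equals 0.5 + (theta_p - theta_n)(AUC - 0.5), an increasing affine function of the clean AUC. The Gaussian scorer, the mixing fractions, and every function name below are hypothetical and chosen only for illustration.

```python
import random
from bisect import bisect_left, bisect_right


def empirical_auc(pos_scores, neg_scores):
    """Fraction of (pos, neg) pairs ranked correctly; ties count as 1/2."""
    neg_sorted = sorted(neg_scores)
    correct = 0.0
    for s in pos_scores:
        below = bisect_left(neg_sorted, s)          # negatives scored strictly lower
        tied = bisect_right(neg_sorted, s) - below  # negatives with the same score
        correct += below + 0.5 * tied
    return correct / (len(pos_scores) * len(neg_sorted))


def contaminated_auc(clean_auc, theta_p, theta_n):
    """Population AUC between two mixture sets (assumed contamination model).

    theta_p / theta_n: fraction of true positives in the nominally
    positive / nominally negative set.
    """
    return 0.5 + (theta_p - theta_n) * (clean_auc - 0.5)


def sample_scores(n, positive_fraction, rng):
    """Hypothetical scorer: positives ~ N(1, 1), negatives ~ N(0, 1)."""
    return [
        rng.gauss(1.0, 1.0) if rng.random() < positive_fraction else rng.gauss(0.0, 1.0)
        for _ in range(n)
    ]


rng = random.Random(0)
n = 4000
theta_p, theta_n = 0.8, 0.3  # illustrative contamination levels (assumption)

# AUC on clean sets versus AUC on contaminated sets drawn from the same scorer.
clean_auc = empirical_auc(sample_scores(n, 1.0, rng), sample_scores(n, 0.0, rng))
observed = empirical_auc(sample_scores(n, theta_p, rng), sample_scores(n, theta_n, rng))
predicted = contaminated_auc(clean_auc, theta_p, theta_n)

print(f"clean AUC {clean_auc:.3f}, contaminated AUC {observed:.3f}, predicted {predicted:.3f}")
```

Because the map is increasing whenever theta_p > theta_n, a scoring function that maximizes the contaminated AUC also maximizes the clean AUC under this model. The contamination only shrinks the margin above 0.5 by the factor theta_p - theta_n.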
URL
https://arxiv.org/abs/2305.14258