Abstract
To address the challenges of long-tailed classification, researchers have proposed several approaches to reduce model bias, most of which assume that classes with few samples are weak classes. However, recent studies have shown that tail classes are not always hard to learn, and model bias has been observed on sample-balanced datasets, suggesting that other factors also affect model bias. In this work, we propose a series of geometric measurements for perceptual manifolds in deep neural networks, and then explore how the geometric characteristics of perceptual manifolds affect classification difficulty and how learning shapes those characteristics. An unanticipated finding is that the correlation between class accuracy and the separation degree of perceptual manifolds gradually decreases during training, while the negative correlation with curvature gradually increases, implying that curvature imbalance leads to model bias. Therefore, we propose curvature regularization to encourage the model to learn curvature-balanced and flatter perceptual manifolds. Evaluations on multiple long-tailed and non-long-tailed datasets show that our approach performs well and generalizes broadly, in particular yielding significant improvements when added to current state-of-the-art techniques. Our work opens up a geometric analysis perspective on model bias and reminds researchers to pay attention to model bias on non-long-tailed and even sample-balanced datasets. The code and model will be made public.
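The correlation analysis mentioned in the abstract can be illustrated with a minimal sketch. It assumes that a per-class accuracy and a per-class curvature estimate are already available; how the paper estimates curvature is not shown here. The function name `pearson_correlation` and the toy arrays are hypothetical placeholders chosen for illustration, not values or code from the paper.

```python
import numpy as np


def pearson_correlation(x, y):
    """Pearson correlation coefficient between two 1-D sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape or x.size < 2:
        raise ValueError("x and y must be 1-D, equal length, with >= 2 entries")
    xc = x - x.mean()
    yc = y - y.mean()
    denom = np.sqrt((xc ** 2).sum() * (yc ** 2).sum())
    if denom == 0.0:
        raise ValueError("correlation is undefined for a constant input")
    return float((xc * yc).sum() / denom)


# Hypothetical placeholder values, one entry per class (NOT results from the paper).
class_accuracy = np.array([0.92, 0.85, 0.71, 0.64, 0.58])
class_curvature = np.array([0.10, 0.18, 0.31, 0.40, 0.47])

r = pearson_correlation(class_accuracy, class_curvature)
print(f"Pearson r between class accuracy and curvature: {r:.3f}")
```

A negative `r` would correspond to the trend described above, where classes with higher-curvature perceptual manifolds tend to have lower accuracy. Repeating the computation at several training checkpoints would show how that correlation evolves during training.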
URL
https://arxiv.org/abs/2303.12307