Abstract
Bias in machine learning models can lead to unfair decision-making, and while it has been well studied in the image and text domains, it remains underexplored in action recognition. Action recognition models often suffer from background bias (i.e., inferring actions from background cues) and foreground bias (i.e., relying on subject appearance), which can be detrimental to real-life applications such as autonomous vehicles or assisted-living monitoring. While prior approaches have mainly focused on mitigating background bias through specialized augmentations, we study both biases thoroughly. We propose ALBAR, a novel adversarial training method that mitigates foreground and background biases without requiring specialized knowledge of the bias attributes. Our framework applies an adversarial cross-entropy loss to a sampled static clip (in which all frames are identical) and pushes its class probabilities toward a uniform distribution using a proposed entropy maximization loss. Additionally, we introduce a gradient penalty loss to regularize the debiasing process. We evaluate our method on established background and foreground bias protocols, setting a new state-of-the-art and improving combined debiasing performance by over 12% on HMDB51. Furthermore, we identify an issue of background leakage in the existing UCF101 bias evaluation protocol: it provides a shortcut for predicting actions and therefore does not accurately measure a model's debiasing capability. We address this issue by proposing more fine-grained segmentation boundaries for the actor, where our method also outperforms existing approaches. Project Page: this https URL
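To make the entropy maximization idea concrete, here is a minimal sketch (in NumPy, not the authors' implementation) of a loss that is minimized when the classifier's output on a static clip becomes uniform over classes. The function name and the interpolation with a standard cross-entropy term are illustrative assumptions; the paper's actual loss weighting and adversarial setup are not reproduced here.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def entropy_maximization_loss(logits):
    """Negative entropy of the predicted class distribution.

    Minimizing this value maximizes entropy, i.e. it pushes the
    static-clip prediction toward a uniform distribution over classes.
    The minimum is -log(K) for K classes (fully uniform output).
    """
    p = softmax(logits)
    # Small epsilon guards against log(0) for peaked distributions.
    return float((p * np.log(p + 1e-12)).sum(axis=-1).mean())
```

For a 4-class output, perfectly uniform logits give a loss of `-log(4) ≈ -1.386`, while a confidently peaked prediction gives a loss near 0, so gradient descent on this term drives the static-clip prediction toward uniformity.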
URL
https://arxiv.org/abs/2502.00156