Abstract
In label-efficient learning on video data, both the distillation method and the structural design of the teacher-student architecture strongly affect knowledge distillation, yet the relationship between these factors has been overlooked in prior research. To address this gap, we propose a new weakly supervised knowledge distillation framework for video classification, designed to improve the efficiency and accuracy of the student model. Our approach leverages substage-based learning to distill knowledge based on the combination of student substages and the correlation between corresponding teacher and student substages. We also employ a progressive cascade training method to mitigate the accuracy loss caused by the large capacity gap between the teacher and the student, and we propose a pseudo-label optimization strategy to improve the initial data labels. To optimize the loss functions of the different distillation substages during training, we introduce a new loss formulation based on feature distributions. Extensive experiments on both real and simulated data sets demonstrate that our approach outperforms existing distillation methods on video classification tasks. The proposed substage-based distillation approach has the potential to inform future research on label-efficient learning for video data.
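To make the substage idea concrete, the sketch below shows one plausible way to combine per-substage feature matching between corresponding teacher and student substages with a standard softened-logit distillation term. This is a minimal PyTorch illustration, not the paper's implementation: the number of substages, the MSE feature-matching term, the temperature, and the mixing weight alpha are all assumptions made for the example.

    # Minimal sketch of substage-based distillation (illustrative assumptions,
    # not the paper's exact formulation).
    import torch
    import torch.nn.functional as F

    def substage_distillation_loss(student_feats, teacher_feats,
                                   student_logits, teacher_logits,
                                   temperature=4.0, alpha=0.5):
        """Combine per-substage feature losses with a softened-logit KD loss."""
        # Per-substage feature matching: each student substage is aligned
        # with the corresponding teacher substage (assumes matching shapes).
        feat_loss = sum(F.mse_loss(s, t.detach())
                        for s, t in zip(student_feats, teacher_feats))
        feat_loss = feat_loss / len(student_feats)

        # Classic knowledge-distillation term on the final logits,
        # using temperature-softened distributions.
        kd_loss = F.kl_div(
            F.log_softmax(student_logits / temperature, dim=1),
            F.softmax(teacher_logits.detach() / temperature, dim=1),
            reduction="batchmean",
        ) * (temperature ** 2)

        return alpha * feat_loss + (1 - alpha) * kd_loss

    # Toy usage with random tensors standing in for substage features/logits.
    B, C, K = 8, 64, 10
    s_feats = [torch.randn(B, C) for _ in range(3)]
    t_feats = [torch.randn(B, C) for _ in range(3)]
    loss = substage_distillation_loss(s_feats, t_feats,
                                      torch.randn(B, K), torch.randn(B, K))
    print(loss.item())

In a progressive cascade setting, a loss of this shape would be applied repeatedly as intermediate-capacity students bridge the teacher-student gap; the weighting across substages is where the paper's feature-distribution-based loss would come in.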
URL
https://arxiv.org/abs/2307.05201