CKD: Contrastive Knowledge Distillation from A Sample-wise Perspective

2024-04-22 11:52:40
Wencheng Zhu, Xin Zhou, Pengfei Zhu, Yu Wang, Qinghua Hu

Abstract

In this paper, we present a simple yet effective contrastive knowledge distillation approach, which can be formulated as a sample-wise alignment problem with intra- and inter-sample constraints. Unlike traditional knowledge distillation methods that concentrate on maximizing feature similarities or preserving class-wise semantic correlations between teacher and student features, our method attempts to recover the "dark knowledge" by aligning sample-wise teacher and student logits. Specifically, our method first minimizes logit differences within the same sample by considering their numerical values, thus preserving intra-sample similarities. Next, we bridge semantic disparities by leveraging dissimilarities across different samples. Note that constraints on intra-sample similarities and inter-sample dissimilarities can be efficiently and effectively reformulated into a contrastive learning framework with newly designed positive and negative pairs. The positive pair consists of the teacher's and student's logits derived from an identical sample, while the negative pairs are formed from logits of different samples. With this formulation, our method benefits from the simplicity and efficiency of contrastive learning through the optimization of InfoNCE, yielding a run-time complexity far below $O(n^2)$, where $n$ is the total number of training samples. Furthermore, our method eliminates the need for hyperparameter tuning, particularly of temperature parameters and large batch sizes. We conduct comprehensive experiments on three datasets: CIFAR-100, ImageNet-1K, and MS COCO. Experimental results clearly confirm the effectiveness of the proposed method on both image classification and object detection tasks. Our source code will be publicly available at this https URL.
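As a concrete illustration of the pairing described in the abstract, below is a minimal PyTorch sketch of an InfoNCE-style loss over sample-wise teacher and student logits: the same-sample pair is the positive, and cross-sample pairs in the batch serve as negatives. The function name `ckd_loss`, the L2 normalization, and the optional temperature argument are illustrative assumptions rather than the paper's exact formulation (the abstract in fact reports that the method removes the need for temperature tuning).

    import torch
    import torch.nn.functional as F

    def ckd_loss(student_logits, teacher_logits, temperature=1.0):
        """InfoNCE-style sample-wise contrastive KD loss (illustrative sketch).

        Positive pair: teacher and student logits of the same sample.
        Negative pairs: logits taken from different samples in the batch.
        """
        # L2-normalize logits so the dot products below are cosine similarities.
        s = F.normalize(student_logits, dim=1)  # (B, C)
        t = F.normalize(teacher_logits, dim=1)  # (B, C)

        # Pairwise student-teacher similarity matrix; entry (i, j) compares
        # student sample i with teacher sample j.
        sim = s @ t.t() / temperature           # (B, B)

        # Diagonal entries are the same-sample (positive) pairs, so the
        # InfoNCE objective reduces to cross-entropy with diagonal targets.
        targets = torch.arange(s.size(0), device=s.device)
        return F.cross_entropy(sim, targets)

A symmetric variant could add the teacher-to-student direction, and per the abstract the actual method also folds the intra-sample numerical alignment into the same contrastive objective; both refinements are beyond this sketch.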

URL

https://arxiv.org/abs/2404.14109

PDF

https://arxiv.org/pdf/2404.14109.pdf

