Abstract
Convolution Neural Networks (CNNs) have been used in various fields and are showing demonstrated excellent performance, especially in Single-Image Super Resolution (SISR). However, recently, CNN-based SISR has numerous parameters and computational costs for obtaining better performance. As one of the methods to make the network efficient, Knowledge Distillation (KD) which optimizes the performance trade-off by adding a loss term to the existing network architecture is currently being studied. KD for SISR is mainly proposed as a feature distillation (FD) to minimize L1-distance loss of feature maps between teacher and student networks, but it does not fully take into account the amount and importance of information that the student can accept. In this paper, we propose a feature-based adaptive contrastive distillation (FACD) method for efficiently training lightweight SISR networks. We show the limitations of the existing feature-distillation (FD) with L1-distance loss, and propose a feature-based contrastive loss that maximizes the mutual information between the feature maps of the teacher and student networks. The experimental results show that the proposed FACD improves not only the PSNR performance of the entire benchmark datasets and scales but also the subjective image quality compared to the conventional FD approach.
Abstract (translated)
URL
https://arxiv.org/abs/2211.15951