Abstract
This paper studies probability distributions of penultimate activations of classification networks. We show that, when a classification network is trained with the cross-entropy loss, its final classification layer forms a Generative-Discriminative pair with a generative classifier based on a specific distribution of penultimate activations. More importantly, the distribution is parameterized by the weights of the final fully-connected layer, and can be considered as a generative model that synthesizes the penultimate activations without feeding input data. We empirically demonstrate that this generative model enables stable knowledge distillation in the presence of domain shift, and can transfer knowledge from a classifier to variational autoencoders and generative adversarial networks for class-conditional image generation.
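The idea of a Generative-Discriminative pair can be illustrated with a minimal sketch. The abstract does not name the specific distribution the paper derives, so the code below substitutes a textbook stand-in as an assumption: unit-covariance Gaussian class-conditionals whose means are the rows of a hypothetical final-layer weight matrix `W`, with a uniform class prior. Under that assumption the Bayes posterior coincides with a linear softmax classifier, and activations can be sampled from the weights alone, without any input data. The names `W`, `b`, and `sample_activation` are illustrative and not taken from the paper.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear_softmax(a, W, b):
    """Discriminative side: softmax over w_c . a + b_c."""
    logits = [sum(w_i * a_i for w_i, a_i in zip(w, a)) + b_c
              for w, b_c in zip(W, b)]
    return softmax(logits)


def gaussian_posterior(a, means):
    """Generative side: Bayes rule with N(a; mu_c, I) and a uniform prior."""
    log_liks = [-0.5 * sum((a_i - m_i) ** 2 for a_i, m_i in zip(a, mu))
                for mu in means]
    return softmax(log_liks)


rng = random.Random(0)
num_classes, dim = 4, 8

# Hypothetical final-layer weights; each row doubles as a class mean (assumption).
W = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_classes)]
# Bias implied by the Gaussian assumption: b_c = -0.5 * ||w_c||^2.
b = [-0.5 * sum(w_i ** 2 for w_i in w) for w in W]


def sample_activation(c):
    """Synthesize an activation for class c from the weights alone."""
    return [w_i + rng.gauss(0, 1) for w_i in W[c]]


labels = [rng.randrange(num_classes) for _ in range(16)]
acts = [sample_activation(c) for c in labels]

# Largest disagreement between the two classifiers on the synthetic activations.
max_gap = max(
    abs(p - q)
    for a in acts
    for p, q in zip(linear_softmax(a, W, b), gaussian_posterior(a, W))
)
```

The two posteriors agree up to floating-point error because the term -0.5 * ||a||^2 is shared by every class and cancels inside the softmax. The distribution derived in the paper may differ from this Gaussian stand-in; the sketch only shows how classifier weights can parameterize a class-conditional sampler.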
URL
https://arxiv.org/abs/2107.01900