Abstract
Previous work in Neural Loss Function Search (NLFS) has shown a lack of correlation between smaller surrogate functions and large convolutional neural networks with massive regularization. We expand upon this research by revealing another disparity: a lack of correlation across different types of image augmentation techniques. We show that loss functions can perform well under certain image augmentation techniques while performing poorly under others. We exploit this disparity by performing an evolutionary search over five types of image augmentation techniques in the hope of finding augmentation-specific loss functions. The best loss functions from each evolution were then transferred to WideResNet-28-10 on CIFAR-10 and CIFAR-100 across each of the five image augmentation techniques. The best among these were then evaluated by fine-tuning EfficientNetV2Small on the CARS, Oxford-Flowers, and Caltech datasets across each of the five image augmentation techniques. Multiple loss functions were found that outperformed cross-entropy across multiple experiments. In the end, we found a single loss function, which we call the inverse Bessel logarithm loss, that outperformed cross-entropy across the majority of experiments.
URL
https://arxiv.org/abs/2404.06633