Paper Reading AI Learner

Trainable Loss Weights in Super-Resolution

2023-01-25 13:27:27
Arash Chaichi Mellatshahi, Shohreh Kasaei

Abstract

In recent years, research on super-resolution has primarily focused on the development of unsupervised models, blind networks, and the use of optimization methods in non-blind models. However, little work has examined the loss function used in the super-resolution process, and the studies that do so have mostly used perceptual similarity in a conventional way, even though a well-designed loss can improve the quality of other methods as well. In this article, a new weighting method for pixel-wise loss is proposed. It makes it possible to use trainable weights based on the general structure of the image and its perceptual features while maintaining the advantages of pixel-wise loss. A criterion for comparing loss weights is also introduced so that the weights can be estimated directly by a convolutional neural network using this criterion. In addition, the expectation-maximization method is used for the simultaneous estimation of the super-resolution network and the weighting network. Finally, a new activation function, called "FixedSum", is introduced, which keeps the sum of all components of a vector constant while keeping each output component between zero and one. As shown in the experimental results section, the loss weighted by the proposed method leads to better results than the unweighted loss in terms of both signal-to-noise ratio and perceptual similarity.
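To make the key ideas concrete, below is a minimal PyTorch-style sketch of what the abstract describes, under assumptions the abstract does not spell out: a FixedSum-style activation that rescales sigmoid outputs toward a fixed target sum while clamping them to [0, 1], a hypothetical weighting CNN (`weight_net`) that predicts one loss weight per pixel, and an alternating (EM-like) update of the super-resolution and weighting networks. The paper's exact FixedSum definition, weight-comparison criterion, and target sum are not given here, so those choices are illustrative placeholders only.

```python
import torch

def fixed_sum(x: torch.Tensor, target_sum: float, dim: int = -1,
              n_iters: int = 10) -> torch.Tensor:
    """One plausible FixedSum-style activation (illustrative, not the paper's
    exact definition): squash inputs with a sigmoid, then alternate between
    rescaling toward `target_sum` and clamping back into [0, 1], so outputs
    stay in [0, 1] while their sum approaches the fixed target."""
    y = torch.sigmoid(x)
    for _ in range(n_iters):
        s = y.sum(dim=dim, keepdim=True).clamp_min(1e-12)
        y = (y * (target_sum / s)).clamp(0.0, 1.0)
    return y

def weighted_pixel_loss(sr: torch.Tensor, hr: torch.Tensor,
                        weight_net: torch.nn.Module) -> torch.Tensor:
    """Pixel-wise L1 loss reweighted by a trainable weighting network.
    `weight_net` is a hypothetical CNN mapping the HR image to one raw score
    per pixel; the scores pass through the FixedSum-style activation so every
    image distributes the same total weight over its pixels."""
    b, _, h, w = hr.shape
    raw = weight_net(hr)                                        # (B, 1, H, W) raw scores
    w_flat = fixed_sum(raw.flatten(1), target_sum=0.5 * h * w)  # hypothetical target sum
    weights = w_flat.view(b, 1, h, w)
    return (weights * (sr - hr).abs()).mean()

def alternating_step(sr_net, weight_net, opt_sr, opt_w,
                     lr_img, hr_img, weight_loss_fn):
    """EM-like alternation: update the super-resolution network under fixed
    weights, then update the weighting network with its own criterion.
    `weight_loss_fn` is a placeholder for the paper's weight-comparison
    criterion, which the abstract does not specify."""
    # M-like step: optimize the super-resolution network.
    opt_sr.zero_grad()
    weighted_pixel_loss(sr_net(lr_img), hr_img, weight_net).backward()
    opt_sr.step()

    # E-like step: optimize the weighting network.
    opt_w.zero_grad()
    weight_loss_fn(sr_net(lr_img).detach(), hr_img, weight_net).backward()
    opt_w.step()
```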

URL

https://arxiv.org/abs/2301.10575

PDF

https://arxiv.org/pdf/2301.10575.pdf

