Paper Reading AI Learner

Deep Residual Autoencoder for quality independent JPEG restoration

2019-03-14 16:51:18
Simone Zini, Simone Bianco, Raimondo Schettini

Abstract

In this paper we propose a deep residual autoencoder exploiting Residual-in-Residual Dense Blocks (RRDB) to remove artifacts in JPEG compressed images that is independent of the Quality Factor (QF) used. The proposed approach leverages both the learning capacity of deep residual networks and prior knowledge of the JPEG compression pipeline. The proposed model operates in the YCbCr color space and performs JPEG artifact restoration in two phases using two different autoencoders: the first one restores the luma channel exploiting 2D convolutions; the second one, using the restored luma channel as a guide, restores the chroma channels exploiting 3D convolutions. Extensive experimental results on three widely used benchmark datasets (i.e. LIVE1, BDS500, and CLASSIC-5) show that our model is able to outperform the state of the art with respect to all the evaluation metrics considered (i.e. PSNR, PSNR-B, and SSIM). This result is remarkable since the approaches in the state of the art use a different set of weights for each compression quality, while the proposed model uses the same weights for all of them, making it applicable to images in the wild where the QF used for compression is unknown. Furthermore, the proposed model shows greater robustness than state-of-the-art methods when applied to compression qualities not seen during training.
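The abstract only outlines the two-phase architecture; the sketch below is a minimal, hypothetical PyTorch rendering of that idea, not the authors' implementation. It assumes a simplified residual dense block in place of the full RRDB, illustrative channel counts and depths, and a straightforward way of stacking the restored luma with the chroma planes for the 3D convolutions; all class names (ResidualDenseBlock2D, LumaRestorer, ChromaRestorer) are invented for illustration.

# Hypothetical sketch of the two-phase YCbCr restoration pipeline described in the
# abstract: Phase 1 restores the luma (Y) channel with a 2D residual autoencoder,
# Phase 2 restores the chroma (Cb, Cr) channels with 3D convolutions guided by the
# restored luma. Block structure, depths, and channel counts are assumptions.
import torch
import torch.nn as nn

class ResidualDenseBlock2D(nn.Module):
    """Simplified RRDB-style block: densely connected 2D convs plus residual scaling."""
    def __init__(self, channels=64, growth=32):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, growth, 3, padding=1)
        self.conv2 = nn.Conv2d(channels + growth, growth, 3, padding=1)
        self.conv3 = nn.Conv2d(channels + 2 * growth, channels, 3, padding=1)
        self.act = nn.LeakyReLU(0.2, inplace=True)

    def forward(self, x):
        f1 = self.act(self.conv1(x))
        f2 = self.act(self.conv2(torch.cat([x, f1], dim=1)))
        out = self.conv3(torch.cat([x, f1, f2], dim=1))
        return x + 0.2 * out  # residual scaling keeps the block near-identity

class LumaRestorer(nn.Module):
    """Phase 1: 2D residual network that predicts a correction for the Y channel."""
    def __init__(self, channels=64, num_blocks=4):
        super().__init__()
        self.head = nn.Conv2d(1, channels, 3, padding=1)
        self.body = nn.Sequential(*[ResidualDenseBlock2D(channels) for _ in range(num_blocks)])
        self.tail = nn.Conv2d(channels, 1, 3, padding=1)

    def forward(self, y):
        return y + self.tail(self.body(self.head(y)))  # restored luma

class ChromaRestorer(nn.Module):
    """Phase 2: 3D convolutions over [restored Y, Cb, Cr] stacked along a depth axis."""
    def __init__(self, channels=32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv3d(1, channels, kernel_size=3, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv3d(channels, channels, kernel_size=3, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv3d(channels, 1, kernel_size=3, padding=1),
        )

    def forward(self, y_restored, cbcr):
        # Stack the guide (restored luma) with the chroma planes along depth:
        # shape (N, 1, D=3, H, W) so the 3D kernels see cross-channel structure.
        vol = torch.cat([y_restored, cbcr], dim=1).unsqueeze(1)
        out = self.body(vol).squeeze(1)      # (N, 3, H, W) residual volume
        return cbcr + out[:, 1:, :, :]       # keep only the Cb/Cr corrections

# Usage on a dummy YCbCr image in [0, 1]; the same weights would be applied
# regardless of the JPEG Quality Factor, which is the quality-independent aspect.
ycbcr = torch.rand(1, 3, 64, 64)
y, cbcr = ycbcr[:, :1], ycbcr[:, 1:]
y_hat = LumaRestorer()(y)
cbcr_hat = ChromaRestorer()(y_hat, cbcr)
restored = torch.cat([y_hat, cbcr_hat], dim=1)
print(restored.shape)  # torch.Size([1, 3, 64, 64])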

URL

https://arxiv.org/abs/1903.06117

PDF

https://arxiv.org/pdf/1903.06117.pdf
