Paper Reading AI Learner

Training Image Estimators without Image Ground-Truth

2019-06-13 16:04:03
Zhihao Xia, Ayan Chakrabarti

Abstract

Deep neural networks have been very successful in image estimation applications such as compressive-sensing and image restoration, as a means to estimate images from partial, blurry, or otherwise degraded measurements. These networks are trained on a large number of corresponding pairs of measurements and ground-truth images, and thus implicitly learn to exploit domain-specific image statistics. But unlike measurement data, it is often expensive or impractical to collect a large training set of ground-truth images in many application settings. In this paper, we introduce an unsupervised framework for training image estimation networks, from a training set that contains only measurements---with two varied measurements per image---but no ground-truth for the full images desired as output. We demonstrate that our framework can be applied for both regular and blind image estimation tasks, where in the latter case parameters of the measurement model (e.g., the blur kernel) are unknown: during inference, and potentially, also during training. We evaluate our method for training networks for compressive-sensing and blind deconvolution, considering both non-blind and blind training for the latter. Our unsupervised framework yields models that are nearly as accurate as those from fully supervised training, despite not having access to any ground-truth images.
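The core idea of training from paired measurements alone can be illustrated with a toy compressive-sensing setup. The sketch below is an assumption-laden illustration, not the paper's implementation: a linear pseudo-inverse stands in for the learned estimator, and the loss reconstructs the image from one measurement and penalizes its mismatch against the second measurement of the same image, so no ground-truth image ever enters the loss.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions: image of size n, m < n compressive measurements.
n, m = 64, 16

def estimator(y, A):
    """Toy stand-in for the learned network f(y): least-squares
    reconstruction via the pseudo-inverse of the measurement matrix."""
    return np.linalg.pinv(A) @ y

def unsupervised_loss(x):
    """Measurement-consistency loss in the spirit of the abstract:
    two varied measurements (A1, A2) of the same image x; estimate
    the image from the first, then score it only against the second."""
    A1 = rng.standard_normal((m, n)) / np.sqrt(m)
    A2 = rng.standard_normal((m, n)) / np.sqrt(m)
    y1, y2 = A1 @ x, A2 @ x          # the two measurements per image
    x_hat = estimator(y1, A1)        # reconstruction from measurement 1
    return float(np.sum((A2 @ x_hat - y2) ** 2))  # mismatch on measurement 2

x = rng.standard_normal(n)           # image used only to simulate measurements
loss = unsupervised_loss(x)
print(loss)
```

In an actual training loop, this scalar would be backpropagated through a neural estimator; here the point is only that the objective is computable from the two measurements without the clean image.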

URL

https://arxiv.org/abs/1906.05775

PDF

https://arxiv.org/pdf/1906.05775.pdf

