
Deep Image Matting: A Comprehensive Survey

2023-04-10 15:48:55
Jizhizi Li, Jing Zhang, Dacheng Tao

Abstract

Image matting refers to extracting a precise alpha matte from natural images, and it plays a critical role in various downstream applications, such as image editing. Although the problem is ill-posed, traditional methods have attempted to solve it for decades. The emergence of deep learning has revolutionized the field of image matting and given rise to multiple new techniques, including automatic, interactive, and referring image matting. This paper presents a comprehensive review of recent advancements in image matting in the era of deep learning. We focus on two fundamental sub-tasks: auxiliary input-based image matting, which uses user-defined input to predict the alpha matte, and automatic image matting, which generates results without any manual intervention. We systematically review the existing methods for these two tasks according to their task settings and network structures and provide a summary of their advantages and disadvantages. Furthermore, we introduce the commonly used image matting datasets and evaluate the performance of representative matting methods both quantitatively and qualitatively. Finally, we discuss relevant applications of image matting and highlight existing challenges and potential opportunities for future research. We also maintain a public repository to track the rapid development of deep image matting at this https URL.
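The abstract calls matting ill-posed without saying why. A minimal sketch of the usual reasoning, assuming the standard compositing model I = alpha * F + (1 - alpha) * B (general background knowledge, not something this abstract states): each RGB pixel gives three equations but seven unknowns (alpha, three foreground channels, three background channels), so different combinations can explain the same observed color. The function name `composite` and the pixel values below are hypothetical and chosen only for illustration.

```python
def composite(alpha, fg, bg):
    """Blend one RGB pixel: I = alpha * F + (1 - alpha) * B, per channel.

    Assumes the standard compositing model; values are floats in [0, 1].
    """
    return tuple(alpha * f + (1.0 - alpha) * b for f, b in zip(fg, bg))


# Explanation A: half-transparent red foreground over a blue background.
pixel_a = composite(0.5, (1.0, 0.0, 0.0), (0.0, 0.0, 1.0))

# Explanation B: fully opaque purple foreground over a green background.
pixel_b = composite(1.0, (0.5, 0.0, 0.5), (0.0, 1.0, 0.0))

# Both explanations yield the same observed color, so alpha cannot be
# recovered from the pixel alone without extra priors or user input.
print(pixel_a == pixel_b)
```

This ambiguity is one way to read the abstract's split into auxiliary input-based methods, which take user-defined input as an extra constraint, and automatic methods, which must resolve it without manual intervention.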

Abstract (translated)

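Image matting refers to extracting precise alpha values from natural images, and it plays a key role in various downstream applications, such as image editing. Although it is an ill-posed problem, traditional methods have tried to solve it for decades. The arrival of deep learning has fundamentally changed the field of image matting and produced several new techniques, including automatic, interactive, and referring image matting. This paper comprehensively reviews the latest progress in image matting in the deep learning era. We focus on two basic tasks: auxiliary input-based image matting, which uses user-defined input to predict the alpha values, and automatic image matting, which generates results without any manual intervention. We systematically review the existing methods for these two tasks according to task setting and network structure, and summarize their strengths and weaknesses. In addition, we introduce the commonly used image matting datasets and evaluate the performance of representative matting methods, both quantitatively and qualitatively. Finally, we discuss applications related to image matting and highlight existing challenges and potential opportunities for future research. We also maintain a public repository to track the rapid progress of deep image matting.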

URL

https://arxiv.org/abs/2304.04672

PDF

https://arxiv.org/pdf/2304.04672.pdf

