Paper Reading AI Learner

Automatic Temporally Coherent Video Colorization

2019-04-21 01:50:22
Harrish Thasarathan, Kamyar Nazeri, Mehran Ebrahimi

Abstract

Greyscale image colorization for applications in image restoration has seen significant improvements in recent years, yet many learning-based techniques still struggle to effectively colorize sparse inputs. With the consistent growth of the anime industry, the ability to colorize sparse inputs such as line art can significantly reduce costs and redundant work for production studios by eliminating the in-between frame colorization process. Naively applying existing methods yields inconsistent colors between related frames, producing a flicker effect in the final video. To successfully automate key areas of large-scale anime production, line art colorization must be temporally consistent between frames. This paper proposes a method to colorize line art frames in an adversarial setting, improving existing image-to-image translation methods to create temporally coherent anime video. We show that by adding an extra condition to the generator and discriminator, we can effectively create temporally consistent video sequences from anime line art. Code and models are available at: https://github.com/Harry-Thasarathan/TCVC
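The key idea, conditioning both the generator and the discriminator on extra temporal information, can be sketched as below. This is a minimal illustration, not the authors' implementation (see the linked TCVC repository for that); it assumes the extra condition is the previously colorized frame concatenated channel-wise with the line art, and all layer sizes and names are illustrative.

import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    """Maps (line art, previous color frame) -> current color frame."""
    def __init__(self):
        super().__init__()
        # 1-channel line art + 3-channel previous frame = 4 input channels.
        self.net = nn.Sequential(
            nn.Conv2d(4, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, line_art, prev_frame):
        # The extra temporal condition enters by channel-wise concatenation.
        return self.net(torch.cat([line_art, prev_frame], dim=1))

class ConditionalDiscriminator(nn.Module):
    """Scores a color frame given the same (line art, previous frame) condition."""
    def __init__(self):
        super().__init__()
        # 4 condition channels + 3 real/generated channels = 7 input channels.
        self.net = nn.Sequential(
            nn.Conv2d(7, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(128, 1, 4, stride=1, padding=1),  # patch-level real/fake logits
        )

    def forward(self, line_art, prev_frame, color_frame):
        return self.net(torch.cat([line_art, prev_frame, color_frame], dim=1))

# Rolling a sequence forward: each output becomes the next frame's condition,
# which is what encourages temporally consistent colorization.
G = ConditionalGenerator()
prev_frame = torch.zeros(1, 3, 256, 256)          # e.g. a colorized keyframe
for line_art in [torch.randn(1, 1, 256, 256) for _ in range(3)]:
    prev_frame = G(line_art, prev_frame)          # (1, 3, 256, 256)

During adversarial training, the discriminator would receive the same (line art, previous frame) condition together with either a ground-truth or a generated color frame, so that temporal coherence is judged rather than each frame in isolation.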

URL

https://arxiv.org/abs/1904.09527

PDF

https://arxiv.org/pdf/1904.09527.pdf

