Learning Linear Transformations for Fast Arbitrary Style Transfer

2018-08-14 05:45:20
Xueting Li, Sifei Liu, Jan Kautz, Ming-Hsuan Yang

Abstract

Given a random pair of images, an arbitrary style transfer method extracts the feel from the reference image to synthesize an output based on the look of the other content image. Recent arbitrary style transfer methods transfer second-order statistics from the reference image onto the content image via a multiplication between the content image features and a transformation matrix, which is computed from the features with a pre-determined algorithm. These algorithms either require computationally expensive operations or fail to model the feature covariance and produce artifacts in the synthesized images. Generalizing from these methods, in this work we derive the form of the transformation matrix theoretically and present an arbitrary style transfer approach that learns the transformation matrix with a feed-forward network. Our algorithm is highly efficient yet allows a flexible combination of multi-level styles while preserving content affinity during the style transfer process. We demonstrate the effectiveness of our approach on four tasks: artistic style transfer, video style transfer, photo-realistic style transfer, and domain adaptation, including comparisons with state-of-the-art methods.
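
To make the "multiplication between content image features and a transformation matrix" concrete, the sketch below computes such a matrix with a closed-form whitening-and-coloring recipe, one example of the pre-determined algorithms the abstract contrasts with. It is not the learned feed-forward network proposed in the paper. The function names, the (C, N) feature layout, and the `eps` floor are assumptions made for illustration.

```python
import numpy as np


def sym_matrix_power(cov, power, eps=1e-5):
    """Raise a symmetric positive semi-definite matrix to a real power.

    Eigenvalues are floored at `eps` (an assumed value) so that negative
    powers stay finite when the covariance is nearly singular.
    """
    vals, vecs = np.linalg.eigh(cov)
    vals = np.clip(vals, eps, None)
    return (vecs * vals ** power) @ vecs.T


def linear_style_transform(content_feat, style_feat, eps=1e-5):
    """Match the mean and covariance of content features to style features.

    content_feat: array of shape (C, N_c), a flattened content feature map
    style_feat:   array of shape (C, N_s), a flattened style feature map
    Returns the transformed features (C, N_c) and the C x C matrix T.
    """
    mu_c = content_feat.mean(axis=1, keepdims=True)
    mu_s = style_feat.mean(axis=1, keepdims=True)
    fc = content_feat - mu_c
    fs = style_feat - mu_s

    # Second-order statistics: channel-by-channel covariance matrices.
    cov_c = fc @ fc.T / (fc.shape[1] - 1)
    cov_s = fs @ fs.T / (fs.shape[1] - 1)

    # Whiten with cov_c^(-1/2), then color with cov_s^(1/2).
    T = sym_matrix_power(cov_s, 0.5, eps) @ sym_matrix_power(cov_c, -0.5, eps)

    # The transfer itself is a single matrix multiplication plus a mean shift.
    return T @ fc + mu_s, T
```

The two eigendecompositions are the kind of expensive operation the abstract mentions. Once `T` is known, the transfer reduces to one matrix product, which is why predicting `T` directly with a network can be fast.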


URL

https://arxiv.org/abs/1808.04537

PDF

https://arxiv.org/pdf/1808.04537.pdf

