Paper Reading AI Learner

Smooth-Swap: A Simple Enhancement for Face-Swapping with Smoothness

2021-12-11 03:26:32
Jiseob Kim, Jihoon Lee, Byoung-Tak Zhang

Abstract

In recent years, face-swapping models have progressed in generation quality and drawn attention for their applications in privacy protection and entertainment. However, their complex architectures and loss functions often require careful tuning for successful training. In this paper, we propose a new face-swapping model called `Smooth-Swap', which focuses on deriving the smoothness of the identity embedding instead of employing complex handcrafted designs. We postulate that the gist of the difficulty in face-swapping is unstable gradients and it can be resolved by a smooth identity embedder. Smooth-swap adopts an embedder trained using supervised contrastive learning, where we find its improved smoothness allows faster and stable training even with a simple U-Net-based generator and three basic loss functions. Extensive experiments on face-swapping benchmarks (FFHQ, FaceForensics++) and face images in the wild show that our model is also quantitatively and qualitatively comparable or even superior to existing methods in terms of identity change.

Abstract (translated)

URL

https://arxiv.org/abs/2112.05907

PDF

https://arxiv.org/pdf/2112.05907.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot