Temporally Coherent Person Matting Trained on Fake-Motion Dataset

2021-09-10 12:53:11

Ivan Molodetskikh, Mikhail Erofeev, Andrey Moskalenko, Dmitry Vatolin

arXiv_CV

arXiv_CV Segmentation RNN CNN Pose Optical_Flow

Abstract

We propose a novel neural-network-based method for matting videos that depict people; it requires no additional user input such as trimaps. Our architecture achieves temporal stability of the resulting alpha mattes by using motion-estimation-based smoothing of image-segmentation algorithm outputs, combined with convolutional-LSTM modules on U-Net skip connections. We also propose a fake-motion algorithm that generates training clips for the video-matting network given photos with ground-truth alpha mattes and background videos. We apply random motion to the photos and their mattes to simulate the movement one would find in real videos, and we composite the result with the background clips. This approach lets us train a deep neural network that operates on videos in the absence of a large annotated video dataset, and it provides ground-truth foreground optical flow for the training clips, which can be used in loss functions.
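
The fake-motion idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the motion model, so this sketch assumes a random walk of pure integer translations, and the names `shift`, `fake_motion_clip`, and `max_step` are hypothetical. Each frame is composited with the standard matting equation I = alpha * F + (1 - alpha) * B, and the known per-frame translation serves as the ground-truth foreground optical flow.

```python
import numpy as np


def shift(img, dx, dy):
    """Translate img by integer (dx, dy) pixels; uncovered areas are zero."""
    out = np.zeros_like(img)
    h, w = img.shape[:2]
    x0, x1 = max(0, -dx), min(w, w - dx)  # source columns that stay in frame
    y0, y1 = max(0, -dy), min(h, h - dy)  # source rows that stay in frame
    if x0 < x1 and y0 < y1:
        out[y0 + dy:y1 + dy, x0 + dx:x1 + dx] = img[y0:y1, x0:x1]
    return out


def fake_motion_clip(fg, alpha, bg_frames, max_step=2, rng=None):
    """Build a training clip from one photo, its matte, and background frames.

    ASSUMPTION: motion is a random walk of integer translations; the paper's
    actual motion model may differ (hypothetical simplification).
    Returns (frames, mattes, flows); flows[t] maps frame t to frame t + 1.
    """
    rng = np.random.default_rng() if rng is None else rng
    fg = np.asarray(fg, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    bg_frames = np.asarray(bg_frames, dtype=float)
    n = len(bg_frames)

    # Per-frame steps (dx, dy); the first frame keeps the photo in place.
    steps = rng.integers(-max_step, max_step + 1, size=(n, 2))
    steps[0] = 0
    offsets = np.cumsum(steps, axis=0)

    frames, mattes = [], []
    for t in range(n):
        dx, dy = int(offsets[t, 0]), int(offsets[t, 1])
        a = shift(alpha, dx, dy)
        f = shift(fg, dx, dy)
        # Matting equation: I = alpha * F + (1 - alpha) * B
        frames.append(a[..., None] * f + (1.0 - a[..., None]) * bg_frames[t])
        mattes.append(a)

    # Ground-truth foreground flow: the next step, defined where alpha > 0.
    flows = []
    for t in range(n - 1):
        flow = np.zeros(alpha.shape + (2,))
        flow[mattes[t] > 0] = steps[t + 1]
        flows.append(flow)

    return np.stack(frames), np.stack(mattes), np.stack(flows)
```

A call such as `fake_motion_clip(photo, matte, background_clip)` yields composited frames, shifted mattes, and per-pixel flow fields; the flow could then supervise a temporal-consistency loss. A more realistic generator would presumably use smoother, non-rigid motion, which this sketch leaves out.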

URL

https://arxiv.org/abs/2109.04843

PDF

https://arxiv.org/pdf/2109.04843.pdf