Paper Reading AI Learner

Open-World Pose Transfer via Sequential Test-Time Adaption

2023-03-20 09:01:23
Junyang Chen, Xiaoyu Xian, Zhijing Yang, Tianshui Chen, Yongyi Lu, Yukai Shi, Jinshan Pan, Liang Lin

Abstract

Pose transfer, which aims to transfer a given person into a specified posture, has recently attracted considerable attention. A typical pose transfer framework usually employs representative datasets to train a discriminative model, which often fails on out-of-distribution (OOD) instances. Recently, test-time adaption (TTA) has offered a feasible solution for OOD data by using a pre-trained model that learns essential features with self-supervision. However, those methods implicitly assume that all test distributions share a unified signal that can be learned directly. In open-world conditions, the pose transfer task raises various independent signals, namely OOD appearance and skeleton, which need to be extracted and handled separately. To address this point, we develop SEquential Test-time Adaption (SETA). In the test-time phase, SETA extracts and transfers external appearance texture by augmenting OOD data for self-supervised training. To make the non-Euclidean similarity among different postures explicit, SETA uses image representations derived from a person re-identification (Re-ID) model for similarity computation. By addressing implicit posture representations sequentially at test time, SETA greatly improves the generalization performance of current pose transfer models. In our experiments, we show for the first time that pose transfer can be applied to open-world applications, including TikTok reenactment and celebrity motion synthesis.

URL

https://arxiv.org/abs/2303.10945

PDF

https://arxiv.org/pdf/2303.10945.pdf
