Dual-Side Feature Fusion 3D Pose Transfer

2023-05-24 09:42:08
Jue Liu, Feipeng Da

Abstract

3D pose transfer removes the need for the additional inputs and correspondences required by traditional deformation transfer: given only a source mesh and a target mesh, the pose of the source mesh can be transferred to the target mesh. Some lightweight methods proposed in recent years consume less memory but produce spikes and distortions on some unseen poses, while others are costly to train because they include large matrix multiplications and adversarial networks. In addition, meshes with different numbers of vertices make pose transfer more difficult. In this work, we propose a Dual-Side Feature Fusion Pose Transfer Network to improve the pose transfer accuracy of lightweight methods. Our method takes the pose features as one of the side inputs to the decoding network and fuses them into the target mesh layer by layer at multiple scales. Our proposed Feature Fusion Adaptive Instance Normalization has two side input channels that fuse pose features and identity features as denormalization parameters, thereby enhancing the pose transfer capability of the network. Extensive experimental results show that our proposed method has stronger pose transfer capability than state-of-the-art methods while maintaining a lightweight network structure, and that it converges faster.
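
To make the idea of a two-sided normalization layer concrete, here is a minimal NumPy sketch of an AdaIN-style layer whose denormalization parameters come from both pose and identity features. The abstract does not give the layer's formula, so everything below is an assumption made for illustration rather than the authors' implementation: the function names (`instance_norm`, `dual_side_adain`), the max-pooling of source features into a global pose code, the fusion by concatenation, and the linear maps `w_gamma` and `w_beta` are all hypothetical.

```python
import numpy as np


def instance_norm(x, eps=1e-5):
    """Normalize each feature channel across the vertices of one mesh.

    x has shape (num_vertices, channels).
    """
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def dual_side_adain(x, pose_code, id_feat, w_gamma, w_beta):
    """Hypothetical dual-side AdaIN-style layer (illustration only).

    x         : (V, C)  decoder features on the target mesh
    pose_code : (Cp,)   global pose vector from the source mesh (assumed)
    id_feat   : (V, Ci) per-vertex identity features of the target mesh
    w_gamma   : (Cp + Ci, C) linear map producing the scale (assumed)
    w_beta    : (Cp + Ci, C) linear map producing the shift (assumed)
    """
    num_vertices = x.shape[0]
    # Repeat the global pose code for every target vertex.
    pose_tiled = np.broadcast_to(pose_code, (num_vertices, pose_code.shape[0]))
    # Fuse the two side inputs by concatenation (an assumption).
    side = np.concatenate([pose_tiled, id_feat], axis=1)
    gamma = side @ w_gamma  # per-vertex, per-channel scale offset
    beta = side @ w_beta    # per-vertex, per-channel shift
    # Denormalize: scale and shift the instance-normalized features.
    return instance_norm(x) * (1.0 + gamma) + beta


rng = np.random.default_rng(0)
V_src, V_tgt, C, Cp, Ci = 120, 100, 8, 16, 16

# Source and target meshes may have different vertex counts; pooling the
# source features over vertices gives a pose code of fixed size.
source_feat = rng.normal(size=(V_src, Cp))
pose_code = source_feat.max(axis=0)

x = rng.normal(size=(V_tgt, C))
id_feat = rng.normal(size=(V_tgt, Ci))
w_gamma = rng.normal(scale=0.1, size=(Cp + Ci, C))
w_beta = rng.normal(scale=0.1, size=(Cp + Ci, C))

out = dual_side_adain(x, pose_code, id_feat, w_gamma, w_beta)
print(out.shape)
```

In this sketch the scale is written as `1 + gamma`, so zero weights reduce the layer to plain instance normalization. Stacking such a layer at several decoder resolutions would correspond loosely to the layer-by-layer, multi-scale fusion the abstract describes.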

URL

https://arxiv.org/abs/2305.14951

PDF

https://arxiv.org/pdf/2305.14951.pdf

