Keypoint-Guided Optimal Transport

2023-03-23 08:35:56
Xiang Gu, Yucheng Yang, Wei Zeng, Jian Sun, Zongben Xu

Abstract

Existing Optimal Transport (OT) methods mainly derive the optimal transport plan (i.e., matching) by minimizing the transport cost or distance, which may produce incorrect matchings in some cases. In many applications, annotating a few matched keypoints across domains is feasible and imposes little annotation burden. It is therefore valuable to investigate how to leverage annotated keypoints to guide OT toward the correct matching. In this paper, we propose a novel KeyPoint-Guided model by ReLation preservation (KPG-RL) that searches for the optimal matching (i.e., transport plan) guided by the keypoints. To incorporate the keypoints into OT, we first propose a mask-based constraint on the transport plan that preserves the matching of keypoint pairs. Second, we propose to preserve the relation of each data point to the keypoints to guide the matching. The proposed KPG-RL model can be solved by Sinkhorn's algorithm and is applicable even when the distributions are supported in different spaces. We further apply the relation preservation constraint to the Kantorovich problem and the Gromov-Wasserstein model to impose keypoint guidance in them. We also extend the KPG-RL model to the partial OT setting. Moreover, we derive the dual formulation of the KPG-RL model, which is solved using deep learning techniques. Based on the transport plan learned from dual KPG-RL, we propose a novel manifold barycentric projection to transport source data to the target domain. We apply the proposed KPG-RL model to heterogeneous domain adaptation and image-to-image translation. Experiments verify the effectiveness of the proposed approach.
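
To make the two ingredients named in the abstract concrete (a mask that fixes the keypoint pairs, and a cost built from each point's relation to the keypoints), here is a minimal NumPy sketch. It is not the authors' implementation. Everything the abstract does not state is an assumption made for illustration: the relation is taken to be a softmax over negative squared distances to the keypoints, relations are compared with the Jensen-Shannon divergence, entropic regularization is used so that a plain Sinkhorn loop applies, and all function names and parameters (`tau`, `reg`, `n_iter`) are hypothetical.

```python
import numpy as np

def relation_to_keypoints(X, keypoints, tau=1.0):
    """Each row: softmax over negative squared distances to the keypoints (assumed form)."""
    d = ((X[:, None, :] - keypoints[None, :, :]) ** 2).sum(-1)
    logits = -d / tau
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)

def js_divergence(P, Q, eps=1e-12):
    """Pairwise Jensen-Shannon divergence between rows of P (m, k) and Q (n, k)."""
    P, Q = P[:, None, :], Q[None, :, :]
    mid = 0.5 * (P + Q)
    kl_pm = (P * (np.log(P + eps) - np.log(mid + eps))).sum(-1)
    kl_qm = (Q * (np.log(Q + eps) - np.log(mid + eps))).sum(-1)
    return 0.5 * (kl_pm + kl_qm)

def keypoint_mask(m, n, src_idx, tgt_idx):
    """Mask that allows a keypoint to be matched only to its annotated partner."""
    M = np.ones((m, n))
    M[src_idx, :] = 0.0
    M[:, tgt_idx] = 0.0
    M[src_idx, tgt_idx] = 1.0  # paired elementwise: src_idx[u] <-> tgt_idx[u]
    return M

def masked_sinkhorn(G, M, p, q, reg=0.1, n_iter=500):
    """Sinkhorn iterations on the masked kernel M * exp(-G / reg)."""
    K = M * np.exp(-G / reg)
    u = np.ones_like(p)
    for _ in range(n_iter):
        v = q / (K.T @ u)
        u = p / (K @ v)
    return u[:, None] * K * v[None, :]

# Toy data: source in R^2, target in R^3 (different spaces), 6 points each.
rng = np.random.default_rng(0)
X, Y = rng.normal(size=(6, 2)), rng.normal(size=(6, 3))
src_idx, tgt_idx = [0, 1], [2, 3]          # hypothetical annotated keypoint pairs
p = np.full(6, 1 / 6)
q = np.full(6, 1 / 6)

Rs = relation_to_keypoints(X, X[src_idx])  # relations computed inside each space
Rt = relation_to_keypoints(Y, Y[tgt_idx])
G = js_divergence(Rs, Rt)                  # relation-based guiding cost
M = keypoint_mask(6, 6, src_idx, tgt_idx)
plan = masked_sinkhorn(G, M, p, q)
```

Because relations are computed separately within each space, the cost never compares a source point to a target point directly, which is why the sketch runs even though `X` and `Y` have different dimensions. Note that the hard mask is only feasible when each keypoint pair carries equal mass on both sides, which the uniform marginals with equal sample counts guarantee here.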

URL

https://arxiv.org/abs/2303.13102

PDF

https://arxiv.org/pdf/2303.13102.pdf

