Graphonomy: Universal Human Parsing via Graph Transfer Learning

Abstract

Prior highly tuned human parsing models tend to fit each dataset in a specific domain or with discrepant label granularity, and can hardly be adapted to other human parsing tasks without extensive re-training. In this paper, we aim to learn a single universal human parsing model that can tackle all kinds of human parsing needs by unifying label annotations from different domains or at various levels of granularity. This poses many fundamental learning challenges, e.g., discovering underlying semantic structures among labels of different granularity, performing proper transfer learning across different image domains, and identifying and utilizing label redundancies across related tasks. To address these challenges, we propose a new universal human parsing agent, named "Graphonomy", which incorporates hierarchical graph transfer learning upon the conventional parsing network to encode the underlying label semantic structures and propagate relevant semantic information. In particular, Graphonomy first learns and propagates a compact high-level graph representation among the labels within one dataset via Intra-Graph Reasoning, and then transfers semantic information across multiple datasets via Inter-Graph Transfer. Various graph transfer dependencies (e.g., similarity, linguistic knowledge) between different datasets are analyzed and encoded to enhance graph transfer capability. By distilling the universal semantic graph representation to each specific task, Graphonomy is able to predict all levels of parsing labels in one system without piling up complexity. Experimental results show that Graphonomy achieves state-of-the-art results on three human parsing benchmarks as well as advantageous universal human parsing performance.
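
As a rough illustration of the two stages named in the abstract, the sketch below propagates features among the labels of one toy dataset (Intra-Graph Reasoning) and then maps them onto the labels of a second, coarser dataset (Inter-Graph Transfer). It is not the authors' implementation: the function names, the toy labels, the adjacency matrix, and the hand-written `fine_to_coarse` affinity matrix are all assumptions made for illustration, and the learned weights and nonlinearities of a real graph convolution are left out.

```python
def row_normalize(matrix):
    """Scale each row to sum to 1 (all-zero rows stay all zero)."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([v / total if total else 0.0 for v in row])
    return normalized


def matmul(a, b):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def intra_graph_reasoning(node_features, adjacency):
    """One propagation step among the labels of a single dataset.

    Adds self-loops, row-normalizes the adjacency, and averages each
    node's features with those of its neighbors. Learned weights and
    the nonlinearity of a real graph convolution are omitted.
    """
    n = len(adjacency)
    with_self_loops = [[adjacency[i][j] + (1.0 if i == j else 0.0)
                        for j in range(n)] for i in range(n)]
    return matmul(row_normalize(with_self_loops), node_features)


def inter_graph_transfer(source_features, transfer):
    """Map source-graph node features onto target-graph nodes.

    transfer[i][j] is a hypothetical affinity between target label i
    and source label j; rows are normalized before mixing.
    """
    return matmul(row_normalize(transfer), source_features)


# Toy source graph with fine labels: hat, hair, upper-clothes, pants.
fine_features = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [4.0, 0.0]]
fine_adjacency = [
    [0, 1, 0, 0],   # hat is linked to hair
    [1, 0, 1, 0],   # hair is linked to hat and upper-clothes
    [0, 1, 0, 1],   # upper-clothes is linked to hair and pants
    [0, 0, 1, 0],   # pants is linked to upper-clothes
]
# Toy target graph with coarse labels: head, body (assumed mapping).
fine_to_coarse = [
    [1, 1, 0, 0],   # head receives from hat and hair
    [0, 0, 1, 1],   # body receives from upper-clothes and pants
]

refined = intra_graph_reasoning(fine_features, fine_adjacency)
coarse_features = inter_graph_transfer(refined, fine_to_coarse)
```

In this toy setup the transfer matrix is fixed by hand; the abstract indicates that such dependencies can instead be derived from label similarity or linguistic knowledge.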

Abstract (translated)

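Previous highly tuned human parsing models tend to fit each dataset within a specific domain or with inconsistent label granularity, and are difficult to adapt to other human parsing tasks without extensive retraining. In this paper, our goal is to learn a single universal human parsing model that can meet all kinds of human parsing needs by unifying label annotations from different domains or at different levels of granularity. This raises many fundamental learning challenges, for example, discovering the underlying semantic structure among labels of different granularity, performing proper transfer learning across different image domains, and identifying and exploiting label redundancy across related tasks.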

URL

https://arxiv.org/abs/1904.04536

PDF

https://arxiv.org/pdf/1904.04536.pdf