Abstract
Human parsing has received considerable interest due to its wide range of potential applications. Nevertheless, it is still unclear how to develop an accurate human parsing system in an efficient and elegant way. In this paper, we identify several useful properties, including feature resolution, global context information and edge details, and perform rigorous analyses to reveal how to leverage them to benefit the human parsing task. Combining the advantages of these properties results in a simple yet effective Context Embedding with Edge Perceiving (CE2P) framework for single human parsing. Our CE2P is end-to-end trainable and can be easily adopted for multiple human parsing. Benefiting from the superiority of CE2P, we achieved first place on all three human parsing benchmarks. Without any bells and whistles, we achieved 56.50\% (mIoU), 45.31\% (mean $AP^r$) and 33.34\% ($AP^p_{0.5}$) on LIP, CIHP and MHP v2.0, outperforming the state of the art by more than 2.06\%, 3.81\% and 1.87\%, respectively. We hope our CE2P will serve as a solid baseline and help ease future research in single/multiple human parsing. Code has been made available at \url{https://github.com/liutinglt/CE2P}.
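For readers unfamiliar with the mIoU metric reported on LIP, the following is a minimal sketch of how mean intersection-over-union is commonly computed for a label map. It is not taken from the CE2P code: the function name `mean_iou`, the flat-list input format and the toy labels are all assumptions made for illustration, and benchmark evaluations typically accumulate counts over the whole dataset rather than a single image.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that appear in pred or target.

    pred and target are flat sequences of integer class labels of equal
    length (a hypothetical input format chosen for this sketch).
    """
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes (illustrative labels only).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.4f}")
```

In the toy example the per-class IoU values are 1/3, 2/3 and 1/2, which average to 0.5.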
URL
https://arxiv.org/abs/1809.05996