Compositionally Generalizable 3D Structure Prediction

2020-12-04 09:53:14

Songfang Han, Jiayuan Gu, Kaichun Mo, Li Yi, Siyu Hu, Xuejin Chen, Hao Su

arXiv_CV

arXiv_CV Deep_Learning Relation Prediction Pose 3D Reconstruction

Abstract

Single-image 3D shape reconstruction is an important and long-standing problem in computer vision. A plethora of existing works is constantly pushing the state-of-the-art performance in the deep learning era. However, there remains a much more difficult and largely under-explored issue: how to generalize the learned skills to novel, unseen object categories whose shape geometry distributions are very different. In this paper, we bring in the concept of compositional generalizability and propose a novel framework that factorizes the 3D shape reconstruction problem into proper sub-problems, each of which is tackled by a carefully designed neural sub-module with a generalizability guarantee. The intuition behind our formulation is that object parts (slates and cylindrical parts), their relationships (adjacency, equal-length, and parallelism), and shape substructures (T-junctions and a symmetric group of parts) are mostly shared across object categories, even though the object geometry may look very different (e.g., chairs and cabinets). Experiments on PartNet show that we achieve superior performance to baseline methods, which validates our problem factorization and network designs.

URL

https://arxiv.org/abs/2012.02493

PDF

https://arxiv.org/pdf/2012.02493.pdf