Bi-Directional Domain Translation for Zero-Shot Sketch-Based Image Retrieval

2019-11-29 17:43:45

Jiangtong Li, Zhixin Ling, Li Niu, Liqing Zhang

arXiv_CV

Abstract

The goal of Sketch-Based Image Retrieval (SBIR) is to use free-hand sketches to retrieve images of the same category from a natural image gallery. However, SBIR requires all categories to be seen during training, which cannot be guaranteed in real-world applications. We therefore investigate the more challenging Zero-Shot SBIR (ZS-SBIR) setting, in which test categories do not appear in the training stage. Traditional SBIR methods tend to perform category-based retrieval and do not generalize well from seen categories to unseen ones. In contrast, we disentangle image features into structure features and appearance features to facilitate structure-based retrieval. To assist feature disentanglement and take full advantage of the disentangled information, we propose a Bi-directional Domain Translation (BDT) framework for ZS-SBIR, in which the image domain and the sketch domain can be translated to each other through disentangled structure and appearance features. Finally, we perform retrieval in both the structure feature space and the image feature space. Extensive experiments demonstrate that our proposed approach outperforms state-of-the-art approaches by about 8% on the Sketchy dataset and by over 5% on the TU-Berlin dataset.
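
The abstract states that retrieval is performed in both the structure feature space and the image feature space, but it does not say how the two are combined. The following is a minimal sketch of one plausible reading, not the authors' implementation: it assumes the sketch query has already been mapped into both spaces, that cosine distance is used in each space, and that the two distances are fused by a weighted sum. The function names, the `weight` parameter, and the toy feature vectors are all hypothetical.

```python
import math


def cosine_distance(u, v):
    """Cosine distance (1 - cosine similarity) between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def rank_gallery(query_struct, query_image, gallery_struct, gallery_image, weight=0.5):
    """Rank gallery items by a weighted sum of distances in two feature spaces.

    Assumption: `weight` balances the structure-space distance against the
    image-space distance; the abstract does not specify the fusion rule.
    Returns gallery indices ordered from best to worst match.
    """
    fused = []
    for idx, (g_struct, g_image) in enumerate(zip(gallery_struct, gallery_image)):
        d_struct = cosine_distance(query_struct, g_struct)
        d_image = cosine_distance(query_image, g_image)
        fused.append((weight * d_struct + (1.0 - weight) * d_image, idx))
    # Smaller fused distance means a better match.
    return [idx for _, idx in sorted(fused)]


# Toy example with hypothetical 2-D features for three gallery images.
gallery_struct = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
gallery_image = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
ranking = rank_gallery([0.0, 1.0], [1.0, 0.0], gallery_struct, gallery_image)
```

Setting `weight=1.0` in this sketch ranks by structure features alone, and `weight=0.0` ranks by image features alone, which makes it easy to compare the contribution of each space on a given query.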

URL

https://arxiv.org/abs/1911.13251

PDF

https://arxiv.org/pdf/1911.13251.pdf