Deep Learning Approaches for Image Retrieval and Pattern Spotting in Ancient Documents

2019-07-22 16:27:19

Kelly Lais Wiggers, Alceu de Souza Britto Junior, Alessandro Lameiras Koerich, Laurent Heutte, Luiz Eduardo Soares de Oliveira

arXiv_CV

Abstract

This paper describes two deep learning approaches for content-based image retrieval and pattern spotting in document images. The first approach copes with the lack of training data by using a pre-trained CNN model, which is fine-tuned to produce a compact yet discriminative representation of queries and image candidates. The second approach uses a Siamese Convolutional Neural Network, trained on a previously prepared subset of image pairs from the ImageNet dataset, to provide similarity-based feature maps. In both methods, the learned representation scheme considers feature maps of different sizes, which are evaluated in terms of retrieval performance. Experiments following a robust protocol on two public datasets (Tobacco-800 and DocExplore) show that the proposed methods compare favorably against state-of-the-art document image retrieval and pattern spotting methods.
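
As a rough illustration of the retrieval step that both approaches share, the sketch below ranks candidate regions by how similar their feature vectors are to the query's. It is only a minimal sketch under stated assumptions: the abstract does not say which similarity measure is used, so cosine similarity is an assumption here, and the function names (`cosine_similarity`, `rank_candidates`) and the toy vectors are hypothetical stand-ins for representations that a fine-tuned CNN or Siamese network would produce.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Guard against zero vectors, for which the measure is undefined.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(query_vec, candidates):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in candidates.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy feature vectors (hypothetical); real ones would come from the network.
query = [1.0, 0.0, 1.0]
candidates = {
    "region_a": [1.0, 0.0, 1.0],
    "region_b": [0.0, 1.0, 0.0],
    "region_c": [1.0, 1.0, 1.0],
}
ranking = rank_candidates(query, candidates)
```

In this toy setting `region_a` matches the query exactly and ranks first, while `region_b` shares no active feature with the query and ranks last. A compact representation matters here because every candidate region has to be compared against the query.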

URL

https://arxiv.org/abs/1907.09404

PDF

https://arxiv.org/pdf/1907.09404.pdf