Technical Report on Web-based Visual Corpus Construction for Visual Document Understanding

2022-11-07 01:24:43

Donghyun Kim, Teakgyu Hong, Moonbin Yim, Yoonsik Kim, Geewook Kim

arXiv_AI

Abstract

We present a dataset generator engine named Web-based Visual Corpus Builder (Webvicob). Webvicob can readily construct a large-scale visual corpus (i.e., images with text annotations) from a raw Wikipedia HTML dump. In this report, we validate that Webvicob-generated data covers a wide range of context and knowledge and helps practitioners build a powerful Visual Document Understanding (VDU) backbone. The proposed engine is publicly available.
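
To make the phrase "images with text annotations" concrete, the following is a minimal sketch of the text-annotation half only, using the Python standard library. It is not Webvicob's implementation. The names `TextAnnotation`, `VisibleTextCollector`, and `collect_annotations` are hypothetical, and the `bbox` field is left empty because this sketch does not render the page, so no pixel coordinates are available.

```python
from dataclasses import dataclass
from html.parser import HTMLParser
from typing import List, Optional, Tuple


@dataclass
class TextAnnotation:
    """One visible text unit (hypothetical schema, not Webvicob's)."""
    text: str
    tag: str  # innermost enclosing HTML tag
    # Pixel box would come from a rendering step, which this sketch omits.
    bbox: Optional[Tuple[int, int, int, int]] = None


class VisibleTextCollector(HTMLParser):
    """Collects text nodes, skipping script and style contents."""

    SKIPPED = {"script", "style"}
    # Void elements never get an end tag, so they are not pushed on the stack.
    VOID = {"br", "img", "hr", "input", "meta", "link", "area", "base",
            "col", "embed", "source", "track", "wbr"}

    def __init__(self) -> None:
        super().__init__()
        self.stack: List[str] = []
        self.annotations: List[TextAnnotation] = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray end tags.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = " ".join(data.split())  # normalize whitespace
        if not text or any(t in self.SKIPPED for t in self.stack):
            return
        tag = self.stack[-1] if self.stack else ""
        self.annotations.append(TextAnnotation(text=text, tag=tag))


def collect_annotations(html: str) -> List[TextAnnotation]:
    """Return the visible text units of an HTML string in document order."""
    parser = VisibleTextCollector()
    parser.feed(html)
    parser.close()
    return parser.annotations


sample = "<h1>Title</h1><p>Hello <b>world</b><br>again</p><script>var x = 1;</script>"
for annotation in collect_annotations(sample):
    print(annotation.tag, repr(annotation.text))
```

A full corpus builder would additionally need to render each page to an image and attach a bounding box to every text unit. That step depends on a browser or layout engine and is outside the scope of this sketch.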

URL

https://arxiv.org/abs/2211.03256

PDF

https://arxiv.org/pdf/2211.03256.pdf