Abstract
In this work, we explore massive pre-training on synthetic word images to enhance performance on four benchmark downstream handwriting analysis tasks. To this end, we build a large synthetic dataset of word images rendered in several handwriting fonts, which offers a complete supervision signal. We use it to train a simple convolutional neural network (ConvNet) with a fully supervised objective. The vector representations of the images obtained from the pre-trained ConvNet can then be considered encodings of the handwriting style. We exploit such representations for Writer Retrieval, Writer Identification, Writer Verification, and Writer Classification, and demonstrate that our pre-training strategy allows extracting rich representations of the writers' style that enable these tasks with results competitive with task-specific state-of-the-art approaches.
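The pipeline the abstract describes — render synthetic word images, encode them into style vectors with a pre-trained ConvNet, and compare vectors for retrieval — can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: PIL's default font stands in for the handwriting fonts, and a fixed random projection stands in for the pre-trained ConvNet encoder.

```python
# Hypothetical sketch of the described pipeline: render synthetic word
# images, map them to fixed-size style vectors, rank by cosine similarity
# (as one would for Writer Retrieval). The "encoder" here is a random
# projection standing in for the pre-trained ConvNet of the paper.
import numpy as np
from PIL import Image, ImageDraw, ImageFont

def render_word(word, size=(96, 32)):
    """Render a word to a grayscale image array (a synthetic sample)."""
    img = Image.new("L", size, color=255)
    draw = ImageDraw.Draw(img)
    # PIL's built-in bitmap font stands in for a handwriting font here.
    draw.text((4, 8), word, fill=0, font=ImageFont.load_default())
    return np.asarray(img, dtype=np.float32) / 255.0

rng = np.random.default_rng(0)
proj = rng.normal(size=(96 * 32, 64))  # stand-in for the ConvNet encoder

def embed(img):
    """Map an image to a 64-d style representation (L2-normalised)."""
    v = img.reshape(-1) @ proj
    return v / np.linalg.norm(v)

# Build a small gallery of embedded samples and retrieve by similarity.
gallery = {w: embed(render_word(w)) for w in ["hello", "world", "style"]}
query = embed(render_word("hello"))
ranking = sorted(gallery, key=lambda w: -float(query @ gallery[w]))
```

With real data, the gallery would hold embeddings of samples from known writers, and ranking query samples against it implements retrieval; thresholding the same similarity gives verification, and a classifier on the embeddings gives identification.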
URL
https://arxiv.org/abs/2304.01842