Abstract
This paper presents the UniMER dataset, which supports the first study of Mathematical Expression Recognition (MER) in complex real-world scenarios. The dataset consists of UniMER-1M, a large-scale training set whose one million instances offer unprecedented scale and diversity, and UniMER-Test, a meticulously designed test set that reflects the diverse range of formula distributions found in real-world scenarios. Together, they enable both the training of a robust, high-accuracy MER model and a comprehensive evaluation of model performance. We also introduce the Universal Mathematical Expression Recognition Network (UniMERNet), a framework designed to improve MER in practical scenarios. UniMERNet incorporates a Length-Aware Module that processes formulas of varied lengths efficiently, enabling the model to handle complex mathematical expressions with greater accuracy. In addition, UniMERNet uses UniMER-1M data and image augmentation techniques to improve robustness under different noise conditions. Extensive experiments show that UniMERNet outperforms existing MER models across various scenarios, setting a new benchmark for recognition quality in real-world applications. The dataset and model are publicly available.
URL
https://arxiv.org/abs/2404.15254