Abstract
Real-world applications of computer vision in the humanities require algorithms to be robust against artistic abstraction, peripheral objects, and subtle differences between fine-grained target classes. Existing datasets provide instance-level annotations on artworks but are generally biased towards the image centre and offer only a limited set of detailed object classes. The proposed ODOR dataset fills this gap, offering 38,116 object-level annotations across 4,712 images, spanning an extensive set of 139 fine-grained categories. Through a statistical analysis, we showcase challenging dataset properties, such as a detailed set of categories, dense and overlapping objects, and spatial distribution over the whole image canvas. Furthermore, we provide an extensive baseline analysis for object detection models and highlight the challenging properties of the dataset through a set of secondary studies. Intended to inspire further research on artwork object detection and broader visual cultural heritage studies, the dataset challenges researchers to explore the intersection of object recognition and smell perception.
URL
https://arxiv.org/abs/2507.08384