Abstract
The Virginia Tech University Libraries (VTUL) Digital Library Platform (DLP) hosts digital collections that give our users access to a wide variety of documents of historical and cultural importance. These collections are not only of academic importance but also offer our users a glimpse of local historical events. Our DLP contains collections whose digital objects feature complex layouts, faded imagery, and hard-to-read handwritten text, which makes providing online access to these materials challenging. To address these issues, we integrate AI into our DLP workflow and convert the text in the digital objects into a machine-readable format. To enhance the user experience with our historical collections, we use custom AI agents for handwriting recognition and text extraction, and large language models (LLMs) for summarization. This poster highlights three collections focusing on handwritten letters, newspapers, and digitized topographic maps. We discuss the challenges with each collection and detail our approaches to addressing them. Our proposed methods aim to enhance the user experience by making the contents of these collections easier to search and navigate.
URL
https://arxiv.org/abs/2411.17600