Deep Learning for Handwritten Text Recognition in Historical Documents

Alrasheed, Nouf

Date

2023-08

Authors

Alrasheed, Nouf

Abstract

Recognizing handwritten text in historical documents with deep learning remains a significant challenge in document analysis and recognition. Contributing factors include the variability and complexity of handwriting styles, poor document quality, and the presence of rare or ancient words and phrases. The scarcity of labeled datasets for training deep learning models compounds the difficulty. This dissertation proposes deep learning-based approaches that address these challenges by recognizing and retrieving information from handwritten seventeenth-century Spanish American notary records. The approaches cover three components: 1) Spanish character recognition in the notary records, 2) an end-to-end object detection model for word detection and recognition in a single notary's handwriting, and 3) a novel few-shot learning approach for word recognition across multiple notaries' handwriting. We also aim to reduce the difficulty and cost of generating the extensive training data that deep learning models require. To evaluate the proposed approaches, we conducted experiments on a dataset of 141 double-page Spanish American notary records containing 40,000 Spanish words labeled by domain experts. Our approach outperformed existing word recognition solutions, achieving higher accuracy and F1 score. This work contributes to making handwritten seventeenth-century Spanish American notary records more accessible to researchers without extensive paleography training.

Keywords

Deep Learning

URI

https://hdl.handle.net/20.500.14154/68859

Collections

SACM - United States of America
