SACM - United Kingdom
Permanent URI for this collection: https://drepo.sdl.edu.sa/handle/20.500.14154/9667
Item (Restricted)
Rasm: Arabic Handwritten Character Recognition: A Data Quality Approach
(University of Essex, 2024) Alghamdi, Tawfeeq; Doctor, Faiyaz

Arabic handwritten character recognition (AHCR) is challenging because of the complexity of the Arabic script and the variability of handwriting, especially among children. We present 'Rasm', a data quality approach that substantially improves AHCR results through a combination of preprocessing, augmentation, and filtering techniques. We use the Hijja dataset, which consists of samples written by children aged 7 to 12. By applying advanced preprocessing steps and label-specific targeted augmentation, we raise the performance of a CNN from 85% to 96%. The key contribution of this work is to highlight the importance of data quality for handwriting recognition: despite recent advances in deep learning, our results show that data quality plays a critical role in this task. We believe this work has important implications for improving AHCR systems in educational contexts, where handwriting variability is high. Future work can extend the proposed data-centric techniques to other scripts, languages, and recognition tasks, to further advance the field of optical character recognition.
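
The abstract names label-specific targeted augmentation but does not describe how it is done. The sketch below shows one way such a step could look, and it is not the authors' method: the rule in `copies_per_label`, the `target` and `max_copies` values, and the choice of small pixel shifts as the augmentation are all assumptions made for illustration. The idea is only that labels the classifier handles poorly receive more augmented samples than labels it already handles well.

```python
import numpy as np

def copies_per_label(per_label_accuracy, target=0.95, max_copies=4):
    """Hypothetical rule: labels further below `target` get more augmented copies."""
    plan = {}
    for label, acc in per_label_accuracy.items():
        gap = max(0.0, target - acc)
        # Scale the accuracy gap to 0..max_copies; labels at or above target get 0.
        plan[label] = min(max_copies, int(np.ceil(gap / target * max_copies)))
    return plan

def shift_image(image, dx, dy):
    """Translate a 2-D image by (dx, dy) pixels, filling vacated pixels with 0."""
    out = np.zeros_like(image)
    h, w = image.shape
    src = image[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    out[max(0, dy):max(0, dy) + src.shape[0],
        max(0, dx):max(0, dx) + src.shape[1]] = src
    return out

def augment_targeted(images, labels, plan, max_shift=2, seed=0):
    """Append randomly shifted copies of each image, as many as `plan` asks for its label."""
    rng = np.random.default_rng(seed)
    new_images, new_labels = list(images), list(labels)
    for image, label in zip(images, labels):
        for _ in range(plan.get(label, 0)):
            dx, dy = rng.integers(-max_shift, max_shift + 1, size=2)
            new_images.append(shift_image(image, int(dx), int(dy)))
            new_labels.append(label)
    return np.stack(new_images), np.array(new_labels)
```

In use, `per_label_accuracy` would come from evaluating a baseline model on a validation split, and the returned arrays would replace the original training set. A real pipeline would likely use richer transformations than pixel shifts, and the scaling rule would need tuning against held-out data.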