Real-word error detection and correction in Arabic text

dc.contributor.author: Majed Mohammed Abdulqader Al-Jefri
dc.date: 2013
dc.date.accessioned: 2022-05-18T06:03:41Z
dc.date.available: 2022-05-18T06:03:41Z
dc.degree.department: College of Computer Science and Engineering
dc.degree.grantor: King Fahd University of Petroleum and Minerals
dc.description.abstract: Spell checking is the process of finding misspelled words and, where possible, correcting them. Spell checkers are important tools for document preparation, word processing, search, and document retrieval. Detecting and correcting misspelled words in text is challenging. Most modern commercial spell checkers operate at the word level and can detect and correct non-word errors, but few of them handle real-word errors, which remain one of the challenging problems in text processing. Moreover, most techniques proposed so far target languages written in Latin script; Arabic has received little attention, especially for real-word errors. In this thesis we address the problem of real-word errors using context words and n-gram language models. We implemented an unsupervised model for real-word error detection and correction in Arabic text that uses n-gram language models. We also implemented supervised models that use confusion sets to detect and correct real-word errors. In the supervised models, a window-based technique estimates the probabilities of the context words of each confusion set. N-gram language models are also used to detect real-word errors by examining sequences of n words, and the same language models are used to choose the best correction for each detected error. Experimental results for the prototypes showed promising correction accuracy. However, our results cannot be compared with other published work because no benchmark dataset exists for real-word error correction in Arabic text. Conclusions and future directions are also presented.
dc.identifier.other: 6017
dc.identifier.uri: https://drepo.sdl.edu.sa/handle/20.500.14154/2151
dc.language.iso: en
dc.publisher: Saudi Digital Library
dc.thesis.level: Master
dc.thesis.source: King Fahd University of Petroleum and Minerals
dc.title: Real-word error detection and correction in Arabic text
dc.type: Thesis
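
The window-based confusion-set approach mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the naive Bayes scoring with add-one smoothing, the window size of 2, and the English toy sentences are all assumptions chosen to keep the example short and readable. The idea shown is only the general one: count the words that appear near each member of a confusion set in training text, then pick the member whose context counts best match the sentence at hand.

```python
import math
from collections import Counter, defaultdict


def train_context_model(sentences, confusion_set, window=2):
    """Count context words seen within `window` tokens of each confusion-set word."""
    word_counts = Counter()
    context_counts = defaultdict(Counter)
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok in confusion_set:
                word_counts[tok] += 1
                lo, hi = max(0, i - window), i + window + 1
                for ctx in tokens[lo:i] + tokens[i + 1:hi]:
                    context_counts[tok][ctx] += 1
    return word_counts, context_counts


def best_candidate(tokens, index, confusion_set, model, window=2):
    """Return the confusion-set word that best fits the context around `index`."""
    word_counts, context_counts = model
    vocab = {c for counts in context_counts.values() for c in counts}
    total = sum(word_counts.values())
    lo, hi = max(0, index - window), index + window + 1
    context = tokens[lo:index] + tokens[index + 1:hi]
    scores = {}
    for cand in sorted(confusion_set):
        # Log prior of the candidate, with add-one smoothing (an assumption).
        score = math.log((word_counts[cand] + 1) / (total + len(confusion_set)))
        # One extra slot in the denominator reserves mass for unseen context words.
        denom = sum(context_counts[cand].values()) + len(vocab) + 1
        for ctx in context:
            score += math.log((context_counts[cand][ctx] + 1) / denom)
        scores[cand] = score
    return max(scores, key=scores.get)


# Hypothetical toy data; a real system would train on a large Arabic corpus.
training = [
    "we signed a peace treaty".split(),
    "the peace talks ended the war".split(),
    "she ate a piece of cake".split(),
    "he cut a piece of bread".split(),
]
confusion_set = {"peace", "piece"}
model = train_context_model(training, confusion_set)

sentence = "they want a peace of cake".split()
suggestion = best_candidate(sentence, 3, confusion_set, model)
# A real-word error is flagged when the suggestion differs from the written word.
is_error = suggestion != sentence[3]
```

In this sketch a word is treated as a suspected real-word error whenever the best-scoring member of its confusion set differs from the word actually written. An n-gram language model could be used in the same role by scoring the whole word sequence for each candidate instead of scoring an unordered context window.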

Copyright owned by the Saudi Digital Library (SDL) © 2025