AI-GENERATED TEXT DETECTOR FOR ARABIC LANGUAGE

dc.contributor.advisor: Elleithy, Khaled
dc.contributor.author: Alshammari, Hamed
dc.date.accessioned: 2024-08-28T13:01:21Z
dc.date.available: 2024-08-28T13:01:21Z
dc.date.issued: 2024-08
dc.description.abstract: The rise of AI-generated texts (AIGTs), particularly with the arrival of advanced language models like ChatGPT, has spurred a growing need for effective detection methods. While these models offer various beneficial applications, their potential for misuse, such as facilitating plagiarism and generating fake textual content, raises significant ethical concerns. These concerns have sparked extensive academic research into detecting AIGTs, and efforts to mitigate misuse include commercial platforms such as Turnitin and GPTZero. Notably, most evaluations of current AI detectors have focused on English or other languages written in Latin-derived scripts. The effectiveness of existing AI detectors is notably hampered when processing Arabic texts because of the unique challenges posed by the language's diacritics, which are small marks placed above or below letters to indicate pronunciation. These diacritics can cause human-written texts (HWTs) to be misclassified as AIGTs. Recognizing these limitations, this research first established a performance baseline by evaluating existing detection systems, such as OpenAI Text Classifier and GPTZero, on a newly developed benchmark dataset of Arabic texts containing both HWTs and AIGTs. This evaluation highlighted critical weaknesses in the existing detectors' ability to handle diacritics and to differentiate between HWTs and AIGTs, particularly in essay-length texts. To address these limitations, this research introduces a novel AI text detector designed explicitly for Arabic, leveraging transformer-based pre-trained models trained on several novel datasets. The resulting detector significantly outperforms the existing detection models in accurately identifying both HWTs and AIGTs in Arabic. Although the research focused on Arabic because of its unique writing challenges, the detector architecture is adaptable to other languages.
dc.format.extent: 157
dc.identifier.uri: https://hdl.handle.net/20.500.14154/72971
dc.language.iso: en_US
dc.publisher: University of Bridgeport
dc.subject: AI
dc.subject: AI-GENERATED TEXTS
dc.subject: ARABIC DETECTOR
dc.subject: AI DETECTOR
dc.subject: SYNTHETIC TEXTS DETECTOR
dc.title: AI-GENERATED TEXT DETECTOR FOR ARABIC LANGUAGE
dc.type: Research Papers
sdl.degree.department: Computer science
sdl.degree.discipline: Artificial Intelligence
sdl.degree.grantor: University of Bridgeport
sdl.degree.name: Doctor of Philosophy
