Automatic Essay Scoring in Arabic: Development, Evaluation, and Advanced Techniques

dc.contributor.advisor: Simpson, Edwin
dc.contributor.author: Ghazawi, Rayed
dc.date.accessioned: 2025-10-05T04:47:15Z
dc.date.issued: 2025
dc.description.abstract: Automated Essay Scoring (AES) has advanced considerably due to recent progress in natural language processing (NLP). This thesis examines key challenges in AES, with a particular focus on the Arabic language, and proposes practical approaches informed by both computational techniques and educational theory. First, the research investigates how the formulation of essay questions affects the accuracy of automated scoring systems. A set of question-design criteria, derived from educational principles, is introduced and empirically tested. Experiments show that adherence to these criteria can significantly improve AES performance, with improvements of up to 40% observed using BERT-based models for English essays. Given the limited resources for Arabic AES, this thesis introduces the AR-AES dataset, consisting of 2,046 essays from undergraduate students across multiple courses, annotated independently by two university instructors. This resource alleviates the scarcity of Arabic-language datasets for AES, supporting model development and evaluation. Experimental analyses using pretrained Arabic NLP models demonstrate that transformer-based approaches achieve the highest levels of agreement with human scores; in many cases, their predictions agree with the gold scores more closely than the human annotators agree with each other. This indicates that, under appropriate conditions, the proposed AES system may be suitable for assisting human markers in real-world educational settings. Additionally, the thesis explores the potential of large language models (LLMs), including ChatGPT, Llama, Aya, Jais, and ACEGPT, for Arabic AES. Experiments with different training approaches (zero-shot, few-shot, and fine-tuning) demonstrate the importance of prompt engineering. A mixed-language prompting strategy, combining Arabic essays with English scoring guidelines, was found to notably enhance model performance. Nonetheless, fine-tuned AraBERT consistently yielded the strongest results, indicating that LLMs may not yet be the most effective option for Arabic AES tasks when training data is limited. Finally, an active learning framework is introduced, integrating AraBERT with uncertainty- and diversity-based sampling strategies. This human-in-the-loop approach prioritises the essays that most benefit from expert review, reducing the need for extensive manual annotation while preserving high scoring accuracy. Rather than replacing human markers, the system complements their efforts, offering a more efficient and consistent approach to large-scale essay evaluation. Overall, this thesis advances AES by introducing explicit criteria for effective essay question design, while also addressing specific challenges in Arabic AES. It contributes a comprehensively annotated dataset, presents a systematic evaluation of state-of-the-art NLP models, and integrates active learning to balance automated scoring accuracy and human involvement.
dc.format.extent: 284
dc.identifier.uri: https://hdl.handle.net/20.500.14154/76538
dc.language.iso: en
dc.publisher: University of Bristol
dc.subject: Automated Essay Scoring (AES)
dc.subject: Arabic Dataset
dc.subject: Arabic
dc.subject: AraBERT
dc.subject: AI
dc.subject: Natural Language Processing (NLP)
dc.title: Automatic Essay Scoring in Arabic: Development, Evaluation, and Advanced Techniques
dc.type: Thesis
sdl.degree.department: Computer Science
sdl.degree.discipline: Artificial Intelligence: Natural Language Processing (NLP)
sdl.degree.grantor: University of Bristol
sdl.degree.name: Doctor of Philosophy

Files

Original bundle
Name: SACM-Dissertation.pdf
Size: 3.92 MB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 1.61 KB
Description: Item-specific license agreed to upon submission

Copyright owned by the Saudi Digital Library (SDL) © 2025