Saudi Cultural Missions Theses & Dissertations
Permanent URI for this community: https://drepo.sdl.edu.sa/handle/20.500.14154/10
Search Results
23 results
Item Restricted
LARGE LANGUAGE MODELS FOR TEST CODE COMPREHENSION (Saudi Digital Library, 2025) Aljohani, Ahmed; Do, Hyunsook

Context: Unit testing is critical for software reliability, yet challenges persist in ensuring the quality, comprehensibility, and maintainability of test code. Transformer/LLM-based approaches have improved the readability of generated tests, but they also introduce new risks and omissions. Empirically, transformer-based test generation (e.g., AthenaTest) produces tests with nontrivial smell rates, mirroring issues present in the training corpus (e.g., Methods2Test). In parallel, assertion messages—essential for debugging and failure interpretation—are often missing or generic in both developer- and LLM-generated tests, and test-level summaries frequently omit the core predicate unless test-aware structure is made explicit. Beyond code artifacts, the rise of LLM APIs has created new forms of Self-Admitted Technical Debt (SATD) centered on prompt design, hyperparameter configurations, and framework orchestration.

Objective: This dissertation systematically investigates (i) how transformer/LLM pipelines affect test quality (with a focus on test smells), (ii) whether and how LLMs can be guided to produce useful assertion messages and concise, faithful test summaries, and (iii) what LLM-specific SATD emerges in real projects and how to manage it. The overarching goal is to improve test comprehensibility and maintainability while establishing practices that reduce long-term technical debt in LLM-enabled development.

Method: We conduct three complementary empirical studies. First, we analyze transformer-generated tests (AthenaTest) against a curated set of test smells, then trace likely causes to two factors: properties learned from the Methods2Test training corpus and the model's design tendencies (e.g., assertion density and test size). Second, we evaluate the contribution of lightweight documentation to comprehension through two test-aware tasks. For assertion messages, we benchmark four FIM-style code LLMs on a dataset of 216 Java test methods where developer-written messages serve as ground truth, comparing model outputs to human messages with semantic and human-like scoring. For test code summarization, we introduce a benchmark of 91 Java unit tests paired with developer-written comments and run an ablation over seven prompt variants—varying test code, MUT, assertion messages, and assertion semantics—across four code LLMs, assessed with BLEU, METEOR, ROUGE-L, BERTScore, and an LLM-as-a-judge rubric. Third, we study maintainability at the application layer by mining and classifying Self-Admitted Technical Debt (SATD) in LLM-based projects, identifying LLM-specific categories (prompt debt, hyperparameter debt, framework debt, cost debt, and learning debt) and quantifying which prompt techniques (e.g., instruction-first, few-shot) accrue the most debt in practice.
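The semantic-scoring step lends itself to a compact illustration. Below is a minimal sketch assuming an off-the-shelf sentence-transformers encoder; the model name, example messages, and the use of plain cosine similarity are illustrative stand-ins, not the dissertation's actual evaluation pipeline.

```python
# Illustrative only: scoring a model-generated assertion message against the
# developer-written ground truth by embedding cosine similarity. The encoder
# and example strings are assumptions, not the dissertation's setup.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")

developer_msg = "Expected the cache to be empty after eviction"
generated_msg = "Cache should contain no entries once eviction runs"

embeddings = encoder.encode([developer_msg, generated_msg], convert_to_tensor=True)
semantic_score = util.cos_sim(embeddings[0], embeddings[1]).item()
print(f"semantic similarity: {semantic_score:.3f}")
```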
Item Restricted
Lexicography in NLP: A Study on the Interaction Between Lexical Resources and Large Language Models (Saudi Digital Library, 2025) Almeman, Fatemah; Espinosa-Anke, Luis

This thesis explores the interaction between lexical resources (LRs) and large language models (LLMs) in the context of natural language processing, focusing on the evaluation of WordNet (WN)—the de facto lexical database for English—along with the development of a new dataset and a novel reverse dictionary (RD) method.

The investigation starts with an assessment of WN, particularly its examples, both intrinsically and extrinsically, compared to other resources using the Good Dictionary EXamples (GDEX) framework. This evaluation shows that WN's examples are often limited in length and informativeness. In an extrinsic analysis, we examined WN's performance in definition modeling and word similarity tasks, where informative contextual representations are essential. Results indicate that LLM-generated examples are more informative than those from WN. To overcome limitations in LRs (some uncovered by our analysis), we then introduce a new dataset called 3D-EX, providing terms, definitions, and usage examples. It integrates entries from ten diverse English dictionaries and encyclopedias with varying linguistic styles. We conducted intrinsic experiments on source classification, predicting the origin of an instance, and RD, which retrieves a ranked list of terms from a definition. Results indicate that 3D-EX enhances performance in both tasks, highlighting its usefulness for NLP. This thesis further explores RD by introducing GEAR, a lightweight and unsupervised approach to RD tasks. GEAR operates through four stages: Generate, Embed, Average, and Rank. It was evaluated using the Hill dataset, a leading benchmark for RD tasks, and it consistently outperformed existing methods. In conclusion, this thesis investigates how LLMs and LRs can benefit each other. We identified limitations in some resources and found that LLMs are a suitable tool for addressing them. Additionally, LLMs can automatically improve language resources by unifying them with different anchors.
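The four GEAR stages map naturally onto a short pipeline. The sketch below is a hedged approximation: the Generate step is stubbed with a fixed list standing in for LLM output, and the encoder, vocabulary, and similarity ranking are assumptions rather than GEAR's published configuration.

```python
# Hypothetical Generate-Embed-Average-Rank pass for reverse dictionary lookup.
# "Generate" is stubbed: llm_candidates stands in for terms an LLM would
# propose from the definition. Encoder and vocabulary are illustrative.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")

definition = "a feeling of longing for a happier past time"
llm_candidates = ["nostalgia", "yearning", "homesickness"]        # Generate
vocabulary = ["nostalgia", "apathy", "yearning", "euphoria", "remorse"]

candidate_embs = encoder.encode(llm_candidates)                    # Embed
query = candidate_embs.mean(axis=0)                                # Average

vocab_embs = encoder.encode(vocabulary)
sims = vocab_embs @ query / (
    np.linalg.norm(vocab_embs, axis=1) * np.linalg.norm(query))   # Rank
print([vocabulary[i] for i in np.argsort(-sims)])
```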
Item Restricted
Leveraging Social Media Data for Detection and Monitoring of Depression (Saudi Digital Library, 2025) Alhamed, Falwah Abdulaziz; Specia, Lucia

Mental health disorders are increasingly prevalent, with depression being the most common and a significant cause of disability and suicide worldwide. Understanding its symptoms, severity, and progression is vital for improving early detection and intervention. This thesis adopts a data-driven AI approach, constructing a large, expert-annotated dataset and developing models to monitor depression from social media language. We first design a data collection and curation framework to build a large-scale dataset of posts from individuals who self-report depression. In collaboration with psychiatrists and psychologists, we create an annotation scheme for labelling symptoms and severity over time. Experienced psychologists annotate the data, resulting in DepSy, the largest English dataset of 40,000 posts fully annotated for depression symptoms and severity progression. This dataset underpins all subsequent experiments. We then benchmark multiple NLP approaches to classify posts written before versus after a reported depression diagnosis. Analyses include linguistic patterns, emotion usage, and content variation. Among the models tested, BERT-based classifiers achieve the best overall performance, while large language models (LLMs) in zero-shot settings perform near-randomly. Next, we address symptom detection as a multi-label classification problem. A bespoke BERT-based model achieves strong overall results, while a fine-tuned Llama-based model, DepSy-LLaMA, obtains higher recall, identifying more positive symptom cases—a valuable property in mental health detection. However, LLM predictions remain less reliable for sensitive applications.

Finally, we explore the prediction of depression severity over time using deep learning and propose a hybrid CTMC-LSTM model that integrates Markov chains with LSTM to capture temporal patterns. This model uniquely detects severe cases and achieves the highest performance across all baselines. The findings demonstrate the importance of temporal modelling and expert-annotated data for building robust, ethical, and clinically informed systems for depression monitoring from social media.

Item Restricted
Optimizing Hate Text Detection using Custom NLP Techniques and an Adapted DeBERTa-based Machine Learning Model (Saudi Digital Library, 2025) Aljabbar, Abdullah; AlYamani, Abdulghani

The rapid expansion of social media has transformed online communication, providing platforms for public debate and community engagement. However, this openness has also facilitated the spread of harmful content, particularly hate speech, which poses significant risks to individual well-being, social cohesion, and digital trust. Detecting such content remains a major challenge due to the subtle, context-dependent, and evolving nature of hateful expressions. Traditional machine learning models, though useful as early baselines, often fail to capture linguistic nuance and contextual depth. Recent advances in natural language processing (NLP), particularly Transformer-based architectures, have significantly improved text classification tasks by enabling context-sensitive embeddings. This research investigates the effectiveness of DeBERTa (Decoding-enhanced BERT with Disentangled Attention) for hate speech detection. The study employs a systematic methodology consisting of four stages: data preparation and preprocessing, exploratory data analysis, model development, and evaluation. A curated dataset of 2,041 social media posts, derived from a larger corpus, was pre-processed to remove noise, normalise text, and correct class imbalance. The DeBERTa-v3-large model was fine-tuned using cross-entropy loss and AdamW optimisation. Performance was assessed with accuracy, precision, recall, F1-score, ROC, and PR curves, while error analysis and confusion matrices were used to identify common misclassifications. The findings demonstrate that DeBERTa effectively captures indirect meaning and grammatical relations, outperforming traditional approaches and offering robust classification of hate and non-hate content. The study contributes to both NLP research and the wider cybersecurity domain by supporting the development of more reliable automated moderation tools that promote safer digital environments.
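The fine-tuning recipe in the DeBERTa study is standard enough to sketch. The snippet below is a minimal approximation using the Hugging Face Trainer, which applies cross-entropy loss and AdamW by default for sequence classification; the toy two-example dataset and the hyperparameter values are placeholders, not the study's actual configuration.

```python
# Minimal sketch, not the study's exact setup: fine-tuning DeBERTa-v3-large
# for binary hate/non-hate classification. Trainer defaults give AdamW and
# cross-entropy; the two-example dataset is a placeholder for the real corpus.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

name = "microsoft/deberta-v3-large"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)

data = Dataset.from_dict({
    "text": ["example hateful post", "example benign post"],
    "label": [1, 0],
})
data = data.map(lambda b: tokenizer(b["text"], truncation=True,
                                    padding="max_length", max_length=128),
                batched=True)

args = TrainingArguments(output_dir="deberta-hate", learning_rate=1e-5,
                         per_device_train_batch_size=8, num_train_epochs=3)
Trainer(model=model, args=args, train_dataset=data).train()
```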
Item Restricted
Computational Approaches for Drug Repositioning and Target Discovery in Alzheimer's Disease (King Abdullah University of Science and Technology (KAUST), 2024) Alamro, Hind; Gao, Xin

Alzheimer's Disease (AD) presents significant challenges to global healthcare systems due to its complex and progressive nature. Despite extensive research, the underlying mechanisms of AD lack clarity, and current treatments only alleviate symptoms without halting disease progression. Consequently, there is an urgent need for computational approaches that can accelerate research efforts and aid in the development of more effective treatments for AD. In this thesis, we address these critical challenges by developing computational and AI-based methods to improve the early detection of AD, identify novel biomarkers, and explore new therapeutic strategies through drug repositioning.

To begin with, we focus on identifying key biomarkers associated with AD using gene expression datasets, then expand this to the identification of biomarkers by exploring the association between AD and its comorbidities, resulting in the discovery of new hub genes and miRNAs. Next, we examine the potential for drug repositioning by mining biomedical literature to uncover associations between drugs, targets, and diseases. This task was fulfilled by developing a systematic pipeline to extract valuable information from a curated collection of AD-related literature. The resulting data is subsequently used to construct a disease-specific knowledge graph, which is employed for drug repositioning using advanced graph-based techniques. Overall, this thesis contributes to AD research by employing computational methods, multi-data integration, and literature mining to provide new insights and therapeutic strategies. This work identifies key participants in AD progression and presents a pathway to accelerate the discovery of treatments through computational approaches.

Item Restricted
Enhancing Cross-lingual Transfer Learning for Crisis Text Classification on Social Media (Saudi Digital Library, 2025) AlAmer, Shareefa; Lee, Mark; Smith, Phillip

During crisis events such as natural disasters, conflicts, and pandemics, social media platforms serve as vital channels for real-time information sharing. These platforms enable users to post urgent updates, request assistance, and disseminate situational awareness at a scale and speed that traditional communication systems cannot match. Automatically classifying user-generated content in these contexts is essential for supporting timely emergency response. However, performing this task across multiple languages remains a major challenge, especially given that such content is often noisy, informal, and linguistically diverse. One of the core challenges lies in the scarcity of annotated data that is both domain- and task-specific. Even widely spoken languages may lack labelled resources tailored to specific applications, effectively rendering them low-resource for those tasks. Existing solutions for cross-lingual transfer remain suboptimal even when applied to more structured and formal data using complex architectures, which further highlights their limitations when handling the noisier and less predictable nature of social media content. In response to these challenges, this thesis investigates practical and scalable solutions to improve the cross-lingual classification of crisis-related social media content. The research explores four key directions: (1) evaluating Machine Translation as a strategy for augmenting training data in low-resource languages; (2) applying ensemble learning to enhance robustness across multilingual inputs; (3) examining data balancing methods to mitigate class imbalance; and (4) analysing interlingual transfer dynamics to identify how languages interact in multilingual learning setups. The evaluation of the proposed approach is performed through extensive experimentation on a real-world dataset of crisis-related X posts (formerly known as tweets). The proposed methods achieve competitive results despite challenges posed by noisy social media text, class imbalance, and the lack of annotated data. This work presents a generalisable framework for multilingual crisis classification and offers insights that are valuable for real-world applications where language diversity and data scarcity are critical factors.
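Direction (1) above, using Machine Translation to augment training data, can be sketched briefly. The snippet below is an assumption-laden illustration: a public Marian English-to-Arabic checkpoint stands in for whichever MT system the thesis employed, and each English label is simply carried over to the translated post.

```python
# Hedged sketch of MT-based augmentation: translate labelled English crisis
# posts into a lower-resourced target language, keeping the labels. The
# Marian checkpoint and the example posts are illustrative stand-ins.
from transformers import MarianMTModel, MarianTokenizer

mt_name = "Helsinki-NLP/opus-mt-en-ar"
tok = MarianTokenizer.from_pretrained(mt_name)
mt = MarianMTModel.from_pretrained(mt_name)

labelled_en = [("Bridge collapsed, several people trapped", "request_help"),
               ("Avoid the coastal road until further notice", "warning")]

augmented = []
for text, label in labelled_en:
    ids = tok(text, return_tensors="pt")
    out = mt.generate(**ids)
    augmented.append((tok.decode(out[0], skip_special_tokens=True), label))
print(augmented)
```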
The effectiveness of the proposed system could be further enhanced by incorporating a wider range of languages and leveraging more advanced analytical models. Additionally, more advanced translation techniques could be explored for even greater impact in future crisis response systems.

Item Restricted
Paraphrase Generation and Identification at Paragraph-Level (Saudi Digital Library, 2025) Alsaqaabi, Arwa; Stewart, Craig; Akrida, Eleni; Cristea, Alexandra

The widespread availability of the Internet and the ease of accessing written content have significantly contributed to the rising incidence of plagiarism across various domains, including education. This behaviour directly undermines academic integrity, as evidenced by reports highlighting increased plagiarism in student work. Notably, students tend to plagiarize entire paragraphs more often than individual sentences, further complicating efforts to detect and prevent academic dishonesty. Additionally, advancements in natural language processing (NLP) have further facilitated plagiarism, particularly through online paraphrasing tools and deep-learning language models designed to generate paraphrased text. These developments underscore the critical need to develop and refine effective paraphrase identification (PI) methodologies. This thesis addresses one of the most challenging aspects of plagiarism detection (PD): identifying instances of plagiarism at the paragraph level, with a particular emphasis on paraphrased paragraphs rather than individual sentences. By focusing on this level of granularity, the approach considers both intra-sentence and inter-sentence relationships, offering a more comprehensive solution to the detection of sophisticated forms of plagiarism. To achieve this aim, the research examines the influence of text length on the performance of NLP machine learning (ML) and deep learning (DL) models. Furthermore, it introduces ALECS-SS (ALECS – Social Sciences), a large-scale dataset of paragraph-length paraphrases, and develops three novel SALAC algorithms designed to preserve semantic integrity while restructuring paragraph content. These algorithms offer a novel approach that modifies the structure of paragraphs while maintaining their semantics. The methodology involves converting text into a graph where each node corresponds to a sentence's semantic vector, and each edge is weighted by a numerical value representing the sentence-order probability. Subsequently, a masking approach is applied to the reconstructed paragraphs, modifying the lexical elements while preserving the original semantic content. This step introduces variability to the dataset while maintaining its core meaning, effectively simulating paraphrased text. Human and automatic evaluations assess the reliability and quality of paraphrases, and additional studies examine the adaptability of SALAC across multiple academic domains. Moreover, state-of-the-art large language models (LLMs) are analysed for their ability to differentiate between human-written and machine-paraphrased text. This investigation involves the use of multiple PI datasets in addition to the newly established paragraph-level paraphrase dataset (ALECS-SS). The findings demonstrate that text length significantly affects model performance, with limitations arising from dataset segmentation.
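The graph construction just described invites a small sketch. The version below is a loose approximation: sentence vectors come from an off-the-shelf encoder, and pairwise cosine similarity stands in for the sentence-order probability, which the thesis models differently; the encoder name and example sentences are illustrative.

```python
# Rough sketch of a SALAC-style paragraph graph: one node per sentence
# (holding its semantic vector), one weighted edge per ordered pair. Cosine
# similarity is only a stand-in for the thesis's order-probability weights.
import networkx as nx
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")
sentences = ["The experiment used three datasets.",
             "Each dataset was split into train and test portions.",
             "Results were averaged over five runs."]

vectors = encoder.encode(sentences, convert_to_tensor=True)
graph = nx.DiGraph()
for i, sent in enumerate(sentences):
    graph.add_node(i, text=sent, vector=vectors[i])
for i in range(len(sentences)):
    for j in range(len(sentences)):
        if i != j:
            graph.add_edge(i, j, weight=util.cos_sim(vectors[i], vectors[j]).item())

# A restructured ordering would be read off the highest-weight traversal
# before the masking step rewrites lexical items.
print(sorted(graph.edges(data="weight"), key=lambda e: -e[2])[:3])
```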
Additionally, the results show that the SALAC algorithms effectively maintain semantic integrity and coherence across different domains, highlighting their potential for domain-independent paraphrasing. The thesis also analysed state-of-the-art LLMs' performance in detecting auto-paraphrased content and distinguishing it from human-written content at both the sentence and paragraph levels, showing that the models could reliably identify reworded content from individual sentences up to entire paragraphs. Collectively, these findings contribute to educational applications and plagiarism detection by improving how paraphrased content is generated and recognized, and they advance NLP-driven paraphrasing techniques by providing strategies that ensure meaning and coherence are preserved in reworded material.

Item Restricted
Embracing Emojis in Sarcasm Detection to Enhance Sentiment Analysis (University of Southampton, 2025) Alsabban, Malak Abdullah; Hall, Wendy; Weal, Mark

People frequently share their ideas, concerns, and emotions on social networks, making sentiment analysis on social media increasingly important for understanding public opinion and user sentiment. Sentiment analysis provides an effective means of interpreting people's attitudes towards various topics, individuals, or ideas. This thesis introduces the creation of an Emoji Dictionary (ED) to harness the rich contextual information conveyed by emojis. It acts as a valuable resource for deciphering the emotional nuances embedded in textual content, contributing to a deeper understanding of sentiment. In addition, the research explores the complex domain of sarcasm detection by proposing a novel Sarcasm Detection Approach (SDA). This approach identifies sarcasm by analysing conflicts between textual content and the accompanying emojis. The thesis addresses key challenges in sentiment analysis by evaluating and comparing emoji dictionaries and sarcasm detection approaches to enhance sentiment classification. Extensive experimentation on diverse datasets rigorously assesses the effectiveness of these methods in improving sentiment analysis accuracy and sarcasm detection performance, particularly in emoji-rich datasets. The findings highlight the crucial role of emojis as contextual cues, underscoring their value in sentiment analysis and sarcasm detection tasks. The outcomes of this thesis aim to advance sentiment analysis methodologies by offering insights into preprocessing strategies, leveraging the expressive potential of emojis through the Emoji Dictionary (ED), and introducing the Sarcasm Detection Approach (SDA). The research demonstrates that integrating emojis through these tools substantially enhances both sentiment analysis and sarcasm detection. By utilizing these tools, the study not only improves model performance but also opens avenues for further exploration into the nuanced complexities of digital communication.
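The conflict idea behind the SDA can be illustrated compactly. The sketch below is an assumption-heavy toy: a four-entry dictionary stands in for the thesis's Emoji Dictionary, VADER stands in for its text sentiment component, and the threshold is arbitrary.

```python
# Toy illustration of text/emoji polarity conflict as a sarcasm signal.
# The mini dictionary, VADER scorer, and threshold are all stand-ins for
# the thesis's ED, sentiment model, and decision rule.
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

EMOJI_DICT = {"😍": 1.0, "😊": 0.9, "🙄": -0.8, "💔": -0.9}
analyzer = SentimentIntensityAnalyzer()

def looks_sarcastic(text: str, emojis: str, threshold: float = 1.0) -> bool:
    text_pol = analyzer.polarity_scores(text)["compound"]
    emoji_pol = sum(EMOJI_DICT.get(e, 0.0) for e in emojis)
    # Opposed polarities of sufficient magnitude suggest a conflict.
    return text_pol * emoji_pol < 0 and abs(text_pol - emoji_pol) >= threshold

print(looks_sarcastic("I just love waiting two hours in line", "🙄"))  # True
```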
Item Restricted
Disinformation Classification Using Transformer-based Machine Learning (Howard University, 2024) Alshaqi, Mohammed Al; Rawat, Danda B.

The proliferation of false information via social media has become an increasingly pressing problem. Digital means of communication and social media platforms facilitate the rapid spread of disinformation, which calls for the development of advanced techniques for identifying incorrect information.

This dissertation endeavors to devise effective multimodal techniques for identifying fraudulent news, considering the noteworthy influence that deceptive stories have on society. The study proposes and evaluates multiple approaches, starting with a transformer-based model that uses word embeddings for accurate text classification. This model significantly outperforms baseline methods such as hybrid CNN and RNN, achieving higher accuracy. The dissertation also introduces a novel BERT-powered multimodal approach to fake news detection, combining textual data with text extracted from images to improve accuracy. By leveraging the strengths of the BERT-base-uncased model for text processing and integrating it with image text extraction via OCR, this approach calculates a confidence score indicating the likelihood of news being real or fake. Rigorous training and evaluation show significant improvements in performance compared to state-of-the-art methods. Furthermore, the study explores the complexities of multimodal fake news detection, integrating text, images, and videos into a unified framework. By employing BERT for textual analysis and CNN for visual data, the multimodal approach demonstrates superior performance over traditional models in handling multiple media formats. Comprehensive evaluations using datasets such as ISOT and MediaEval 2016 confirm the robustness and adaptability of these methods in combating the spread of fake news. This dissertation contributes valuable insights to fake news detection, highlighting the effectiveness of transformer-based models, emotion-aware classifiers, and multimodal frameworks. The findings provide robust solutions for detecting misinformation across diverse platforms and data types, offering a path forward for future research in this critical area.
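The BERT-plus-OCR pipeline can be sketched in a few lines. The snippet below is illustrative only: pytesseract stands in for whichever OCR engine the dissertation used, the untuned bert-base-uncased head is a placeholder for the fine-tuned classifier, and the file name is hypothetical.

```python
# Hedged sketch of the multimodal step: OCR text from an attached image is
# concatenated with the post text before BERT classification. The pipeline
# head is untuned here, so its score is a placeholder for the real
# confidence that a post is fake or real.
import pytesseract
from PIL import Image
from transformers import pipeline

classifier = pipeline("text-classification", model="bert-base-uncased")

def score_post(text: str, image_path: str) -> dict:
    ocr_text = pytesseract.image_to_string(Image.open(image_path))
    combined = f"{text} [SEP] {ocr_text}".strip()
    return classifier(combined, truncation=True)[0]

print(score_post("Breaking: miracle cure announced", "headline.jpg"))  # hypothetical file
```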
Item Restricted
IS THE METAVERSE FAILING? ANALYSING SENTIMENTS TOWARDS THE METAVERSE (The University of Manchester, 2024) Alharbi, Manal Dowaihi; Batista-Navarro, Riza

This dissertation investigates Aspect-Based Sentiment Analysis (ABSA) within the context of the Metaverse to better understand opinions on this emerging digital environment, particularly from a news perspective. The Metaverse, a virtual space where users can engage in various experiences, has attracted both positive and negative opinions, making it crucial to explore these sentiments to gain insights into public perspectives. A novel dataset of news articles related to the Metaverse was created, and Target Aspect-Sentiment Detection (TASD) models were applied to analyze sentiments expressed toward various aspects of the Metaverse, such as device performance and user privacy. A key contribution of this research is the evaluation of the TASD architecture TAS-BERT and its enhanced version, Advanced TAS-BERT (ATAS-BERT), which performs each task separately, on two datasets: the newly created Metaverse dataset and the SemEval-2015 Restaurant dataset. They were tested with different Transformer-based models, including BERT, DeBERTa, RoBERTa, and ALBERT, to assess performance, particularly in cases where the target is implicit. The findings demonstrate the ability of advanced Transformer models to handle complex tasks, even when the target is implicit. ALBERT performed well on the simpler Metaverse dataset, while DeBERTa and RoBERTa showed superior performance on both datasets.

This dissertation also suggests several areas for improvement in future research, such as processing paragraphs instead of individual sentences, utilizing Meta AI models for dataset annotation to enhance accuracy, and designing architectures specifically for models like DeBERTa, RoBERTa, and ALBERT, rather than relying on architectures originally designed for BERT, to improve performance. Additionally, incorporating enriched context representations, such as Part-of-Speech tags, could further enhance model performance.
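A TASD model in the TAS-BERT family can be approximated with a sentence-pair formulation. The sketch below pairs a sentence with a candidate aspect-plus-sentiment description and scores the match; the checkpoint is untuned and the binary label scheme is an illustrative simplification of the actual architecture, not its published design.

```python
# Simplified, untuned sketch of a TAS-BERT-style pairing: the model judges
# whether a candidate (aspect, sentiment) description matches the sentence.
# Checkpoint and label semantics are placeholders for the trained system.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "bert-base-uncased"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)

sentence = "Headsets are still too heavy for long sessions."
candidate = "device performance - negative"     # hypothesised aspect + polarity

inputs = tok(sentence, candidate, return_tensors="pt")
with torch.no_grad():
    probs = model(**inputs).logits.softmax(dim=-1)
print(probs)  # [no-match, match] after fine-tuning; random before
```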
Page 1 of 3
