Saudi Cultural Missions Theses & Dissertations

Permanent URI for this community: https://drepo.sdl.edu.sa/handle/20.500.14154/10

Search Results

Now showing 1 - 10 of 13
  • ItemRestricted
    Cross-Lingual Transfer Learning for Arabic Sentiment Analysis
    (Saudi Digital Library, 2025) Bin Owayn, Najd Mohammed; Lauria, Stasha
    This dissertation presents a comprehensive investigation into the efficacy of cross-lingual transfer learning for Arabic sentiment analysis within low-resource contexts. The study rigorously compares the performance of a multilingual transformer model, XLM-RoBERTa (XLM-R), against a monolingual Arabic-specific model, CAMeLBERT, under varying data availability conditions, specifically zero-shot and few-shot learning paradigms. The primary objective is to identify the most effective and efficient modeling approach for accurate sentiment analysis when only limited Arabic training data is accessible. The research addresses the inherent challenges of Arabic sentiment analysis, including its complex morphology, pervasive dialectal variations, and the scarcity of large, annotated datasets. Utilizing a publicly available Arabic Company Reviews Dataset, the study systematically evaluates model performance across incrementally increasing amounts of labeled data: zero-shot application, and fine-tuning with 100, 500, and 1000 samples. This controlled experimental design allows for a direct, data-driven comparison of the models' efficiency and effectiveness. Key findings demonstrate that XLM-R exhibits remarkable zero-shot capabilities, achieving an accuracy of 0.829 and an Area Under the Curve (AUC) of 0.921 even without any direct fine-tuning on Arabic sentiment data. This underscores the power of large-scale multilingual pre-training in fostering language-agnostic sentiment understanding. With the introduction of limited Arabic labeled data, XLM-R's performance further improved, reaching an accuracy of 0.886 and an AUC of 0.942 with 1000 samples. The most substantial performance gains for XLM-R were observed during the initial stages of few-shot fine-tuning, highlighting its high data efficiency. 
    In contrast, CAMeLBERT, designed as a monolingual Arabic model, showed poor zero-shot performance (accuracy 0.275, AUC 0.522), as anticipated due to its specialisation in Arabic linguistic structures rather than cross-lingual transfer. However, CAMeLBERT demonstrated exceptional adaptability and rapid improvement with few-shot fine-tuning. With a mere 100 labeled Arabic samples, its accuracy dramatically surged to 0.814 and AUC to 0.913. Its performance continued to improve, eventually approaching XLM-R's levels at 1000 samples (accuracy 0.868, AUC 0.936). This indicates that while monolingual models necessitate some target-language data to become effective, they can quickly leverage their deep linguistic understanding of Arabic to achieve competitive results. Learning curve analysis revealed that for both models, the most significant performance improvements occurred between the zero-shot and 100-sample conditions, with diminishing returns observed as the training data size increased further. This finding is crucial for practitioners, suggesting that a relatively small investment in data annotation can yield substantial performance gains, while further extensive annotation may offer only marginal improvements. In conclusion, this dissertation provides a data-driven cost-benefit analysis for practitioners navigating Arabic sentiment analysis in resource-constrained environments. It demonstrates that while monolingual models like CAMeLBERT can achieve competitive performance with modest amounts of labeled Arabic data, multilingual models like XLM-R offer a superior starting point with strong zero-shot capabilities and maintain a statistically significant edge even with limited fine-tuning data. This research contributes to a more nuanced understanding of the practical utility of cross-lingual transfer learning, advocating for its strategic adoption in scenarios where extensive Arabic data annotation is not feasible.
    Future work includes investigating domain-specific pre-training, exploring advanced few-shot learning techniques, and incorporating explicit dialectal Arabic analysis.
  • ItemRestricted
    Sentiment-Driven Health Messaging: Comparing DistilBERT and VADER on Timed Narratives with Animated Educational Videos using Manim
    (Saudi Digital Library, 2025) Alshabanah, Alozuf Mohammed; Ansari, Tayyab Ahmad
    This project builds two educational animations using Manim and explores whether general-purpose sentiment analysis tools can help estimate the “healthiness” of short, child-friendly health education videos. Each video's script is divided into timestamped segments labelled as Healthy or Unhealthy. The focus is on comparing a transformer-based sentiment classifier (DistilBERT SST-2) with a rule-based tool (VADER). For DistilBERT, POSITIVE sentiment is mapped to Healthy and NEGATIVE to Unhealthy. VADER uses a score threshold (≥0.05 for Healthy, ≤−0.05 for Unhealthy, and values in between as Neutral). For evaluation, the “Neutral” predictions are converted to “Unhealthy”. Precision, recall, and F1 scores (per video and overall) are calculated, and confusion matrices are used to visualize the performance. On the custom dataset of 24 segments, both methods perform well when the language is clearly positive or negative. DistilBERT rarely produces Neutral labels but sometimes misinterprets healthy advice that includes negation. VADER, however, often predicts Neutral, which is penalized in the evaluation method. The study also discusses the limitations of using sentiment as a proxy for healthfulness, including the effects of class imbalance and how Neutral predictions are handled. A few potential improvements are suggested, such as domain-specific model tuning, more accurate threshold settings, and treating Neutral as an undecided or “abstain” option.
  • ItemRestricted
    Cross Dataset Fairness Evaluation of Transformer Based Sentiment Models
    (Saudi Digital Library, 2025-05-10) Zuiran, Sara; Bhattacharyya, Siddhartha
    With the growing exploration of Natural Language Processing (NLP) systems in decision-making environments, it is essential to evaluate technical and ethical aspects of the dataset and the NLP model to improve fairness. To assess fairness, the thesis examines demographic imbalances in sentiment classification models by evaluating transformer-based models fine-tuned on the Stanford Sentiment Treebank version 2 dataset (SST-2) against the demographically annotated Comprehensive Assessment of Language Model dataset (CALM). This work identifies performance disparities in sentiment prediction across demographic groups by examining sensitive attributes such as gender and race. The study evaluates both the RoBERTa and MentalBERT transformer models using a complete set of fairness metrics consisting of Statistical Parity Difference (SPD), Equal Opportunity Difference (EOD), False Positive Rates (FPR), False Negative Rates (FNR), Jensen-Shannon Divergence (JSD), and Wasserstein Distance (WD). The analysis examines both group-vs-rest and pairwise subgroup comparisons, including gender and ethnicity. Results show that applying adversarial mitigation reduced fairness disparities across demographic subgroups, with the most notable improvements observed for non-binary and Asian users. The observed disparities emphasize the challenge of reducing performance gaps across demographic subgroups in sentiment classification tasks. The thesis introduces a practical framework for evaluating demographic disparities, extends fairness analysis, and assesses the impact of mitigation techniques in cross-dataset sentiment classification. This research proposes a framework that demonstrates a path toward creating inclusive NLP systems and establishes the groundwork for upcoming ethical Artificial Intelligence (AI) studies.
  • ItemRestricted
    Evaluating Chess Moves by Analysing Sentiments in Teaching Textbooks
    (The University of Manchester, 2025) Alrdahi, Haifa Saleh T; Batista-Navarro, Riza
    The rules of playing chess are simple to comprehend, and yet it is challenging to make accurate decisions in the game. Hence, chess lends itself well to the development of an artificial intelligence (AI) system that simulates real-life problems, such as in decision-making processes. Learning chess strategies has been widely investigated, with most studies focused on learning from previous games using search algorithms. Chess textbooks encapsulate grandmaster knowledge, which explains playing strategies. This thesis investigates three research questions on the possibility of unlocking hidden knowledge in chess teaching textbooks. Firstly, we contribute to the chess domain with a new heterogeneous chess dataset, “LEAP”, which consists of structured data that represents the environment (“board state”) and unstructured data that represents explanations of strategic moves. Additionally, we build a larger unstructured synthetic chess dataset to improve large language models' familiarity with the chess teaching context. With the LEAP dataset, we examined the characteristics of chess teaching textbooks and the challenges of using such a data source for training a Natural Language (NL)-based chess agent. We show by empirical experiments that following the common approach of sentence-level evaluation of moves is not insightful. Secondly, we observed that chess teaching textbooks focus on explaining the move’s outcome for both players and often discuss multiple moves in one sentence, which confused the models in move evaluation. To address this, we introduce an auxiliary task that evaluates individual moves at the verb-phrase level. Furthermore, we show by empirical experiments the usefulness of adopting the Aspect-based Sentiment Analysis (ABSA) approach as an evaluation method for chess moves expressed in free text. With this, we have developed a fine-grained annotation scheme and a small-scale dataset for the chess-ABSA domain, “ASSESS”.
    Finally, we examined the performance of a fine-tuned LLM encoder model for chess-ABSA and showed that the performance of the model for evaluating chess moves is comparable to scores obtained from a chess engine, Stockfish. Thirdly, we developed an instruction-based explanation framework, using prompt engineering with zero-shot learning to generate an explanation text of the move outcome. The framework also used a chess-ABSA decoder model that uses an instruction format, and we evaluated its performance on the ASSESS dataset, which shows an overall improvement in performance. Finally, we evaluate the performance of the framework and discuss the possibilities and current challenges of generating large-scale unstructured data for the chess domain, and the effect on the chess-ABSA decoder model.
  • ItemRestricted
    Analyzing the Impact of Economic Policy Uncertainty and Investor Sentiment on Stock Market Dynamics (Returns & Volatility)
    (University of Liverpool, 2024-09-12) Alahmare, Reem; Hizmeri Canales, Rodrigo
    This dissertation investigates the joint effects of Economic Policy Uncertainty (EPU) and investor sentiment on stock market dynamics, particularly focusing on the S&P 500 index. The study integrates sentiment analysis from real-time news and social media data with EPU indices to develop predictive models for stock returns and volatility over a 10-year period (2013-2023). By employing econometric techniques, such as LASSO regression, Ordinary Least Squares (OLS) regression, and GARCH models, the study aims to provide a more comprehensive understanding of how these psychological and macroeconomic factors influence market behavior. The findings highlight that investor sentiment plays a stabilizing role in periods of positive sentiment, reducing market volatility and enhancing stock returns. In contrast, negative sentiment amplifies volatility, especially when combined with high levels of policy uncertainty. EPU, particularly as measured by the News-Based Policy Uncertainty Index, emerges as a critical driver of volatility, affecting market stability during periods of fiscal and trade policy uncertainty. The interaction between sentiment and EPU is shown to provide better predictive accuracy for stock market behavior compared to traditional financial models. The research contributes to the growing body of literature by developing models that integrate real-time sentiment data with EPU, offering more nuanced insights into stock market volatility and returns. The practical implications are significant for both investors and policymakers, providing tools to improve risk management and decision-making. Investors are advised to consider sentiment and policy uncertainty together when assessing market risks, while policymakers are encouraged to ensure transparent communication to minimize uncertainty and stabilize markets. 
    This study advances our understanding of the roles of sentiment and policy uncertainty in financial markets, highlighting their combined influence on stock market volatility and returns, and offering practical strategies for navigating periods of economic uncertainty.
  • ItemRestricted
    Developing a Generative AI Model to Enhance Sentiment Analysis for the Saudi Dialect
    (Texas Tech University, 2024-12) Aftan, Sulaiman; Zhuang, Yu
    Sentiment Analysis (SA) is a fundamental task in Natural Language Processing (NLP) with broad applications across various real-world domains. While Arabic is a globally significant language with several well-developed NLP models for its standard form, achieving high performance in sentiment analysis for the Saudi Dialect (SD) remains challenging. A key factor contributing to this difficulty is the inadequacy of SD datasets for training NLP models. This study introduces a novel method for adapting a high-resource language model to a closely related but low-resource dialect by combining moderate effort in SD data collection with generative AI to address this inadequacy. AraBERT was then fine-tuned using a combination of collected SD data and additional SD data generated by GPT. The results demonstrate a significant improvement in SD sentiment analysis performance compared to an AraBERT model fine-tuned only on the collected SD data. This work highlights an efficient approach to generating high-quality datasets for fine-tuning a model trained on a high-resource language to perform well in a low-resource dialect. Leveraging generative AI reduces the effort required for data collection, making our approach a promising avenue for future research in low-resource NLP tasks.
  • ItemRestricted
    Navigating Arabic Sentiments: An Evaluation of Multilingual and Arabic Dedicated Large Language Models
    (University of Exeter, 2024) Altowairqi, Hadeel; Menezes, Ronaldo
    Expressing emotions in written text, especially in Arabic with its complex structure and poetic elements, can be challenging. While body language enriches spoken communication with emotional depth, written Arabic often lacks this nuance. The advent of Large Language Models (LLMs) has revolutionized natural language processing (NLP), excelling in tasks like text generation and sentiment analysis. However, the performance of these models varies significantly depending on the language and task. Arabic poses unique challenges due to its complex morphology and diverse dialects. This research investigates the impact of LLMs, particularly those tailored for Arabic, on the emotional depth of the written text. By evaluating how these models modify expressions, the study aims to understand whether LLMs preserve or constrain the intricate emotional nuances inherent in Arabic. The findings will contribute to the development of more effective AI tools for digital communication in the Arabic-speaking world, enhancing applications in fields such as sentiment analysis, opinion mining, and content moderation. Through a comprehensive analysis of over 81,000 Arabic texts, including tweets and book reviews, the study examines the performance of the general-purpose LLM ChatGPT and the Arabic-specific LLM JAIS, focusing on the sentiment shifts introduced by their edits. The results reveal a significant tendency of these models to introduce a positive bias, reducing the frequency of extremely negative sentiments. These insights highlight the necessity of incorporating cultural and linguistic nuances into LLM training data, emphasizing the importance of responsible development and ethical considerations in LLM applications.
  • ItemRestricted
    A Quality Model to Assess Airport Services Using Machine Learning and Natural Language Processing
    (Cranfield University, 2024-04) Homaid, Mohammed; Moulitsas, Irene
    In the dynamic environment of passenger experiences, precisely evaluating passenger satisfaction remains crucial. This thesis is dedicated to the analysis of Airport Service Quality (ASQ) by analysing passenger reviews through sentiment analysis. The research aims to investigate and propose a novel model for assessing ASQ through the application of Machine Learning (ML) and Natural Language Processing (NLP) techniques. It utilises a comprehensive dataset sourced from Skytrax, incorporating both text reviews and numerical ratings. The initial analysis presents challenges for traditional and general NLP techniques when applied to specific domains, such as ASQ, due to limitations like general lexicon dictionaries and pre-compiled stopword lists. To overcome these challenges, a domain-specific sentiment lexicon for airport service reviews is created using the Pointwise Mutual Information (PMI) scoring method. This approach involved replacing the default VADER sentiment scores with those derived from the newly developed lexicon. The outcomes demonstrate that this specialised lexicon for the airport review domain substantially exceeds the benchmarks, delivering consistent and significant enhancements. Moreover, six unique methods for identifying stopwords within the Skytrax review dataset are developed. The research reveals that employing dynamic methods for stopword removal markedly improves the performance of sentiment classification. Deep learning (DL), especially using transformer models, has revolutionised the processing of textual data, achieving unprecedented success. Therefore, novel models are developed through the meticulous development and fine-tuning of advanced deep learning models, specifically Bidirectional Long Short-Term Memory (BiLSTM) and Bidirectional Encoder Representations from Transformers (BERT), tailored for the airport services domain. 
    The results demonstrate superior performance, highlighting the BERT model's exceptional ability to seamlessly blend textual and numerical data. This progress marks a significant improvement upon the current state-of-the-art achievements documented in the existing literature. To encapsulate, this thesis presents a thorough exploration of sentiment analysis, ML and DL methodologies, establishing a framework for the enhancement of ASQ evaluation through detailed analysis of passenger feedback.
  • ItemRestricted
    Exploring the Impact of Sentiment Analysis on Price Prediction
    (Lehigh University, 2024-07) Zahhar, Abdulkarim Ali Y.; Robinson, Daniel P.
    The integration of sentiment analysis into predictive models for financial markets, particularly Bitcoin, combines behavioral finance with quantitative analysis. This thesis investigates the extent to which sentiment data, derived from social media platforms such as X (formerly Twitter), can enhance the accuracy of Bitcoin price predictions. A key idea in the study is that public sentiment, as shown on social media, affects Bitcoin’s market prices. The research uses linear regression models that combine Bitcoin’s opening prices with sentiment scores from social media to forecast closing prices. The analysis covers the period from January 2012 to December 2019. Sentiment scores were computed using the VADER and TextBlob lexicons. The empirical findings show that models incorporating sentiment scores enhance predictive accuracy. For example, incorporating daily average sentiment scores (v_avg and B_avg) into the models reduced the Mean Squared Error (MSE) from 81184 to 81129 and improved other metrics such as Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE), particularly at specific lag times like 8 and 76 days. These results emphasize the potential benefits of sentiment analysis for improving financial forecasting models. However, the study also acknowledges limitations related to the scope of data and the complexities of accurately measuring sentiment. Future research is encouraged to explore more sophisticated models and diverse data sources to further enhance and validate the integration of sentiment analysis in financial forecasting.
  • ItemRestricted
    Exploring Emoji Sentiment Roles in Arabic Textual Content on Digital Social Networks
    (Saudi Digital Library, 2024-07-09) Hakami, Shatha Ali A; Hendley, Robert; Smith, Phillip
    In today’s digital landscape, emoji have risen as pivotal elements in articulating sentiment, especially within the intricacies of the Arabic language. This thesis examines the various roles that emoji can play in expressing sentiment in Arabic texts, highlighting their relevance both in academic and real-world contexts. Beginning with foundational insights, our investigation retraces the history of emoji as important non-verbal communicative tools in human interaction. Then, we explore the distinct challenges of sentiment analysis in Arabic and refer to a thorough review of previous studies to frame our method, identifying both established techniques and unexplored opportunities. At the heart of our research is the understanding that, depending on the context, an emoji can adopt a wide variety of sentiment roles. These range from acting as an indicator, mitigator, emphasizer, reverser, releaser, or trigger of either negative or positive sentiment. Additionally, there are instances where an emoji simply maintains a neutral effect on the sentiment of the accompanying text. To achieve this, we gathered a large dataset, mainly from Twitter, and developed lexicons of words and emoji tailored for sentiment analysis in Arabic. These lexicons were the basis of our analysis model. By leveraging the insights gained from the emoji-roles sentiment lexicon and combining them with our established knowledge of the sentiment roles associated with specific emoji patterns, we make a significant improvement in the conventional sentiment classifier based on the emoji lexicon. Traditional methods often assign a static sentiment score to an emoji, failing to consider its varying roles in different textual contexts. Our refined approach corrects this oversight. Instead of considering a singular unchanging sentiment score for each emoji, the classifier dynamically retrieves sentiment scores based on the specific role the emoji plays within a given sentence. 
    In conclusion, we compare our method with other Arabic sentiment analysis tools, demonstrating the value of our approach, especially within nuanced linguistic phenomena such as sarcasm and humour. This thesis sets the foundation for future Arabic research in this expanding domain.

Copyright owned by the Saudi Digital Library (SDL) © 2026