SACM - United States of America
Permanent URI for this collection: https://drepo.sdl.edu.sa/handle/20.500.14154/9668
4 results
Search Results
Item Restricted
Cross Dataset Fairness Evaluation of Transformer Based Sentiment Models
(Saudi Digital Library, 2025-05-10) Zuiran, Sara; Bhattacharyya, Siddhartha
With the growing exploration of Natural Language Processing (NLP) systems in decision-making environments, it is essential to evaluate technical and ethical aspects of the dataset and the NLP model to improve fairness. To assess fairness, the thesis examines demographic imbalances in sentiment classification models by evaluating transformer-based models fine-tuned on the Stanford Sentiment Treebank version 2 dataset (SST-2) against the demographically annotated Comprehensive Assessment of Language Model dataset (CALM). This work identifies performance disparities in sentiment prediction across demographic groups by examining sensitive attributes such as gender and race. The study evaluates both the RoBERTa and MentalBERT transformer models using a complete set of fairness metrics consisting of Statistical Parity Difference (SPD), Equal Opportunity Difference (EOD), False Positive Rates (FPR), False Negative Rates (FNR), Jensen-Shannon Divergence (JSD), and Wasserstein Distance (WD). The analysis examines both group-vs-rest and pairwise subgroup comparisons, including gender and ethnicity. Results show that applying adversarial mitigation reduced fairness disparities across demographic subgroups, with the most notable improvements observed for non-binary and Asian users. The observed disparities emphasize the challenge of reducing performance gaps across demographic subgroups in sentiment classification tasks. The thesis introduces a practical framework for evaluating demographic disparities, extends fairness analysis, and assesses the impact of mitigation techniques in cross-dataset sentiment classification.
This research proposes a framework that demonstrates a path toward creating inclusive NLP systems and establishes the groundwork for upcoming ethical Artificial Intelligence (AI) studies.

Item Restricted
Developing a Generative AI Model to Enhance Sentiment Analysis for the Saudi Dialect
(Texas Tech University, 2024-12) Aftan, Sulaiman; Zhuang, Yu
Sentiment Analysis (SA) is a fundamental task in Natural Language Processing (NLP) with broad applications across various real-world domains. While Arabic is a globally significant language with several well-developed NLP models for its standard form, achieving high performance in sentiment analysis for the Saudi Dialect (SD) remains challenging. A key factor contributing to this difficulty is the lack of adequate SD datasets for training NLP models. To address this shortage, this study introduces a novel method for adapting a high-resource language model to a closely related but low-resource dialect by combining moderate effort in SD data collection with generative AI. AraBERT was then fine-tuned on a combination of collected SD data and additional SD data generated by GPT. The results demonstrate a significant improvement in SD sentiment analysis performance compared to an AraBERT model fine-tuned on the collected SD data alone. This work highlights an efficient way to generate high-quality datasets for fine-tuning a model trained on a high-resource language to perform well in a low-resource dialect. Leveraging generative AI reduces the effort required for data collection, making our approach a promising avenue for future research in low-resource NLP tasks.

Item Restricted
Exploring the Impact of Sentiment Analysis on Price Prediction
(Lehigh University, 2024-07) Zahhar, Abdulkarim Ali Y.; Robinson, Daniel P.
The integration of sentiment analysis into predictive models for financial markets, particularly Bitcoin, combines behavioral finance with quantitative analysis.
This thesis investigates the extent to which sentiment data, derived from social media platforms such as X (formerly Twitter), can enhance the accuracy of Bitcoin price predictions. A key idea in the study is that public sentiment, as shown on social media, affects Bitcoin’s market prices. The research uses linear regression models that combine Bitcoin’s opening prices with sentiment scores from social media to forecast closing prices. The analysis covers the period from January 2012 to December 2019. Sentiment scores were computed using the VADER and TextBlob lexicons. The empirical findings show that models incorporating sentiment scores enhance predictive accuracy. For example, incorporating daily average sentiment scores (v_avg and B_avg) into the models reduced the Mean Squared Error (MSE) from 81184 to 81129 and improved other metrics such as Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE), particularly at specific lag times such as 8 and 76 days. These results emphasize the potential benefits of sentiment analysis for improving financial forecasting models. However, the thesis also acknowledges limitations related to the scope of the data and the complexities of accurately measuring sentiment. Future research is encouraged to explore more sophisticated models and diverse data sources to further enhance and validate the integration of sentiment analysis in financial forecasting.

Item Restricted
Investigating The Use Of Social Media In Relation To Cognitive Disabilities From The Arab Region
(2023) Alshenaifi, Reem; Feng, Jinjuan; Nguyen, Nam
This dissertation reports studies on social media usage in relation to cognitive disabilities from the Arab region. The first study investigated how social media is used in supporting and empowering Saudi caregivers of children with cognitive disabilities. Through interviews with 13 caregivers, we examined their motivations and concerns as well as the role of social media during the COVID-19 pandemic.
The results suggest that caregivers used social media with caution to seek information and emotional support, to spread awareness, and to communicate and build communities. The findings also suggest that caregivers face many challenges related to security and privacy, social stigma and negative discussions, misinformation, and a lack of resources. In the second study, we utilized text mining approaches, including sentiment analysis and topic modeling, to examine and understand how Arab users engage with Twitter to discuss cognitive disabilities. Content volume, temporal evolution, users, sentiment, and topics discussed were analyzed. We applied Valence Aware Dictionary and sEntiment Reasoner (VADER) for sentiment analysis to identify the overall opinions and attitudes toward the researched neurological conditions. We also applied Latent Dirichlet Allocation (LDA) for topic modeling to discover frequent topics in the collected dataset. Additionally, Gephi was used to map the interaction between users discussing cognitive disabilities on Twitter. The results provide new insights into public perspectives, which may help interested entities construct and distribute appropriate resources and information. In this dissertation, we presented the analysis techniques, discussed the findings, provided recommendations to interested stakeholders, and introduced potential opportunities and future directions.
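
As an illustration of the user-interaction mapping mentioned in the last abstract, the following is a minimal sketch of how tweets might be turned into a weighted edge table that a graph tool such as Gephi can import. The abstract does not say how interactions were defined, so the sketch assumes an interaction is an @mention. The function names, the tweet dictionary layout (`user`, `text`), and the sample tweets are all hypothetical and are not taken from the dissertation.

```python
import csv
import io
import re
from collections import Counter

# Assumption: an "interaction" is one user @mentioning another.
MENTION = re.compile(r"@(\w+)")

def mention_edges(tweets):
    """Count directed (author, mentioned user) pairs across tweets."""
    edges = Counter()
    for tweet in tweets:
        author = tweet["user"].lower()
        for target in MENTION.findall(tweet["text"]):
            target = target.lower()
            if target != author:  # ignore self-mentions
                edges[(author, target)] += 1
    return edges

def edges_to_csv(edges):
    """Serialize the edges as a Source,Target,Weight table."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (src, dst), weight in sorted(edges.items()):
        writer.writerow([src, dst, weight])
    return buf.getvalue()

# Hypothetical sample data, not drawn from the dissertation's dataset.
sample = [
    {"user": "amal", "text": "Thanks @noor for sharing this autism resource"},
    {"user": "noor", "text": "@amal @sara happy to help"},
    {"user": "amal", "text": "@noor agreed"},
]
edges = mention_edges(sample)
print(edges_to_csv(edges))
```

Repeated mentions between the same pair are summed into the Weight column, so frequently interacting users can appear as heavier edges once the table is loaded into a graph tool. Retweets and replies could be counted the same way if those fields were available.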