Saudi Cultural Missions Theses & Dissertations
Permanent URI for this community: https://drepo.sdl.edu.sa/handle/20.500.14154/10
73 results
Search Results
Item Restricted
Automatic Detection and Verification System for Arabic Rumor News on Twitter (University of Technology Sydney, 2026-04) Karali, Sami; Chin-Teng, Lin
Language models have been extensively studied and applied in various fields in recent years. However, the majority of language models are designed for English and perform significantly better in it than in other languages, such as Arabic. The differences between English and Arabic in grammar, writing, and word formation pose significant challenges in applying English-based language models to Arabic content. Therefore, there is a critical need to develop and refine models and methodologies that can effectively process Arabic content. This research aims to address the gaps in Arabic language models by developing innovative machine learning (ML) and natural language processing (NLP) methodologies. We apply the developed model to Arabic rumor detection on Twitter to test its effectiveness. To achieve this, the research is divided into three fundamental phases: 1) efficiently collecting and pre-processing a comprehensive dataset of Arabic news tweets; 2) refining ML models through an Enhanced Convolutional Neural Network (ECNN) equipped with N-gram feature maps for accurate rumor identification; 3) improving decision-making precision in rumor verification via ensemble learning techniques. In the first phase, the research develops a methodology for collecting and pre-processing Arabic news tweets, aiming to establish a dataset optimized for rumor detection analysis. Using a blend of automated and manual processes, the research handles the intricacies of the Arabic language and improves the dataset’s quality for ML applications. This foundational phase removes irrelevant data and normalizes text, laying the groundwork for accuracy in subsequent detection tasks.
The second phase develops an Enhanced Convolutional Neural Network (ECNN) model, which incorporates N-gram feature maps for a deeper linguistic analysis of tweets. This ECNN model, designed specifically for the Arabic language, departs from traditional rumor detection models by combining spatial feature extraction with the contextual insights provided by N-gram analysis. Empirical results show the ECNN model’s superior performance, with a marked improvement in the accuracy and efficiency of detecting and classifying rumors. The final phase of the study explores the efficacy of ensemble learning methods in enhancing the robustness and accuracy of rumor detection systems. By combining the ECNN model with Long Short-Term Memory (LSTM), Bidirectional LSTM (BiLSTM), and Gated Recurrent Unit (GRU) networks within a stacked ensemble framework, the research develops a composite approach that significantly outperforms the individual models. The result is a state-of-the-art rumor verification system with higher accuracy in identifying rumors, as demonstrated by empirical testing and analysis. This research contributes to bridging the gap between English-centric language models and Arabic language processing, demonstrating the importance of tailored approaches for different languages in the field of ML and NLP.
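The abstract does not give the ECNN architecture, so the following is only a minimal sketch of what "N-gram feature maps" usually means in a text CNN: a filter spanning n consecutive token embeddings is slid over the tweet, and the strongest response is kept by max-over-time pooling. The token names, embedding size, random weights, and function names are all hypothetical and are not taken from the thesis.

```python
import random


def ngram_feature_map(embeddings, kernel, bias=0.0):
    """Slide one filter that spans len(kernel) consecutive tokens over the sequence."""
    n = len(kernel)  # filter width, i.e. the n-gram size
    feature_map = []
    for start in range(len(embeddings) - n + 1):
        window = embeddings[start:start + n]
        # Dot product between the filter and the n stacked token vectors.
        activation = bias + sum(
            w * x
            for kernel_row, token_vec in zip(kernel, window)
            for w, x in zip(kernel_row, token_vec)
        )
        feature_map.append(max(0.0, activation))  # ReLU
    return feature_map


def max_over_time(feature_map):
    """Keep the strongest n-gram response, wherever it occurs in the tweet."""
    return max(feature_map) if feature_map else 0.0


def extract_features(embeddings, kernels):
    """One pooled feature per filter; these would feed a classifier layer."""
    return [max_over_time(ngram_feature_map(embeddings, k)) for k in kernels]


# Hypothetical five-token tweet with random 4-dimensional embeddings.
rng = random.Random(0)
EMBED_DIM = 4
tokens = ["tok1", "tok2", "tok3", "tok4", "tok5"]
embeddings = [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in tokens]

# One filter each for bigrams, trigrams, and 4-grams (random, untrained weights).
kernels = [
    [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(n)]
    for n in (2, 3, 4)
]

features = extract_features(embeddings, kernels)
print(features)
```

In a trained model the filter weights would be learned and there would be many filters per n-gram size; the sketch only shows how filter width maps to n-gram length.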
These contributions mark a significant step forward in the field of Arabic NLP and ML and offer practical solutions for the real-world challenge of rumor proliferation on social media platforms, ultimately fostering a more reliable digital environment for Arabic-speaking communities.
Item Restricted
A Peer-to-Peer Federated Learning Framework for Intrusion Detection in Autonomous Vehicles (Lancaster University, 2024-09) Alotaibi, Bassam; Bradbury, Matthew
As autonomous vehicles (AVs) increasingly rely on interconnected systems for enhanced functionality, they also become more vulnerable to cyberattacks. This study introduces a decentralized peer-to-peer federated learning framework to improve intrusion detection in AV environments while preserving data privacy. A novel soft-reordering one-dimensional Convolutional Neural Network (SR-1CNN) is proposed as the detection engine, capable of identifying known and unknown threats with high accuracy. The framework allows vehicles to communicate directly in a mesh topology, sharing model parameters asynchronously, thus eliminating dependency on centralized servers and mitigating single points of failure. The SR-1CNN model was tested on two datasets, NSL-KDD and Car Hacking, under both independent and non-independent data distribution scenarios. The results demonstrate the model’s robustness: it achieves detection accuracies of 94.39% on the NSL-KDD dataset and 99.97% on the Car Hacking dataset in independent settings while maintaining strong performance in non-independent configurations. These findings underline the framework’s potential to enhance cybersecurity in AV networks by addressing data heterogeneity and preserving user privacy. This research contributes to the field of AV security by offering a scalable, privacy-conscious intrusion detection solution.
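The serverless parameter sharing described above can be illustrated with a simple gossip-averaging sketch. This is not the thesis's protocol: the abstract describes asynchronous exchange, whereas the sketch uses synchronous rounds for clarity, omits local training entirely, and uses a hypothetical four-vehicle ring with made-up two-element parameter vectors.

```python
def gossip_round(params, neighbours):
    """One synchronous round: each peer averages its parameters with its neighbours'."""
    updated = {}
    for peer, own in params.items():
        group = [own] + [params[n] for n in neighbours[peer]]
        # Element-wise mean over the peer's own vector and its neighbours' vectors.
        updated[peer] = [sum(values) / len(group) for values in zip(*group)]
    return updated


# Hypothetical mesh: four vehicles in a ring, each talking to two neighbours.
neighbours = {
    "v0": ["v1", "v3"],
    "v1": ["v0", "v2"],
    "v2": ["v1", "v3"],
    "v3": ["v2", "v0"],
}

# Made-up model parameters after local training on each vehicle.
params = {
    "v0": [0.0, 4.0],
    "v1": [4.0, 0.0],
    "v2": [8.0, 4.0],
    "v3": [4.0, 8.0],
}

for _ in range(20):
    params = gossip_round(params, neighbours)

print(params)
```

Because every peer in this ring weights itself and its two neighbours equally, the network-wide mean of the parameters is preserved and all peers drift toward it without any central server. A real deployment would interleave these exchanges with local training steps.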
Future work will focus on optimizing the SR-1CNN architecture, exploring vertical federated learning approaches, and validating the framework in real-world autonomous vehicle environments to ensure its practical applicability and scalability.
Item Restricted
Negative Mixture Models via Squaring Representation and Learning (University of Nottingham, 2024) Almhmadi, Samaher; Raykov, Yordan
The truth behind real-world data can be approached by measuring the uncertainty around the data. From a probabilistic view, uncertainty enters unsupervised learning through learning objectives defined over probability distributions and inference. Mixture models enhance the expressiveness of probability distributions and provide a general framework for clustering data by building more complex probability distributions. We begin with a discussion of mixture distributions and introduce the latent variable concept. Mixture types with respect to the number of components and their formulation are discussed, and some examples of Gaussian mixture models are presented. Mixture types with respect to the mixture coefficients are also discussed. We present the statistical inference problem of mixture models with different approaches, such as latent variable models, Markov chain Monte Carlo methods, and variational methods, along with several illustrative examples. Some concepts of probabilistic circuits (representation, formulation, and the corresponding inference) are also discussed. In this thesis, we apply probabilistic circuits to probabilistic inference. We also discuss how negative mixtures are represented as probabilistic circuits and structured as tractable computational graphs, how squared negative mixture models are represented as efficiently tensorized computational graphs, and how including negative parameters in this class of functions can reduce the model size.
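As a hedged illustration of the squaring idea, the sketch below builds a one-dimensional base mixture of Gaussians with one negative weight, squares it so the result is non-negative, and normalises it in closed form using the standard identity that the integral of N(x; a, A) N(x; b, B) over x equals N(a; b, A + B). The weights, means, variances, and function names are hypothetical and are not taken from the thesis.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def base_mixture(x, weights, means, variances):
    """Weighted sum of Gaussians; weights may be negative, so this can dip below zero."""
    return sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))


def normaliser(weights, means, variances):
    """Closed-form integral of the squared base mixture over the real line."""
    k = len(weights)
    return sum(
        weights[i] * weights[j]
        * normal_pdf(means[i], means[j], variances[i] + variances[j])
        for i in range(k)
        for j in range(k)
    )


def squared_mixture_pdf(x, weights, means, variances):
    """Valid density: the square is non-negative and the normaliser makes it integrate to 1."""
    return base_mixture(x, weights, means, variances) ** 2 / normaliser(
        weights, means, variances
    )


# Hypothetical two-component example: a wide positive bump minus a narrow one.
weights = [1.0, -0.8]
means = [0.0, 0.0]
variances = [4.0, 0.25]

for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(x, round(squared_mixture_pdf(x, weights, means, variances), 4))
```

Two components are enough here to carve a dip into a bump, a shape that a mixture with only positive weights would generally need more components to approximate. That is the sense in which negative parameters can reduce model size.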
Mixture models, and especially negative mixture models via squaring, are discussed as a way to learn the truth behind real data. Gaussian mixture models are applied in several branches of science, such as machine learning, data mining, pattern recognition, and statistical analysis, and Gaussian mixture models and negative Gaussian mixture models are an important subclass for learning from data. In this thesis, we focus on discussing these models in two cases, the positive and the negative case. To represent valid negative mixture models, we discuss a generic strategy for supporting negative parameters, called squaring a base mixture. This framework is then extended to probabilistic circuits. The main aim of this thesis is to discuss the inference problem in the framework of mixture models, as well as the basic role played in the inference problem by the positive mixture model and the negative weight mixture model, especially the standard Gaussian mixture model and the negative weight Gaussian mixture model. The thesis is organised in five chapters, described as follows. In Chapter 1, we discuss the motivation for mixtures and the mixture types, and present some standard mixture models. In Chapter 2, we discuss mixture types with respect to their coefficients. When some, but not all, of the mixture coefficients take negative values, the mixture model is called a negative weight mixture model. This chapter also presents the statistical inference problem of mixture models with different approaches, such as latent variable models, Markov chain Monte Carlo (MCMC) methods, and variational methods. In Chapter 3, we discuss the important ideas around the problem of probabilistic inference. We discuss the classes of queries that compute quantities of interest of a probability distribution and that make a family of probabilistic models tractable. Different illustrative examples are presented.
Probabilistic circuits, their representation, and inference are discussed. The chapter ends with a discussion of negative mixture models (MMs) via squaring and of representing negative MMs as probabilistic circuits. In Chapter 4, we discuss Gaussian mixture models used to represent subpopulations within an overall population, and how Gaussian mixtures constitute a form of unsupervised learning. In the second part, we discuss negative weight Gaussian mixture models, whose negative coefficients make them more expressive than Gaussian mixture models by reducing the number of components and parameters. A comparison between the standard Gaussian mixture model and the negative weight Gaussian mixture model is also carried out on a real example. In Chapter 5, we discuss the important contributions of positive and negative weight mixture models, especially positive and negative weight Gaussian mixture models, as well as future work that can be developed in the mixture framework.
Item Restricted
Utilizing Data Analytics for Fraud Detection and Prevention in Online Banking Systems of Saudi Arabia (University of Portsmouth, 2024-09) Almotairy, Yazeed; Jiacheng, Tan
This thesis addresses the critical issues of online banking and online banking fraud in Saudi Arabia. The thesis focusses on the older methodologies of online banking systems in Saudi Arabia. It discusses in detail the frauds that occur in online banking systems and cause inconvenience to the users and account holders of online banks and applications. Online banking frauds are discussed thoroughly, traditional fraud detection methods are described, and the vulnerabilities in current systems are explored. The thesis discusses how the older systems underperform and why the new system draws on data analytics and machine learning.
The proposed methods use a set of data analytics and machine learning algorithms and techniques to detect fraud or any fraudulent activity that a scammer or fraudster may perform. The results of this study explain how the proposed system can outperform the traditional methodologies being used in Saudi Arabian online banking systems. The proposed system can also enhance the user experience. The possible privacy and ethical concerns are also discussed. Finally, the thesis discusses future prospects for researchers who are looking to extend this research or want to work in the field of data analytics and machine learning to improve the security of online banking applications. In conclusion, this thesis not only contributes to the body of knowledge on online banking frauds in Saudi Arabia and their detection but also features future research topics for new researchers.
Item Restricted
Detecting LLM Generated Phishing Emails Using Machine Learning: A Multi-Classification Approach And A Comprehensive Evaluation (University of Birmingham, 2024-09) Alharthi, Alanoud; Andriotis, Panagiotis
Phishing is a significant cybersecurity threat that targets organisations as well as individuals. The aim of this project is to provide a comprehensive machine learning model that can accurately detect LLM generated phishing in a dataset of four different classes of emails: LLM phishing, LLM non-phishing, Human phishing and Human non-phishing. This balanced and diverse dataset of 4000 emails acts as a real-world representation of the different types of emails that are sent daily and that have distinct features, allowing the classes of the dataset to be differentiated accurately by their features. The five machine learning algorithms used for this research are: Decision Tree, Support Vector Machine (SVM), Random Forest, Gradient Boost and K-Nearest Neighbours (KNN).
The performance of these algorithms was evaluated both before and after hyperparameter tuning. Before tuning, the SVM achieved the highest accuracy, 97.3%. The next most accurate models were Random Forest (96.9%), KNN (96.8%) and Gradient Boosting (96.7%). The Decision Tree achieved the lowest accuracy, 90.7%. Hyperparameter tuning was then applied to the models and their performance was re-evaluated to investigate whether tuning enhanced it. Other metrics such as precision, recall and F1-score were also measured. The developed and trained models were then integrated with a web page developed using Streamlit to provide a user-friendly interface for classifying emails. Overall, this research aims to provide a framework for detecting LLM phishing emails. The results of this research signify that with the correct methodologies, we can enhance the detection of LLM generated phishing, contributing to robust defences against emerging cyber threats.
Item Restricted
An Ontology-based Framework for the Modelling and Online Detection of Obsessive Compulsive Disorder (Cardiff University, 2024-11) muhajab, Areej; Abdelmoty, Alia
In the contemporary digital landscape, the prevalence and impact of Obsessive-Compulsive Disorder (OCD) discourse in online platforms have garnered increasing significance. This thesis presents an integrated framework aimed at detecting and classifying OCD in online discourse by harnessing the synergy between ontology development and machine learning. The primary objective is to enhance the understanding and identification of OCD-related content within the vast and varied landscape of online forums. The research begins with the construction of a comprehensive ontology, named OCD, specifically designed to encapsulate the multifaceted aspects of OCD.
This ontology is developed to represent the complex interplay of OCD symptoms, behaviors, and related mental health concepts. Drawing upon insights from medical literature, psychological studies, and existing biomedical ontologies, the OCD ontology provides a structured, hierarchical representation of OCD, enabling systematic identification and categorisation of OCD-related terms. Consequently, it furnishes a rich semantic framework that facilitates accurate interpretation of online discourse. In addition to ontology development, the thesis explores machine learning methodologies, particularly focusing on the classification of OCD-related posts on online platforms. A variety of classification models are employed to analyse and categorise online content. Leveraging the OCD ontology as a foundational reference for feature extraction and semantic analysis, these models are trained and evaluated on a corpus of OCD forum posts. The classification process is designed to discern various OCD manifestations, such as obsessions and compulsions, thereby offering a granular understanding of the disorder’s portrayal in digital communication. The outcomes of this thesis carry significant implications for mental health professionals, online community moderators, and researchers. The developed framework and methodologies represent a pioneering tool for monitoring, understanding, and addressing OCD in the digital space.
Item Restricted
Toward a Better Understanding of Accessibility Adoption: Developer Perceptions and Challenges (University Of North Texas, 2024-12) Alghamdi, Asmaa Mansour; Stephanie, Ludi
The primary aim of this dissertation is to explore the challenges developers face in interpreting and implementing accessibility in web applications. We analyze developers’ discussions on web accessibility to gain a comprehensive understanding of the challenges, misconceptions, and best practices prevalent within the development community.
As part of this analysis, we built a taxonomy of accessibility aspects discussed by developers on Stack Overflow, identifying recurring trends, common obstacles, and the types of disabilities associated with the features addressed by developers in their posts. This dissertation also evaluates the extent to which developers on online platforms engage with and deliberate upon accessibility issues, assessing their awareness and implementation of accessibility standards throughout the web application development process. Given the volume and variety of these discussions, manual analysis alone would be insufficient to capture the full scope of accessibility challenges. Therefore, we employed supervised machine learning techniques to classify these posts based on their relevance to the different principles of the WCAG 2.2 guidelines. By training our models on labeled data, we were able to automatically detect patterns and keywords that indicate specific accessibility issues, even when the language used by developers is not directly aligned with the official guidelines. The results emphasize developers’ struggles with complex accessibility issues, such as time-based media customization and screen reader configuration. The findings indicate that machine learning holds significant potential for enhancing compliance with accessibility standards, providing a pathway for more efficient and accurate adherence to these guidelines.
Item Restricted
Enhancing DDoS attack Detection using Machine Learning and Deep Learning Models (University of Warwick, 2023-09-26) AlObaidan, Fatimah; Raza, Hassan
Technology has become an essential part of our daily lives, indispensable for both individuals and enterprises. It facilitates the exchange of an extensive range of information across different spaces. However, Internet security is a critical challenge in today's digital age with growing dependence on IT services.
Thus, various network environments can be vulnerable to attacks, causing resource depletion and hindering support for legitimate users. One of these attacks is the Distributed Denial of Service (DDoS) attack. By its nature, this type of attack impacts the availability of the system. The impact on confidentiality arises primarily because threat actors use DDoS as a method to create chaos whilst launching cyber-attacks on other parts of the infrastructure. Therefore, DDoS attacks require sharper focus from a research perspective. Network intrusion detection systems (NIDSs) are important tools for monitoring the network environment and detecting DDoS attacks. However, NIDS tools suffer from several limitations, such as difficulty detecting new attacks and misclassifying attacks. Therefore, Machine Learning (ML) and Deep Learning (DL) models are increasingly being used for automated detection of DDoS attacks. While several related works deployed ML for NIDS, most of these approaches ignore appropriate pre-processing and the overfitting problem during the implementation of ML algorithms. This can impact the robustness of the anomaly detection system and lead to poor model performance for zero-day attacks. In this research study, the researcher proposes a new ML and DL approach based on hybrid feature selection and appropriate pre-processing operations to classify network flows as normal or DDoS attacks. The experimental results suggest that the proposed lightweight models are efficient and reliable, achieving a high detection rate while minimising detection time with fewer features. This project complies with the following two CyBOK skills areas: Network Security: The project evaluates the network security and introduces efficient, lightweight models for DDoS attack detection.
Security Operations and Incident Management: The project enhances incident management capabilities by crafting ML models that monitor network flows within NIDS.
Item Restricted
Online conversations: A study of their toxicity (University of Illinois Urbana-Champaign, 2024) Alkhabaz, Ridha; Sundaram, Hari
Social media platforms are essential spaces for modern human communication. There is a dire need to make these spaces as welcoming and engaging as possible for their participants. A potential threat to this need is the propagation of toxic content in online spaces. Hence, it becomes crucial for social media platforms to detect early signs of a toxic conversation. In this work, we tackle the problem of toxicity prediction by proposing a definition for conversational structures. This definition empowers us to provide a new framework for toxicity prediction. We examine more than 1.18 million threads on X, formerly known as Twitter, made by 4.4 million users, to provide a few key insights about the current state of online conversations. Our results indicate that most X threads do not exhibit a conversational structure. Also, our newly defined structures are distributed differently than was previously thought of online conversations. Additionally, our definitions give models a meaningful signal for starting to predict the future toxicity of online conversations. We also show that message-passing graph neural networks outperform state-of-the-art gradient-boosting trees for toxicity prediction. Most importantly, we find that once we observe the first two terminating conversational structures, we can predict the future toxicity of online threads with ≈88% accuracy.
We hope our findings will help social media platforms better curate content in their spaces and promote more conversations in online spaces.
Item Restricted
A Quality Model to Assess Airport Services Using Machine Learning and Natural Language Processing (Cranfield University, 2024-04) Homaid, Mohammed; Moulitsas, Irene
In the dynamic environment of passenger experiences, precisely evaluating passenger satisfaction remains crucial. This thesis is dedicated to the analysis of Airport Service Quality (ASQ) by analysing passenger reviews through sentiment analysis. The research aims to investigate and propose a novel model for assessing ASQ through the application of Machine Learning (ML) and Natural Language Processing (NLP) techniques. It utilises a comprehensive dataset sourced from Skytrax, incorporating both text reviews and numerical ratings. The initial analysis presents challenges for traditional and general NLP techniques when applied to specific domains, such as ASQ, due to limitations like general lexicon dictionaries and pre-compiled stopword lists. To overcome these challenges, a domain-specific sentiment lexicon for airport service reviews is created using the Pointwise Mutual Information (PMI) scoring method. This approach involved replacing the default VADER sentiment scores with those derived from the newly developed lexicon. The outcomes demonstrate that this specialised lexicon for the airport review domain substantially exceeds the benchmarks, delivering consistent and significant enhancements. Moreover, six unique methods for identifying stopwords within the Skytrax review dataset are developed. The research reveals that employing dynamic methods for stopword removal markedly improves the performance of sentiment classification. Deep learning (DL), especially using transformer models, has revolutionised the processing of textual data, achieving unprecedented success.
Therefore, novel models are developed through the meticulous development and fine-tuning of advanced deep learning models, specifically Bidirectional Long Short-Term Memory (BiLSTM) and Bidirectional Encoder Representations from Transformers (BERT), tailored for the airport services domain. The results demonstrate superior performance, highlighting the BERT model's exceptional ability to seamlessly blend textual and numerical data. This progress marks a significant improvement upon the current state-of-the-art achievements documented in the existing literature. To encapsulate, this thesis presents a thorough exploration of sentiment analysis, ML and DL methodologies, establishing a framework for the enhancement of ASQ evaluation through detailed analysis of passenger feedback.
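The abstract above names PMI scoring for the domain-specific lexicon but does not give the formula. One common formulation, assumed here, scores a word as PMI(word, positive) minus PMI(word, negative), which simplifies to the log-ratio of the word's smoothed frequency in positive and negative reviews. The toy reviews, the add-one smoothing, and the function name below are hypothetical.

```python
import math
from collections import Counter


def pmi_lexicon(reviews, smoothing=1.0):
    """Score each word as PMI(word, pos) - PMI(word, neg).

    reviews: list of (text, label) pairs with label "pos" or "neg".
    The difference of the two PMI terms reduces to log2(P(word|pos) / P(word|neg)).
    """
    word_counts = {"pos": Counter(), "neg": Counter()}
    for text, label in reviews:
        word_counts[label].update(text.lower().split())

    totals = {label: sum(counts.values()) for label, counts in word_counts.items()}
    vocab = set(word_counts["pos"]) | set(word_counts["neg"])

    lexicon = {}
    for word in vocab:
        # Additive smoothing keeps words seen in only one class from producing log(0).
        p_pos = (word_counts["pos"][word] + smoothing) / (
            totals["pos"] + smoothing * len(vocab)
        )
        p_neg = (word_counts["neg"][word] + smoothing) / (
            totals["neg"] + smoothing * len(vocab)
        )
        lexicon[word] = math.log2(p_pos / p_neg)
    return lexicon


# Toy airport reviews, invented for illustration only.
reviews = [
    ("clean terminal friendly staff", "pos"),
    ("friendly staff fast security", "pos"),
    ("long queue rude staff", "neg"),
    ("dirty terminal long queue", "neg"),
]

lexicon = pmi_lexicon(reviews)
for word in sorted(lexicon, key=lexicon.get, reverse=True):
    print(f"{word:10s} {lexicon[word]:+.3f}")
```

Scores like these could then stand in for the default entries of a general-purpose sentiment lexicon, which is the substitution the abstract describes. Any rescaling needed to match a particular tool's score range is left out of the sketch.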