Saudi Cultural Missions Theses & Dissertations

Permanent URI for this community: https://drepo.sdl.edu.sa/handle/20.500.14154/10

  • ItemRestricted
    Graph Neural Networks for Drug Screening
    (Saudi Digital Library, 2025) Aqeeli, Noura Eissa; Panas, Daga
    Drug discovery is a lengthy and costly process that often involves small, noisy, and imbalanced datasets. In our study, we investigate the use of graph neural networks (GNNs) for predicting molecular homeostatic activity in neuronal cells through transfer learning. We evaluate Graph Convolutional Networks (GCNs) and Message Passing Neural Networks (MPNNs) with transfer learning, comparing their performance to Random Forest and non-transfer GNN baselines. To guide the selection of source datasets for pre-training, we implement a molecular latent representation similarity framework across nine MoleculeNet datasets. Additionally, we fine-tune a foundational molecular model on our target dataset. We evaluate the models using five-fold cross-validation, using the Area Under the Receiver Operating Characteristic curve (AUC-ROC) and the Area Under the Precision-Recall curve (AUC-PR) as metrics. Our results indicate that transferring knowledge from high-similarity source datasets outperforms the baseline models. Moreover, source-to-target transfer is more effective than fine-tuning the foundation model; however, the foundation model exhibits superior generalisation capabilities. Finally, we employ a selected set of models to rank an unlabelled molecular dataset. Our findings demonstrate that GNNs, combined with similarity-guided transfer learning, enhance performance in predicting bioactivity within low-data and imbalanced settings, highlighting the importance of carefully selecting source datasets to avoid negative transfer.
  • ItemRestricted
    Multi-Omics Approaches to Explore Vancomycin Treatment Mechanism in Patients with Primary Sclerosing Cholangitis (PSC) - Inflammatory Bowel Disease (IBD)
    (Saudi Digital Library, 2025) AlOmar, Haneen; Acharjee, Animesh
    Introduction: Primary sclerosing cholangitis (PSC) is a comorbid condition associated with inflammatory bowel disease (PSC-IBD) that lacks effective treatments beyond liver transplantation. Although oral vancomycin (OV) has shown therapeutic promise, disease activity often returns after treatment withdrawal. This study aims to investigate the mechanisms of OV in PSC-IBD patients, supporting the development of more durable and targeted therapies. Method: Paired multi-omics data from 15 patients before and after OV treatment were analysed. The datasets included RNA-Seq, metatranscriptomics, bile acid metabolites, and 16S rRNA. After preprocessing, feature selection was performed using LASSO, ElasticNet, and Boruta-RF. Selected features were analysed in two complementary ways. First, intersected features, those identified by all three models, were assessed for their predictive robustness and integrated into correlation network graphs. Second, union features were subjected to pathway enrichment analysis to elucidate their biological significance. Results: The three models jointly selected 13, 2, 4, and 3 intersected features from RNA-Seq, metatranscriptomics, bile acid metabolites, and 16S rRNA, respectively. These features achieved predictive performance comparable or superior to that of the full datasets. For example, intersected features outperformed the full dataset in metatranscriptomics, where Boruta-RF achieved a higher AUC (0.936 vs. 0.896), demonstrating the robustness and efficiency of the selected features. Pathway enrichment analysis of union features in each omics layer revealed pathways related to mucosal healing, metabolism, and immune modulation. Correlation network graphs showed OV-induced cross-omics alterations between the pre- and post-treatment states. Conclusion: Based on paired data from only 15 patients, this study provided a comprehensive multi-omics perspective on OV’s impact in PSC-IBD patients and identified robust biomarkers.
We also uncovered novel host–microbiome interactions not previously reported, highlighting potential targets for future therapies. While findings are promising, they require validation in larger, independent cohorts.
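    The intersected and union feature sets described in this abstract can be illustrated with plain set operations. This is a minimal sketch, not the thesis code: the feature names and the per-model selections below are hypothetical placeholders, and only the three selector names come from the abstract.

```python
# Hypothetical selections; the feature names are invented for illustration only.
selected = {
    "LASSO": {"geneA", "geneB", "geneC"},
    "ElasticNet": {"geneA", "geneB", "geneD"},
    "Boruta-RF": {"geneA", "geneB", "geneE"},
}


def intersect_and_union(selections):
    """Return (features chosen by every model, features chosen by at least one model)."""
    sets = list(selections.values())
    intersected = set.intersection(*sets)  # robust core shared by all selectors
    union = set.union(*sets)               # broader pool, e.g. for pathway enrichment
    return intersected, union


intersected, union = intersect_and_union(selected)
print(sorted(intersected))  # ['geneA', 'geneB']
print(sorted(union))        # ['geneA', 'geneB', 'geneC', 'geneD', 'geneE']
```

    The intersection is the stricter of the two sets, which is consistent with the small per-omics counts (13, 2, 4, and 3) reported above.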
  • ItemRestricted
    Predicting Osteoarthritis in Older Adults Using Literature-Based, Non-Invasive Risk Factors: A Cross-Sectional Analysis of ELSA Wave 9
    (Saudi Digital Library, 2025) Fnais, Tesneem; Yang, Hui
    Osteoarthritis (OA) is a prevalent joint disorder in older adults that is often diagnosed at a later stage, as clinical assessments typically rely on imaging and laboratory tests that are not readily accessible in all settings. This study aimed to develop and evaluate machine learning models that predict OA using non-invasive, self-reported features from Wave 9 of the English Longitudinal Study of Ageing (ELSA). A total of 4,723 participants aged 60 and above were included. An initial set of 32 features was selected based on existing literature and refined through a structured feature selection pipeline, resulting in a final set of 25 features, including joint pain and mobility limitations. Four supervised models (Logistic Regression, Random Forest, XGBoost, and CatBoost) were trained using a stratified train-test split and resampling to address class imbalance. The upsampled logistic regression model achieved the highest sensitivity (0.769) and strong overall performance (AUC = 0.755), while CatBoost showed the highest specificity (0.759) and an AUC of 0.747. A reduced logistic regression model using only the top 15 features retained similar accuracy and AUC. These findings demonstrate that OA can be predicted without imaging or biomarkers. The resulting models, particularly the logistic regression model, offer promise as cost-effective screening tools to support early identification and guide decisions about further clinical assessment, making them well-suited for primary care and digital health settings, especially where resources are limited.
  • ItemRestricted
    Evaluating Machine Learning for Intrusion Detection in CAN Bus for in-Vehicle Security
    (Saudi Digital Library, 2025) Alfardus, Asma; Rawat, Danda
    The past decade has seen a rise in the automobile industry accompanied by serious challenges and threats. Increased demand for intelligent transportation system facilities has driven a boom in the automotive industry. A safer and better experience is much sought from vehicles. This opens opportunities for including autonomous vehicles and Vehicle-to-Everything technologies in the automotive sector. Enabling vehicles to connect to various services exposes them to compromise and misuse by adversaries. There are numerous electronic devices in the modern vehicle which communicate with each other using multiple standard communication protocols. State-of-the-art vehicles are assemblies of complex mechanical devices with the sophisticated technology of electronic devices and connections to the external world. Controller Area Network (CAN) is one of the most widely used protocols for in-vehicle communications. However, the lack of fundamental security features such as encryption and authentication in CAN makes it vulnerable to security attacks. The backbone connecting autonomous vehicles is CAN, which has limited bandwidth and is exposed to unauthorized access. Various attacks compromise the confidentiality, integrity, and availability of vehicular data through intrusions which may endanger the physical safety of vehicles and passengers. These security shortcomings can therefore lead to accidents and financial loss for vehicle users. To protect in-vehicle electronic devices, researchers have proposed several security countermeasures. In this work, we discuss various security vulnerabilities of CAN and potential solutions to them. Further, a machine learning-based approach is developed to devise an Intrusion Detection System for the CAN bus network. This study aims to explore the adaptability of the proposed intrusion detection system across diverse vehicular architectures and operational conditions.
Furthermore, the findings contribute to advancing the state-of-the-art in automotive cybersecurity, fostering safer and more resilient transportation ecosystems. Moreover, the study investigates the scalability of the intrusion detection system to handle the increasing complexity and volume of data generated by modern vehicles.
  • ItemRestricted
    Towards Industrially Adoptable Generation Invariant Reprocessable Polydicyclopentadiene Thermoset Plastics
    (Saudi Digital Library, 2025-05) Alfaraj, Yasmeen; Johnson, Jeremiah
    The industrial transition to sustainable polymer technologies necessitates novel end-of-life approaches for historically un-recyclable thermoset plastics. Polydicyclopentadiene (pDCPD), a high-performance thermoset known for its superior mechanical and thermal properties, represents a compelling target for sustainability-oriented innovation due to its established industrial use, diverse manufacturing methods, historic challenges in reprocessing, and an increased interest from its relevant industries in recovering valuable fillers and reinforcing materials from pDCPD carbon-fiber-reinforced polymers (CFRPs). Recent reports demonstrate the ability to deconstruct pDCPD through a cleavable comonomer (CC) approach; however, we currently lack cost-effective strategies for scaling its deconstruction and recycling. This thesis addresses the fundamental barriers to industrial implementation of deconstructable pDCPD thermosets through a comprehensive, three-pronged approach that integrates data-driven molecular design, drop-in strategies for multigenerational recyclability, and cost-informed evaluation of CCs. In the first part of this work, a closed-loop experimental–computational platform is developed to predict glass transition temperatures (Tg) in deconstructable pDCPD networks incorporating bifunctional silyl ether (BSE) CCs and cleavable cross-linkers. Leveraging a curated dataset of 101 compositionally diverse pDCPD-based thermosets, machine learning model ensembling and strong regularization techniques are implemented to mitigate overfitting and quantify predictive uncertainty. Experimental validation of model predictions shows that the resulting models achieve accurate Tg predictions for variable CC and cleavable cross-linker loadings, novel CCs, and previously unseen related classes of strand-cleaving cross-linkers. This chapter demonstrates the viability of predictive informatics in navigating the vast chemical and compositional space of deconstructable thermosets.
The second segment presents a minimally chemically intensive, drop-in strategy for pDCPD recyclability. Using cleavable BSE comonomers and cross-linkers, networks with up to 20 wt% recycled oligomeric fragments are synthesized and evaluated. These materials exhibit thermomechanical properties and deconstructability that remain invariant across three generations of recycling. Furthermore, the incorporation of a cleavable cross-linker, dimethyl di-dicyclopentadiene silyl ether (DDMS), not only preserves but enhances bulk properties such as Tg in virgin and recycled samples, and addresses issues of oligomer incorporation in recycled samples as evidenced by gel fraction analysis. The ability to maintain and tune materials properties without post-processing or structural reformulation underscores the industrial potential of the drop-in CC approach for scalable, circular thermoset manufacturing. The final component of the thesis evaluates MeSi7, a seven-membered BSE CC, as a low-cost, synthetically accessible, and possibly scalable alternative to existing CCs. Thermodynamic polymerization parameters and CC performance under industrial thermoset cure conditions are assessed. We find that high-temperature cure conditions enable sufficient incorporation into the pDCPD network strands for deconstruction with as low as 5 mol% loading of MeSi7. These samples retain Tg values above 100 °C, with a moderate reduction relative to non-deconstructable analogues. Assessment of performance in industrial formulations also shows comparable deconstructability thresholds and modest impact on Tg. Importantly, MeSi7 is projected to cost less than 2% of iPrSi8 based on raw material pricing, offering a highly attractive economic profile for broader market applications. 
Together, these contributions deliver a framework for the rational design, performance prediction, and techno-economic evaluation of cleavable, recyclable thermosets through a convergence of data science, molecular design, and systems-level engineering considerations.
  • ItemRestricted
    SEVERITY GRADING AND EARLY DETECTION OF ALZHEIMER’S DISEASE THROUGH TRANSFER LEARNING
    (Saudi Digital Library, 2025) Alqahtani, Saeed; Zohdy, Mohamed
    Alzheimer’s disease (AD) is a neurological disorder that predominantly affects individuals aged 65 and older. It is one of the primary causes of dementia, and it contributes significantly and progressively to impairing and destroying brain cells. Recently, efforts to mitigate the impact of AD have placed particular emphasis on early detection through computer-aided diagnosis (CAD) tools. This study aims to develop deep learning models for the early detection and classification of AD cases into four categories: non-demented, moderate-demented, mild-demented, and very-mild-demented. Using transfer learning (TL), several models were implemented, including AlexNet, ResNet-50, GoogleNet (InceptionV3), and SqueezeNet, by leveraging magnetic resonance images (MRI) and applying image augmentation techniques. A total of 12,800 images across the four classes were preprocessed to ensure balance and meet the specific requirements of each model. The dataset was split into 80% for training and 20% for testing. AlexNet achieved an average accuracy of 98.05%, GoogleNet (InceptionV3) reached 97.80%, ResNet-50 attained 91.11%, and SqueezeNet 86.37%. The use of transfer learning addresses data limitations, allowing effective model training without the need to build models from scratch, thereby enhancing the potential for early and accurate diagnosis of Alzheimer’s disease [1].
  • ItemRestricted
    Improving Sleep Health with Deep Learning: Automated Classification of Sleep Stages and Detection of Sleep Disorders
    (Saudi Digital Library, 2024-07-07) Almutairi, Haifa; Datta, Amitava
    Sleep consumes roughly one-third of a person’s lifetime, and it is characterized by distinct stages within the sleep cycle. The sequence of these stages at night provides insights into the quality of sleep. Poor sleep quality can have numerous consequences, including drowsiness, reduced concentration, and fatigue. Beyond sleep quality, an analysis of the sequence of sleep stages can uncover the presence of sleep disorders. This thesis focuses on three key research problems related to sleep. Firstly, it addresses the classification of sleep stages using a combination of signals and deep learning models. Sleep is categorized into five distinct stages, namely Wake (W), the non-rapid eye movement (NREM) stages N1, N2, and N3, and the rapid eye movement (REM) stage. Throughout the duration of sleep, individuals experience multiple cycles of sleep stages. Each cycle contains a standard allocation of each stage. An unbalanced distribution of the stages can indicate the presence of sleep disorders. Previous studies primarily classified sleep stages using a single channel of electroencephalography (EEG) signals. However, incorporating a combination of signals from electromyography (EMG) and electrooculography (EOG) alongside EEG data provides additional features. These features are extracted from muscle activity and eye movements during sleep, thereby enhancing classification accuracy. In this thesis, a robust model called SSNet is proposed to accurately classify sleep stages from a fusion of EEG, EMG, and EOG signals. This model combines convolutional neural networks (CNNs) and long short-term memory (LSTM) networks to extract the salient features from various physiological signals. The CNN architecture extracts spatial features from the input signals, while the LSTM architecture captures the temporal features present in the signals.
This study has obtained encouraging outcomes in the classification of sleep stages through the fusion of physiological signals and deep learning techniques. Secondly, this thesis aims to detect obstructive sleep apnoea (OSA) from electrocardiography (ECG) signals using deep learning methods. Sleep-disordered breathing (SDB) is categorized into three different types: OSA, central sleep apnoea, and mixed sleep apnoea. OSA is the most common form of SDB and is characterized by repeated interruptions in breathing during sleep, leading to fragmented sleep patterns and various health complications. Previous studies developed feature engineering methods and machine learning models for the detection of OSA. Feature engineering methods involve crafting relevant features to feed into machine learning models. However, feature engineering is time-consuming and requires domain expertise. In contrast, deep learning automatically extracts features from ECG signals for OSA detection, eliminating the need for manual feature engineering. In this thesis, three deep learning architectures are proposed: a standalone convolutional neural network (CNN), a CNN with long short-term memory (LSTM), and a CNN with a gated recurrent unit (GRU). Through rigorous experimentation and evaluation, the combined CNN and LSTM architecture is identified as the best-performing model for OSA detection. To further enhance the architecture’s performance, the hyperparameters of the CNN and LSTM models were tuned and tested over a large dataset to validate their effectiveness. The third research problem addressed in this thesis is the detection of periodic leg movements (PLM) and SDB during the NREM stage using a combination of signals and deep learning models. PLM is characterized by involuntary leg movements during sleep. These movements can disrupt sleep and result in daytime sleepiness with reduced quality of life.
Detecting PLM and SDB events during the NREM stage allows for quantifying the severity of sleep disorders. Previous studies have focused on the development of signal-based models for detecting PLM or SDB. However, those models lacked the ability to distinguish these events within specific sleep stages. To address this problem, a novel deep learning architecture known as DeepSDBPLM is proposed. This architecture aims to detect PLM and SDB events during the NREM stage. It incorporates novel input features called attention EMDRaw signals and utilizes a Residual Convolutional Neural Network (ResCNN) model. This thesis presents experimental results using publicly available datasets to evaluate the performance of the proposed deep learning models for the classification of sleep stages and the detection of sleep disorders. The models were evaluated using standard metrics, including accuracy, sensitivity, specificity, and F1 score. The empirical results establish the effectiveness of the proposed approaches. The models can be a stepping stone towards more advanced techniques.
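    The four evaluation metrics named in this abstract follow directly from confusion-matrix counts. The sketch below assumes the binary (or one-vs-rest) case with non-zero denominators; the function name and the example counts are hypothetical and are not results from the thesis.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity, specificity, and F1 from confusion counts.

    Assumes every denominator is non-zero (at least one positive, one negative,
    and one positive prediction); a production version would guard against that.
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # recall: share of true events detected
    specificity = tn / (tn + fp)   # share of non-events correctly rejected
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Made-up counts for illustration only.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

    For a five-class problem such as sleep staging, the same counts would typically be taken per class in a one-vs-rest fashion and then averaged; which averaging scheme the thesis uses is not stated in the abstract.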
  • ItemRestricted
    Exploring Nonlinear Associations and Interactions of Risk Factors for Breast Cancer Incidence Using Machine Learning Approaches
    (Imperial College London, 2024-08) Alqarni, Lina; Heath, Alicia; Berrington, Amy
    BACKGROUND: Breast cancer is influenced by a complex array of risk factors. This study aimed to identify nonlinear associations and interactions between various risk factors and breast cancer incidence using computationally efficient, interpretable methods. METHODS: Data from the Generations Study, a long-term prospective cohort of 104,423 women, were analysed. Risk factors evaluated included demographic, medical, reproductive, hormonal, and lifestyle variables. We compared the performance of traditional Cox proportional hazards models with tree-based methods, including Classification and Regression Trees (CART) and random forests, using the C-statistic. SHapley Additive exPlanations (SHAP) values were extracted to interpret random forest outputs, highlighting key risk factors and interactions. Stability selection was applied to enhance computational efficiency and identify the most stable and important variables. RESULTS: The multivariable Cox model achieved the highest predictive accuracy with a C-index of 0.657, slightly outperforming the random forest model (C-index of 0.650). However, the random forest model revealed nonlinear associations and interactions not captured by the Cox model. Age, family history of breast cancer, and benign breast disease were among the most critical factors identified, with complex interactions noted between age, body mass index at entry, and family history with other risk factors such as hormone replacement therapy duration, oral contraceptive duration, and smoking pack-years. Stability selection effectively reduced the number of variables without compromising model performance. CONCLUSIONS: While linear models capture dominant associations, tree-based models like random forests offer additional insights into complex, nonlinear relationships among breast cancer risk factors, highlighting the potential for more personalised screening and prevention strategies.
  • ItemRestricted
    OPTIMIZING INTRUSION DETECTION IN IOT NETWORK ENVIRONMENTS THROUGH DIVERSE DETECTION TECHNIQUES
    (Florida Atlantic University, 2025-03-11) Al Hanif, Abdulelah; Ilyas, Mohammad
    The rapid proliferation of Internet of Things (IoT) environments has revolutionized numerous areas by facilitating connectivity, automation, and efficient data transfer. However, the widespread adoption of these devices poses significant security risks. This is primarily due to insufficient security measures within the devices and inherent weaknesses in several communication network protocols, such as the Message Queuing Telemetry Transport (MQTT) protocol. MQTT is recognized for its lightweight and efficient machine-to-machine communication characteristics in IoT environments. However, this flexibility also makes it susceptible to significant security vulnerabilities that can be exploited. It is necessary to identify and counter these risks and protect IoT network systems by developing effective intrusion detection systems (IDS) that detect attacks with high accuracy. This dissertation addresses these challenges through several contributions. The first approach concentrates on improving IoT traffic detection efficiency by utilizing a balanced binary MQTT dataset. This involves effective feature engineering to select the most important features and implementing appropriate machine learning methods to enhance security and identify attacks on MQTT traffic. The approach is assessed with various evaluation metrics, including accuracy, precision, recall, F1-score, and ROC-AUC, and demonstrates excellent performance in every metric. Another approach focuses on detecting specific attacks, such as DoS and brute force, through feature engineering to select the most important features. It applies supervised machine learning methods, including Random Forest, Decision Trees, k-Nearest Neighbors, and Extreme Gradient Boosting (XGBoost), combined with ensemble classifiers such as stacking, voting, and bagging. This results in high detection accuracy, demonstrating its effectiveness in securing IoT networks within MQTT traffic.
Additionally, the dissertation presents a real-time IDS for IoT attacks using the voting classifier ensemble technique within the Spark framework, employing the real-time IoT 2022 dataset for model training and evaluation to classify network traffic as normal or abnormal. The voting classifier achieves exceptionally high accuracy in real time, with a rapid detection time, underscoring its efficiency in detecting IoT attacks. Through the analysis of these approaches and their outcomes, the dissertation highlights the significance of employing machine learning techniques and demonstrates how advanced algorithms and metrics can enhance the security and detection efficiency of general IoT network traffic and MQTT protocol network traffic.
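    The voting-ensemble idea mentioned in this abstract can be sketched in a few lines. The abstract does not say whether hard or soft voting is used, so the example below assumes hard (majority) voting; the function name, the three model outputs, and the labels are hypothetical, and no Spark API is involved.

```python
from collections import Counter


def hard_vote(per_model_predictions):
    """Majority vote across models.

    per_model_predictions: list of label lists, one list per model, all the
    same length. With an odd number of models and two labels there are no ties;
    otherwise Counter.most_common keeps the label it encountered first.
    """
    voted = []
    for sample_votes in zip(*per_model_predictions):
        label, _count = Counter(sample_votes).most_common(1)[0]
        voted.append(label)
    return voted


# Made-up predictions from three hypothetical base classifiers.
model_a = ["normal", "abnormal", "abnormal", "normal"]
model_b = ["normal", "abnormal", "normal", "normal"]
model_c = ["abnormal", "abnormal", "normal", "normal"]

ensemble = hard_vote([model_a, model_b, model_c])
print(ensemble)  # ['normal', 'abnormal', 'normal', 'normal']
```

    Soft voting would instead average the predicted class probabilities of each model before choosing a label, which needs probability outputs rather than hard labels.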

Copyright owned by the Saudi Digital Library (SDL) © 2025