Saudi Cultural Missions Theses & Dissertations
Permanent URI for this community: https://drepo.sdl.edu.sa/handle/20.500.14154/10
114 results
Search Results
Item Restricted
Deep Learning-Based White Blood Cell Classification Through a Free and Accessible Application
(Saudi Digital Library, 2025) Alluwaim, Yaseer; Campbell, Neill
Background: Microscopy of peripheral blood smears (PBS) continues to play a fundamental role in hematology diagnostics, offering detailed morphological insights that complement automated blood counts. Examination of a stained blood film by a trained technician is among the most frequently performed tests in clinical hematology laboratories. Nevertheless, manual smear analysis is labor-intensive, time-consuming, and prone to considerable variability between observers. These challenges have spurred interest in automated, deep learning-based approaches to enhance efficiency and consistency in blood cell assessment.
Methods: We designed a convolutional neural network (CNN) using a ResNet-50 backbone, applying standard transfer-learning techniques for white blood cell (WBC) classification. The model was trained on a publicly available dataset of approximately 4,000 annotated peripheral smear images representing eight WBC types. The image processing workflow included automated nucleus detection, normalization, and extensive augmentation (rotation, scaling, etc.) to improve model generalization. Training was performed with the PyTorch Lightning framework for efficient development.
Application: The final model was integrated into a lightweight web application and deployed on Hugging Face Spaces, allowing accessible browser-based inference. The application provides an easy-to-use interface to upload images, which are then automatically cropped and analyzed in real-time. This open and free tool is intended to provide immediate classification results. It is also a useful tool for laboratory technologists without requiring specialized hardware or software.
Results: Testing on an independent set revealed that the ResNet-50 network reached 98.67% overall accuracy.
Performance was consistently high across all eight WBC categories. Precision, recall, and specificity closely matched the overall accuracy, indicating well-balanced classification. To assess real-world generalization, the model was also tested on an external heterogeneous dataset from different sources, where it reached 86.33% accuracy, reflecting strong performance outside of its main training data. The confusion matrix showed negligible misclassifications, suggesting consistent distinction between leukocyte types.
Conclusion: This study indicates that a lightweight AI tool can support peripheral smear analysis by offering rapid and consistent WBC identification via a web interface. Such a system may reduce laboratory workload and observer variability, particularly in resource-limited or remote settings where expert microscopists are scarce, and serve as a practical training aid for personnel learning cell morphology. Limitations include reliance on a single dataset, which may not encompass all staining or imaging variations, and evaluation performed offline. Future work will aim to expand dataset diversity, enable real-time integration with digital microscopes, and conduct clinical validation to broaden applicability and adoption.
Application link: https://huggingface.co/spaces/xDyas/wbc-classifier

Item Restricted
AI-Based Approaches for Respiratory Disease Detection Using Audio Signals and Imaging Data
(Saudi Digital Library, 2025) Shati, Asmaa; Hassan, Ghulam Mubashar; Datta, Amitava
Respiratory diseases (RDs) remain major global health concerns, typically diagnosed through imaging and auscultation, with cough sounds also offering diagnostic cues. These methods, however, are often subjective and depend on expert interpretation. Advances in machine learning (ML) enable automated RD diagnosis, yet challenges such as limited data, high computational costs, and accessibility gaps persist, underscoring the need for innovative approaches.
This thesis proposes a series of novel approaches for automated RD detection, utilizing either cough audio or chest X-rays (CXR) as input modalities, selected for their availability and affordability. These approaches integrate advanced techniques for segmentation, feature extraction, and subsequent classification, offering practical and cost-effective diagnostic solutions. Extensive evaluation on multiple open-source datasets demonstrates the effectiveness of the proposed approaches across diverse diagnostic contexts.

Item Restricted
Insider Threat Detection in a Hybrid IT Environment Using Unsupervised Anomaly Detection Techniques
(Saudi Digital Library, 2025) Alharbi, Mohammed; Antonio, Gouglidis
This dissertation analyses insider threat detection in hybrid IT environments with unsupervised anomaly detection techniques. Insider threats, including those committed by trusted persons with granted access, are among the most difficult cybersecurity threats to mitigate because they resemble legitimate user behavior and lack labelled datasets for training supervised models. Hybrid infrastructures, which integrate on-premise and cloud resources, make detection harder still because they create large, heterogeneous, and fragmented logs. To cope with these challenges, this work presents a detection system that uses the Isolation Forest and Local Outlier Factor algorithms. Multi-source organisational data, such as authentication, file, email, HTTP, device, and LDAP logs, were pre-processed and loaded into enriched user profiles, with psychometric attributes added where possible. The framework was assessed on the CERT Insider Threat Dataset v6.2, where the results indicated that both algorithms were effective in detecting anomalous behaviours: Isolation Forest was effective in detecting global outliers, whereas Local Outlier Factor was good at detecting subtle local outliers.
The comparative analysis found that the strengths of the two methods were complementary and that they should be used together when stratifying users into high-, medium-, and low-risk groups. Although it still has constraints regarding synthetic data, real-time implementation, and ecological validity, the study contributes to the development of anomaly-based detection methods and offers practical information to organisations wishing to be proactive in curbing insider threats.

Item Restricted
Enhancing Gravitational-Wave Detection from Cosmic String Cusps in Real Noise Using Deep Learning
(Saudi Digital Library, 2025) Taghreed, Bahlool; Patrick, Sutton
Cosmic strings are topological defects that may have formed in the early universe and could produce bursts of gravitational waves through cusp events. Detecting such signals is particularly challenging due to the presence of transient non-astrophysical artifacts, known as glitches, in gravitational-wave detector data. In this work, we develop a deep learning-based classifier designed to distinguish cosmic string cusp signals from common transient noise types, such as blips, using raw, whitened 1D time-series data extracted from real detector noise. Unlike previous approaches that rely on simulated or idealized noise environments, our method is trained and tested entirely on real noise, making it more applicable to real-world search pipelines. Using a dataset of 50,000 labeled 2-second samples, our model achieves a classification accuracy of 84.8%, a recall of 78.7%, and a false-positive rate of 9.1% on unseen data.
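For readers less familiar with these metrics, the following minimal sketch shows how accuracy, recall, and false-positive rate are derived from confusion-matrix counts. The counts are hypothetical: they assume a balanced test split of 1,000 cusp signals and 1,000 glitches, which the abstract does not state, and were chosen only because they reproduce the reported rates. The function name `binary_metrics` is likewise illustrative.

```python
def binary_metrics(tp, fn, fp, tn):
    """Return accuracy, recall, and false-positive rate from confusion counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,          # fraction of all samples classified correctly
        "recall": tp / (tp + fn),               # fraction of true signals that were detected
        "false_positive_rate": fp / (fp + tn),  # fraction of glitches flagged as signals
    }


# Hypothetical counts (assumption: balanced split, 1,000 samples per class).
metrics = binary_metrics(tp=787, fn=213, fp=91, tn=909)
print(metrics)
```

Under the balanced-split assumption, accuracy is simply the mean of recall and specificity (one minus the false-positive rate), which is consistent with the three figures quoted.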
This demonstrates the feasibility of cusp-glitch discrimination directly in the time domain, without requiring time-frequency representations or synthetic data, and contributes toward robust detection of exotic astrophysical signals in realistic gravitational-wave conditions.

Item Restricted
Generalization of Machine-Learning in Clinical Randomized Controlled Trials: Evaluation and Development
(Saudi Digital Library, 2025) ALMADHI, SHAYKHAH; Karwath, Andreas
In healthcare, machine learning (ML) shows significant promise in improving patient diagnostics, prognostics, and personalized care. However, its real-world deployment is often constrained by models' inconsistent performance on diverse and unseen patient data, a critical challenge known as generalization. Despite ongoing advancements, existing methodologies have shown only limited success in assessing and improving ML generalization, raising uncertainty in clinical deployment. This dissertation tackles this gap by presenting a robust evaluation framework and a predictive tool to cultivate more reliable healthcare AI. Applying Logistic Regression and XGBoost models to a dataset from nine double-blind, randomized, placebo-controlled trials investigating beta-blockers in heart failure, this study employs leave-one-trial-out, reverse leave-one-trial-out, and systematic evaluation to comprehensively assess generalization. The findings indicate that while generalization is often suboptimal, strategic selection of training cohorts markedly improves performance. Furthermore, a developed meta-learning framework effectively predicts model degradation.
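The leave-one-trial-out idea can be sketched as a simple index splitter. This is an illustration only, not the dissertation's code: the function names and toy trial labels are hypothetical, and the reading of "reverse" leave-one-trial-out as training on a single trial and testing on all the others is an assumption.

```python
def leave_one_trial_out(trial_ids):
    """Yield (held_out, train_idx, test_idx): train on all other trials, test on one."""
    for held_out in sorted(set(trial_ids)):
        train = [i for i, t in enumerate(trial_ids) if t != held_out]
        test = [i for i, t in enumerate(trial_ids) if t == held_out]
        yield held_out, train, test


def reverse_leave_one_trial_out(trial_ids):
    """Swap the roles (assumed meaning): train on one trial, test on the rest."""
    for held_out, train, test in leave_one_trial_out(trial_ids):
        yield held_out, test, train


# Toy example: six patients drawn from three hypothetical trials.
trial_ids = ["A", "A", "B", "C", "C", "C"]
for trial, train, test in leave_one_trial_out(trial_ids):
    print(trial, train, test)
```

Splitting by trial rather than by patient keeps every patient from a held-out trial out of the training data, so the test score reflects performance on a genuinely unseen cohort.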
This research provides crucial insights into model generalizability across varied clinical datasets and introduces a practical pre-screening tool, essential for facilitating a safer and more effective integration of ML into clinical practice and promoting fair patient outcomes.

Item Restricted
A CLOUD-BASED AI SYSTEM FOR SKILL GAP ANALYSIS AND TRAINING PATH RECOMMENDATION IN HR DEPARTMENTS
(Saudi Digital Library, 2025) Alanazi, Abdullah Ramadan; AlYamani, Abdulghani
This dissertation presents the development of a cloud-based artificial intelligence (AI) system designed to automate skill gap analysis and provide personalised training recommendations in Human Resource (HR) departments. The system integrates employee profiles, job role requirements, and training histories to identify competency gaps using a decision tree algorithm. The AI model achieved an accuracy of 0.86 and demonstrated strong interpretability and efficiency in recommending relevant training paths. Usability testing with HR professionals confirmed the system’s practicality and reliability in supporting workforce development and data-driven training strategies. The research contributes to the field of HR analytics by combining Human Capital Theory with Knowledge Discovery in Databases (KDD) to provide an explainable, scalable, and cloud-enabled HR decision-support framework.

Item Restricted
Advancing narcolepsy diagnosis: Leveraging machine learning to identify novel neuro-biomarkers
(Saudi Digital Library, 2024) Orkouby, Hadir; Bartsch, Ullrich
Narcolepsy is a rare neurological disorder with a well-identified pathophysiology that manifests as a sudden onset of sleep during wake behaviour. The current diagnostic pathways for narcolepsy involve complex assessments of sleep neurophysiology, including polysomnography and the multiple sleep latency test (MSLT).
These are cumbersome and work-intensive, and with limited resources within the NHS, this has led to increased waiting times for diagnosis and treatment of narcolepsy. This project harnessed the power of digital neuro-biomarkers and Artificial Intelligence (AI) to develop novel diagnostic markers for narcolepsy. Leveraging an open-source dataset of labelled archival polysomnography (PSG) recordings, including electroencephalography (EEG), I created a data analysis and classification pipeline to enhance diagnostic decision-making in clinical settings. This pipeline combines comprehensive data preprocessing and feature extraction with XGBoost and Random Forest (RF) classification models. The feature extraction process included selected time-series analysis features, spectral frequency ratios, cross-frequency coupling and moment-based statistical features of Intrinsic Mode Functions (IMFs) derived from empirical mode decomposition (EMD). The RF classifier emerged as the best model, achieving an accuracy of 82.5%, with a specificity of 82.5% and a sensitivity of 92.86%, by combining and averaging these feature sets and incorporating sleep stage labels during model training. These results underscore the potential of a novel approach using single-channel sleep EEG data from wearable devices. This innovative method simplifies the lengthy and costly pathway for narcolepsy diagnosis and also paves the way for developing new tools to diagnose sleep disorders automatically in non-clinical environments.

Item Restricted
Predicting Delayed Flights for International Airports Using Artificial Intelligence Models & Techniques
(Saudi Digital Library, 2025) Alsharif, Waleed; MHallah, Rym
Delayed flights are a pervasive challenge in the aviation industry, significantly impacting operational efficiency, passenger satisfaction, and economic costs.
This thesis aims to develop predictive models that demonstrate strong performance and reliability, capable of maintaining high accuracy within the tested dataset and showcasing potential for application in various real-world aviation scenarios. These models leverage advanced artificial intelligence and deep learning techniques to address the complexity of predicting delayed flights. The study evaluates the performance of Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN), and their hybrid model (LSTM-CNN), which combine temporal and spatial pattern analysis, alongside Large Language Models (LLM, specifically OpenAI's Babbage model), which excel in processing structured and unstructured text data. Additionally, the research introduces a unified machine learning framework utilizing Gradient Boosting Machine (GBM) for regression and Light Gradient Boosting Machine (LGBM) for classification, aimed at estimating both flight delay durations and their underlying causes. The models were tested on high-dimensional datasets from John F. Kennedy International Airport (JFK), and a synthetic dataset from King Abdulaziz International Airport (KAIA). Among the evaluated models, the hybrid LSTM-CNN model demonstrated the best performance, achieving 99.91% prediction accuracy with a prediction time of 2.18 seconds, outperforming the GBM model (98.5% accuracy, 6.75 seconds) and LGBM (99.99% precision, 4.88 seconds). Additionally, GBM achieved a strong correlation score (R² = 0.9086) in predicting delay durations, while LGBM exhibited exceptionally high precision (99.99%) in identifying delay causes. Results indicated that National Aviation System delays (correlation: 0.600), carrier-related delays (0.561), and late aircraft arrivals (0.519) were the most significant contributors, while weather factors played a moderate role. 
These findings underscore the exceptional accuracy and efficiency of LSTM-CNN, establishing it as the optimal model for predicting delayed flights due to its superior performance and speed. The study highlights the potential for integrating LSTM-CNN into real-time airport management systems, enhancing operational efficiency and decision-making while paving the way for smarter, AI-driven air traffic systems.

Item Restricted
Harnessing Machine Learning and Deep Learning for Analyzing Electrical Load Patterns to Identify Energy Loss
(Saudi Digital Library, 2025) Alabbas, Mashhour Sadun Abdulkarim; Albatah, Mohammad
Meeting the challenges of energy requirements, consumption patterns, and the push for sustainability makes energy management in contemporary agriculture critically important. This study aims to devise a holistic model for energy efficiency in agricultural contexts by integrating modern computer vision methodologies for field boundary extraction together with anomaly detection techniques. To achieve the accurate segmentation of agricultural fields from satellite imagery, high-resolution imagery is processed using the YOLOv8 object detection model. The subsequently generated field feature datasets enable the smart grid data to serve as a basis for the anomaly detection process using the Isolation Forest algorithm. The methodology follows a multi-stage pipeline: data collection, preprocessing, augmentation, model training, fine-tuning, and evaluation. To validate accurate and reliable field boundary detection, the evaluation metrics precision, recall, and mean Average Precision (mAP) are computed and analyzed. Subsequently, energy consumption data are processed for anomaly detection, enabling the identification of irregular and potentially inefficient consumption patterns. The findings indicate that YOLOv8 has a very high detection accuracy with an mAP score over 90%.
Furthermore, the Isolation Forest algorithm has shown improved F1 scores over traditional approaches in detecting anomalies in energy consumption. This integrated method provides an automated and scalable solution in precision agriculture which allows users to monitor cultivation conditions and minimize energy consumption, thereby enhancing the energy efficiency and the overall decision-making framework. The study advances the convergence of artificial intelligence, remote sensing, and intelligent energy management systems, offering a basis for developing technological innovations that promote sustainability in agriculture.

Item Restricted
Predicting Client Default Payments Using Machine Learning in Production Environment
(Saudi Digital Library, 2025) Alanazi, Reem; Lavendini
This project investigates the application of machine learning techniques to predict client default payments in a credit card setting. Using a dataset of 30,000 Taiwanese clients, the study addresses the challenges of class imbalance, predictive accuracy, and fairness in credit risk assessment. An XGBoost model was developed and enhanced through feature engineering, resampling techniques (SMOTE/ADASYN), and class weighting to improve recall for defaulters while maintaining overall accuracy. Interpretability was achieved using SHAP values, providing transparency into model decisions. To mitigate demographic disparities, particularly across education levels, a fairness-constrained Random Forest was integrated into a two-stage cascade framework, reducing false positives while preserving high recall. The final cascade model achieved 84% accuracy, with 93% recall for non-defaulters and 53% recall for defaulters, significantly outperforming baseline benchmarks. Fairness audits revealed that education-based disparities could be reduced with minimal performance trade-offs, while age-based fairness was largely maintained.
The project demonstrates a practical, interpretable, and ethically aware pipeline for credit default prediction, with deployment considerations and directions for future research in cost-sensitive learning, advanced fairness constraints, and real-time monitoring.
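The class weighting mentioned in the credit default abstract can be illustrated with a common inverse-frequency scheme. This is a minimal sketch under assumptions: the abstract does not say which weighting formula was used, the function name `balanced_class_weights` is illustrative, and the 78/22 label split below is a hypothetical example rather than the dataset's actual class ratio.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical imbalance: 78 non-defaulters (0) and 22 defaulters (1).
labels = [0] * 78 + [1] * 22
weights = balanced_class_weights(labels)
print(weights)
```

The minority class receives the larger weight, so each misclassified defaulter costs more during training, which is one way to trade some overall accuracy for higher defaulter recall.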
