SACM - United Kingdom
Permanent URI for this collection: https://drepo.sdl.edu.sa/handle/20.500.14154/9667
68 results
Search Results
Item Restricted
Agarwood Quality Classification in the Middle East: A Mixed-Methods Study of Social, Sensory, and Data-Driven Insights (Saudi Digital Library, 2025) AlSalem, Fatmah; Bembibre, Cecilia
This dissertation investigates the classification of agarwood quality in the Middle East, focusing on the Gulf Cooperation Council (GCC) countries, where oud holds profound cultural, religious, and economic value. The market lacks a unified formal grading system, leading to multiple discrepancies. Employing a mixed-methods approach, the study first conducted a sensory panel to gain comparative consumer insight. Composed of both Middle Eastern and non-Middle Eastern participants, the panel revealed how quality perception varies among non-experts. Next, a semantic analysis of cultural discourse extracted from social media was used to design a contextualized two-layer grading system; finally, that framework was applied to an e-commerce dataset of oud products, in which an optimized Random Forest model leveraging TF-IDF classified quality grades from textual descriptions with 90.5% accuracy. This demonstrates that machine learning can effectively approximate sensory and cultural judgment from text data alone. The research concludes that digital platforms are repositories of cultural knowledge, anticipating that such frameworks can provide transparent, standardized, and scalable agarwood classification, channelling tradition and innovation for a fairer, more sustainable oud market in the region.

Item Restricted
Modelling and Optimisation of The Continuous Pharmaceutical Manufacturing Process: A New Data-Driven Approach For Right-First-Time Production (Saudi Digital Library, 2025) Deebes, Motaz; Mahfouf, Mahdi
Pharmaceutical industries, like most industries, are subject to stringent quality and regulatory requirements to ensure the manufacturing of safe and high-quality medicinal products.
Continuous manufacturing has emerged as a transformative approach, offering the potential to meet global demand for medicines through efficient, uninterrupted processes. However, its adoption in tablet manufacturing remains constrained by the complex, multivariate behaviour of particulate processes, and the lack of comprehensive modelling frameworks further hinders understanding and control of the multistage process. This thesis aims to develop and evaluate novel predictive modelling frameworks tailored to the continuous manufacturing of pharmaceutical tablets, using data collected from an industrial-scale pilot plant (Consigma-25) encompassing five critical unit operations. An integrated, sequential modelling framework was constructed using ensemble machine learning techniques, including gradient boosting machines and random forests, to predict key quality attributes across stages, with Gaussian mixture models incorporated to reduce uncertainty. To enhance interpretability, a hybrid modelling approach combining artificial neural networks with an interval type-2 fuzzy inference system was developed. Additionally, a novel integration of an Adaptive Neuro-Fuzzy Inference System with a Genetic Algorithm formed the basis of a model-informed optimisation strategy, enabling the identification of optimal process settings to control final product quality under “Right-First-Time” manufacturing. The results demonstrate that the proposed frameworks effectively captured the non-linear relationships between process parameters and quality outcomes, achieving R² values exceeding 0.90 across the frameworks, a predictive improvement of 56% over prior studies. The incorporation of interpretable, uncertainty-aware methods ensured that model outputs remained informative about the underlying processes despite their complexity.
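The fitted ANFIS model itself is not reproduced here, but the Genetic Algorithm side of a model-informed optimisation strategy like the one described above can be sketched in a few lines. The surrogate below is a hypothetical stand-in for a trained quality predictor, and the population size, mutation rate, and bounds are illustrative assumptions, not the thesis implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical surrogate quality model: a stand-in for a fitted predictor,
# with a known optimum at process settings (0.3, 0.7).
def surrogate(x):
    return (x[:, 0] - 0.3) ** 2 + (x[:, 1] - 0.7) ** 2

def genetic_search(fitness, low, high, pop_size=40, generations=80,
                   mutation_rate=0.2, sigma=0.05):
    """Minimise `fitness` over the box [low, high] with a real-coded GA."""
    pop = rng.uniform(low, high, size=(pop_size, len(low)))
    for _ in range(generations):
        scores = fitness(pop)
        order = np.argsort(scores)
        parents = pop[order[: pop_size // 2]]        # truncation selection
        # Uniform crossover between randomly paired parents
        idx = rng.integers(0, len(parents), size=(pop_size, 2))
        mask = rng.random((pop_size, len(low))) < 0.5
        children = np.where(mask, parents[idx[:, 0]], parents[idx[:, 1]])
        # Gaussian mutation, clipped back to the feasible region
        mutate = rng.random(children.shape) < mutation_rate
        children = np.clip(
            children + mutate * rng.normal(0, sigma, children.shape), low, high)
        children[0] = parents[0]                     # elitism: keep the best
        pop = children
    return pop[np.argmin(fitness(pop))]

best = genetic_search(surrogate, np.array([0.0, 0.0]), np.array([1.0, 1.0]))
print(best)
```

In practice the fitness function would be the trained process model evaluated on candidate settings, and the search box would be bounded by validated operating ranges.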
The model-informed optimisation strategy was validated through practical application within the right-first-time manufacturing concept. These findings demonstrate the potential of the proposed frameworks to advance pharmaceutical tablet manufacturing by bridging the gap between scientific innovation and scalable industrial implementation.

Item Restricted
AI-Based Analysis of Magnetic Nanoparticle Relaxometry Curves for Structure-Specific Cancer Detection and Classification (Saudi Digital Library, 2025) AlHumam, Malack; Hovorka, Ondrej
Cancer remains one of the world’s leading causes of death, and successful treatment relies heavily on early and accurate diagnosis. This thesis explores a minimally invasive diagnostic method combining magnetorelaxometry (MRX) with artificial intelligence (AI). Magnetorelaxometry measures how magnetic nanoparticles relax after being excited by an external magnetic field, producing relaxation curves that depend on anisotropy orientation and variation, particle number, and structure geometry. Among magnetic nanoparticles, superparamagnetic iron oxide nanoparticles (SPIONs) are particularly suited to biomedical applications due to their biocompatibility and tunable relaxation properties. However, these curves often overlap and appear indistinguishable to the human eye, making traditional analysis challenging. The central research question is whether AI can classify nanoparticle ensembles by structure and particle number from their relaxation curves, using them as unique markers for cancer detection and classification. To address this, five simulated datasets were generated, each incorporating multiple structures with different particle numbers under varying anisotropy conditions. After preprocessing, the data were analyzed with supervised, semi-supervised, and unsupervised models, supported by dimensionality-reduction visualizations (PCA, t-SNE, UMAP).
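As an illustration of the dimensionality-reduction step, the following sketch projects simulated relaxation-style curves onto their first two principal components using a plain SVD-based PCA. The two decay constants and the noise level are invented stand-ins, not values from the thesis datasets:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-ins for simulated relaxation curves: two ensemble types
# with different decay constants, each sampled at 200 time points.
t = np.linspace(0, 5, 200)
fast = np.exp(-t / 0.5) + rng.normal(0, 0.01, (50, t.size))
slow = np.exp(-t / 1.0) + rng.normal(0, 0.01, (50, t.size))
X = np.vstack([fast, slow])

# PCA via SVD of the mean-centred data matrix
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt[:2].T          # project onto the first two components

# The two ensembles separate along PC1 even where the raw curves overlap.
pc1_fast, pc1_slow = scores[:50, 0].mean(), scores[50:, 0].mean()
print(pc1_fast, pc1_slow)
```

Curves that look nearly identical point by point can separate cleanly in the score space, which is what makes such projections useful before clustering or classification.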
Supervised models achieved the strongest performance, with multiclass logistic regression reaching an accuracy of 0.89 on the dataset with aligned anisotropy and no variation. ZChains consistently emerged as the most distinguishable ensembles, relaxing roughly twice as long as YChains and providing clearer separability in both geometry and particle number, as confirmed by PCA scatter plots. In contrast, YChains frequently collapsed under z-axis anisotropy alignment, while Triangles and Rings were distinguishable only under controlled anisotropy variation. Arkus structures degraded rapidly as anisotropy variation increased. Semi-supervised pseudo-labeling maintained a comparable accuracy of 0.817 under limited labeling, while unsupervised KMeans clustering, although non-predictive, provided insight into ensemble overlap and natural similarity groupings. The main contribution of this work is the demonstration that AI can classify nanoparticle ensembles through relaxation-curve morphology rather than biomarker binding assays. This represents a shift from proof of detection toward structure-based classification, bridging magnetic physics with biomedical AI applications. Future directions include aligning anisotropy axes experimentally, exploring relaxation saturation for cancer staging, and translating the AI pipelines to real biological magnetorelaxometry data.

Item Restricted
Matrix Factorisation for Movie Recommender Systems: Enhancing Collaborative Filtering with Side Information through Evaluating Baseline, Joint, and Collective Matrix Factorisation (2025-11-20) AlMalki, Shurooq; Virtanen, Seppo
In an era where online content is constantly increasing, particularly in the movie domain, people are overwhelmed by the sheer number of available choices. One of the most prominent tools for overcoming this challenge is the movie recommender system, which provides users with personalised suggestions tailored to their preferences.
Movie recommender systems follow three main filtering methods: collaborative filtering, content-based filtering, and hybrid filtering. This dissertation investigates matrix factorisation techniques for collaborative filtering, comparing three complementary approaches: baseline Matrix Factorisation (MF), Joint Matrix Factorisation (JMF), and Collective Matrix Factorisation (CMF), evaluated in both warm-start and cold-start scenarios. The core aim is to enhance collaborative filtering by incorporating side information into matrix factorisation and to observe the effect on prediction accuracy and recommendation quality. Using the MovieLens 1M dataset, three movie recommendation models were developed and evaluated in terms of prediction accuracy and handling of the cold-start problem. Although collaborative filtering is widely used in movie recommender systems, it faces two major challenges: the high sparsity of user-item matrices and the cold-start problem. The baseline MF model was applied using Singular Value Decomposition (SVD). The Joint MF model also used SVD, leveraging demographic side information by combining it into the rating matrix. The last approach was Collective MF, using the cmfrec package [2] to simultaneously factorise the ratings and demographics matrices. Evaluation covered both rating-prediction metrics (mean squared error (MSE) and mean absolute error (MAE)) and top-N recommendation metrics (Precision@N and Recall@N). The cold-start problem was examined by varying the proportion of observed versus missing values and across multiple numbers of latent factors.
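The SVD baseline described above can be sketched as follows. The toy ratings matrix, the mean-imputation choice, and the rank k = 2 are illustrative assumptions, not the dissertation's configuration:

```python
import numpy as np

# Hypothetical toy ratings matrix (users x movies), 0 = missing.
R = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

# Mean-impute missing entries before factorising (one common baseline choice).
mask = R > 0
mean = R[mask].mean()
R_filled = np.where(mask, R, mean)

# Rank-k truncated SVD: R ≈ U_k S_k V_k^T is the latent-factor model.
k = 2
U, S, Vt = np.linalg.svd(R_filled, full_matrices=False)
R_hat = U[:, :k] @ np.diag(S[:k]) @ Vt[:k]

# R_hat now holds predicted ratings, including for unobserved entries
# such as user 0's rating of movie 2.
print(np.round(R_hat[0], 2))
```

The reconstruction fills every cell, so the unobserved entries serve directly as rating predictions; JMF- and CMF-style models extend this idea by factorising side-information matrices alongside R.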
Results revealed that the baseline MF model achieved competitive accuracy; however, the JMF model outperformed both the baseline MF and CMF models, showing improved prediction accuracy in both warm- and cold-start scenarios and highlighting the importance of integrating side information into latent factor models. The CMF model, although it scored better than the baseline MF, returned mixed results, indicating the complexity of the model and the need for further tuning.

Item Restricted
COPD-Aware Modelling of Heart Failure Hospital Admissions Using Routinely Collected Primary Care Prescription Data (Saudi Digital Library, 2025) Alghamdi, Taghreed Safar; Aishwaryaprajna
Heart failure (HF) is a leading cause of unplanned hospital admissions in the United Kingdom (UK), consuming 1–2% of the National Health Service (NHS) annual budget, with most costs arising from inpatient care. Many predictive models oversimplify medication histories, relying on static indicators instead of time-aware prescribing patterns. This study improves HF admission prediction using UK primary care data, focusing on monthly dosage trends of three therapeutic classes (angiotensin-converting enzyme inhibitors (ACEIs), beta-blockers, and angiotensin receptor blockers (ARBs)) and on the influence of Chronic Obstructive Pulmonary Disease (COPD). Three linked datasets, covering patient demographics and comorbidities (patientinfo), prescription records (prescriptions), and chronic condition diagnoses (indexdates), were merged after cleaning and validation. Static attributes and temporal medication features were used to train Long Short-Term Memory (LSTM) networks, Random Forest, and Logistic Regression. Because the LSTM and Random Forest performed poorly in a multi-class setting (admission-count categories), the task was reframed as binary classification (admission vs. no admission), with class imbalance addressed using the Synthetic Minority Oversampling Technique (SMOTE).
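SMOTE, as referenced above, synthesises minority-class samples by interpolating between a minority point and one of its minority-class nearest neighbours. A minimal numpy sketch of the idea follows; practical work would normally use the imbalanced-learn implementation, and the 2-D data here are invented:

```python
import numpy as np

rng = np.random.default_rng(3)

def smote(X_min, n_new, k=5):
    """Generate n_new synthetic samples by interpolating each chosen
    minority point toward one of its k nearest minority neighbours."""
    # Pairwise distances within the minority class
    d = np.linalg.norm(X_min[:, None] - X_min[None, :], axis=2)
    np.fill_diagonal(d, np.inf)                      # exclude self-matches
    neighbours = np.argsort(d, axis=1)[:, :k]
    base = rng.integers(0, len(X_min), size=n_new)   # points to expand
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))                     # interpolation weight
    return X_min[base] + gap * (X_min[picked] - X_min[base])

# Hypothetical imbalanced 2-D data: six minority points
X_min = rng.normal(loc=[2.0, 2.0], scale=0.3, size=(6, 2))
X_new = smote(X_min, n_new=20, k=3)
print(X_new.shape)
```

Because each synthetic point lies on a segment between two real minority points, the oversampled class stays inside its original feature region rather than being duplicated exactly.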
The final dataset included 963 patients and over 109,521 monthly prescription records. The best performance came from a standard Random Forest (without SMOTE), which retained clinical interpretability, identifying COPD status, total monthly medication dosage, and age at HF diagnosis as top predictors. COPD patients had a markedly higher admission rate (59.1% vs. 41.8%). These findings show that granular, dosage-aware prescribing data can enhance HF admission prediction. Future work will explore hybrid classification-regression models, incorporate laboratory and lifestyle data, and validate externally to improve generalisability and support NHS decision-making.

Item Restricted
Combining Traditional and Machine Learning Approaches to Predict TCGA Colon Cancer Outcomes (Saudi Digital Library, 2025) Alotaibi, Reem; MONDAL, SUDIP
This study utilises standardised clinical data from the Cancer Genome Atlas Colon Adenocarcinoma (TCGA-COAD) cohort to perform a comparative survival analysis of colorectal cancer (CRC). Three modelling approaches were evaluated: the Cox Proportional Hazards (Cox PH) model, LASSO-penalised Cox regression, and the Gradient Boosted Survival Model (GBSM). Models were trained and evaluated using the concordance index (C-index) and time-dependent area under the curve (AUC), following comprehensive data preprocessing that included missing-value imputation, outlier removal, and Kaplan–Meier-based variable stratification. LASSO-Cox improved model sparsity and feature selection (C-index = 0.80), while Cox PH consistently identified clinically established predictors with strong interpretability (C-index = 0.76). GBSM achieved the highest predictive performance (C-index = 0.87; AUC = 0.841) by effectively modelling complex non-linear relationships.
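The concordance index used to compare these survival models has a simple pairwise definition: among comparable pairs (the earlier time is an observed event), it is the fraction in which the patient with the shorter survival time received the higher predicted risk, with risk ties counting half. A small worked sketch with invented data:

```python
import numpy as np

def c_index(time, event, risk):
    """Concordance index: fraction of comparable pairs where the shorter
    observed survival time has the higher predicted risk (ties = 0.5)."""
    n_conc, n_comp = 0.0, 0
    for i in range(len(time)):
        for j in range(len(time)):
            if time[i] < time[j] and event[i] == 1:   # comparable pair
                n_comp += 1
                if risk[i] > risk[j]:
                    n_conc += 1.0
                elif risk[i] == risk[j]:
                    n_conc += 0.5
    return n_conc / n_comp

# Hypothetical toy data: higher predicted risk should mean shorter survival.
time  = np.array([2.0, 5.0, 6.0, 9.0])
event = np.array([1, 1, 0, 1])        # 0 = censored observation
risk  = np.array([0.9, 0.6, 0.6, 0.1])
print(c_index(time, event, risk))     # → 0.9
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect risk ordering, which is why values such as 0.76 to 0.87 indicate progressively stronger discrimination.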
Model interpretability was enhanced using SHAP values, which highlighted key prognostic factors, including tumour staging components (T4, N2, M1), as well as underexplored but clinically meaningful variables such as residual tumour status (R2), age at diagnosis, and ethnicity. These findings demonstrate the potential of interpretable machine learning models to improve survival prediction and feature discovery in colorectal cancer. The study highlights the importance of external validation and multimodal data integration to enhance generalisability and translational relevance in precision oncology.

Item Restricted
Risk Factor Analysis and Prediction of Chronic Kidney Disease Using Clinical Data from Indian Patients (Saudi Digital Library, 2025) Alkhunaizan, Sarah; Claudio, Fronterre
Chronic kidney disease (CKD) is a progressive condition that is frequently underdiagnosed because it is asymptomatic in its early stages, creating a need for reliable prediction tools to support earlier identification and intervention. This study aimed to (1) identify key clinical and demographic factors associated with CKD and (2) develop and compare predictive models using routinely collected health data. Analysis was conducted on a real-world clinical dataset from Apollo Hospitals in Tamil Nadu, India, made publicly available via the UCI Machine Learning Repository (n = 397; 25 variables; binary outcome: CKD vs non-CKD). To reduce data leakage and focus on disease prediction, direct diagnostic biomarkers (serum creatinine, blood urea, and urine albumin) were excluded. Missingness (10.5%) was assessed; Little’s MCAR test rejected MCAR, and regression findings were consistent with a MAR mechanism. Four imputation strategies were compared (complete-case analysis, deterministic, stochastic, and random forest imputation), with random forest imputation selected for subsequent analyses.
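Random-forest imputation of the kind selected above can be sketched with scikit-learn's IterativeImputer wrapping a RandomForestRegressor. The two-column clinical-style data below are simulated stand-ins, and the hyperparameters are illustrative assumptions, not the study's settings:

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(4)

# Hypothetical clinical-style table with correlated columns and missing values.
n = 200
age = rng.uniform(20, 80, n)
bp = 90 + 0.5 * age + rng.normal(0, 5, n)       # blood pressure tracks age
X = np.column_stack([age, bp])
X_missing = X.copy()
miss = rng.random(n) < 0.1                      # ~10% missing blood pressure
X_missing[miss, 1] = np.nan

# Iterative imputation with a random-forest regressor as the column model
imputer = IterativeImputer(
    estimator=RandomForestRegressor(n_estimators=50, random_state=0),
    max_iter=5, random_state=0,
)
X_imputed = imputer.fit_transform(X_missing)

# Imputed values should track the age/blood-pressure relationship
err = np.abs(X_imputed[miss, 1] - X[miss, 1]).mean()
print(err)
```

Because the imputer predicts each missing value from the other columns, it exploits between-variable structure that mean or deterministic imputation would discard.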
Exploratory analyses described distributions and associations, and correlated predictors were removed to mitigate multicollinearity. Three models, LASSO logistic regression, decision tree (CART), and XGBoost, were trained using a 70/30 train-test split with 10-fold cross-validation and evaluated on accuracy, sensitivity, specificity, ROC-AUC, and calibration. XGBoost achieved the best discrimination (accuracy 96.6%, AUC 0.991), while the decision tree demonstrated the strongest calibration. Across models, the most influential predictors consistently included red blood cell count, hypertension, diabetes mellitus, sodium, abnormal urinary red blood cells, and appetite. These findings support the utility of machine learning models, particularly XGBoost, for early CKD risk prediction using routine clinical data, while highlighting the importance of robust preprocessing and validation to improve clinical applicability.

Item Restricted
Machine Learning Techniques for Financial Loan Default Prediction in UK: A Comparative Analysis of Decision Tree and Random Forest Models (Saudi Digital Library, 2025) Alrakan, Fahad Abdulaziz; Alwzinani, Faris
This dissertation proposes a comprehensive approach to variable selection and model comparison for credit scoring, based on a Lending Club 2016–2018 dataset. The methodology combines an initial manual selection, based on completeness and business logic, with an automatic selection via RFECV (Recursive Feature Elimination with Cross-Validation) using a Random Forest. A permutation-importance analysis and an ablation experiment (top 10 variables) complete the evaluation. The results show that all 21 selected variables are considered relevant by RFECV, but that most of the predictive power is concentrated in a subset of about 15 variables. A comparison of the models highlights the clear superiority of Random Forest (AUC ≈ 0.713; PR-AUC ≈ 0.437) over the Decision Tree (AUC ≈ 0.594; PR-AUC ≈ 0.319).
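An RFECV step of the kind described above can be sketched with scikit-learn. The synthetic dataset and hyperparameters below are illustrative assumptions, not the Lending Club setup:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV

# Hypothetical stand-in data: 10 features, only a few truly informative.
X, y = make_classification(n_samples=300, n_features=10, n_informative=4,
                           n_redundant=2, random_state=0)

# Recursively drop the least important feature, scoring each subset
# by cross-validated ROC-AUC, and keep the best-scoring subset.
selector = RFECV(
    estimator=RandomForestClassifier(n_estimators=100, random_state=0),
    step=1,            # eliminate one feature per round
    cv=3,
    scoring="roc_auc",
)
selector.fit(X, y)
print(selector.n_features_, selector.support_)
```

`support_` marks the retained columns, which is how a "top variables" subset like the ablation experiment's can be extracted for a leaner model.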
Permutation-importance analysis confirms business intuition: interest rate, credit sub-grade, and residential status appear to be the main explanatory factors, supplemented by financial indicators (debt ratio, loan amount, FICO score). The ablation experiment shows that these ten main variables are sufficient to preserve almost all of the Random Forest's performance (AUC = 0.708) while reducing training time by approximately 40%. These results highlight two major points: (i) Random Forest is robust and capable of effectively exploiting a small core of variables, but its performance remains below the standard expected of an industrial model (>0.80 AUC); (ii) the hierarchy of variables reveals both the relevance of expected indicators and the redundancy between certain correlated measures. The limitations identified concern sensitivity to correlations, the temporal restriction of the sample (2016–2018), and the computational cost of certain steps (RFECV). In conclusion, this project validates the feasibility of a robust and parsimonious Random Forest-based model, while opening prospects for improvement: boosting algorithms, calibration of decision thresholds to economic objectives, temporal robustness tests, and pipeline optimization.

Item Restricted
The Additional Regulatory Challenges Posed by AI In Financial Trading (Saudi Digital Library, 2025) Almutairi, Nasser; Alessio, Azzutti
Algorithmic trading has shifted from rule-based speed to adaptive autonomy, with deep learning and reinforcement learning agents that learn, re-parameterize, and redeploy in near real time, amplifying opacity, correlated behaviours, and flash-crash dynamics. Against this backdrop, the dissertation asks whether existing EU and US legal frameworks can keep pace with new generations of AI trading systems.
It adopts a doctrinal and comparative method, reading MiFID II and MAR, the EU AI Act, the SEC and CFTC regimes, and global soft law (IOSCO, NIST) through an engineering lens of AI lifecycles and value chains to test their functional adequacy. Chapter 1 maps the evolution from deterministic code to self-optimizing agents and locates the shrinking space for real-time human oversight. Chapter 2 reframes technical attributes as risk vectors, such as herding, feedback loops, and brittle liquidity, and illustrates the enforcement and stability implications. Chapter 3 exposes human-centric assumptions (intent, explainability, “kill switches”) embedded in current rules and the gaps they create for attribution, auditing, and cross-border supervision. Chapter 4 proposes a hybrid, lifecycle-based model of oversight that combines value-chain accountability, tiered AI-agent licensing, mandatory pre-deployment verification, explainable-AI (XAI) requirements, cryptographically sealed audit trails, human-in-the-loop controls, continuous monitoring, and sandboxed co-regulation. The contribution is threefold: (1) a technology-aware risk typology linking engineering realities to market-integrity outcomes; (2) a comparative map of EU and US regimes that surfaces avenues for regulatory arbitrage; and (3) a practicable governance toolkit that restores traceable accountability without stifling beneficial innovation. Overall, the thesis argues for moving from incremental, disclosure-centric tweaks to proactive, lifecycle governance that embeds accountability at design, deployment, and post-trade stages, aligning next-generation trading technology with the enduring goals of fair, orderly, and resilient markets.

Item Restricted
Semi-Supervised Approach For Automatic Head Gesture Classification (Saudi Digital Library, 2025) Alsharif, Wejdan; Hiroshi, Shimodaira
This study applies a semi-supervised method, specifically self-training, to automatic head gesture recognition using motion capture data.
It explores and compares fully supervised deep learning models and self-training pipelines in terms of their performance and training approaches. The proposed approach achieved an accuracy of 52% and a macro F1 score of 44% under cross-validation. The results show that incorporating self-training into the learning process improves model performance: the generated pseudo-labeled data effectively supplement the original labeled dataset, enabling the model to learn from a larger and more diverse set of training examples.
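A minimal sketch of such a self-training loop, assuming invented feature data and a logistic-regression base learner rather than the study's deep models: labelled data train an initial classifier, high-confidence predictions on unlabelled data become pseudo-labels, and the classifier is refit on the enlarged set:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)

# Hypothetical stand-in for gesture features; keep labels for only ~10% of data.
X, y = make_classification(n_samples=500, n_features=8, n_informative=5,
                           n_classes=3, random_state=0)
labeled = rng.random(len(y)) < 0.1
X_lab, y_lab = X[labeled], y[labeled]
X_unlab = X[~labeled]

clf = LogisticRegression(max_iter=1000)
for _ in range(5):                       # self-training rounds
    clf.fit(X_lab, y_lab)
    if len(X_unlab) == 0:
        break
    proba = clf.predict_proba(X_unlab)
    confident = proba.max(axis=1) > 0.9  # accept high-confidence pseudo-labels
    if not confident.any():
        break
    X_lab = np.vstack([X_lab, X_unlab[confident]])
    y_lab = np.concatenate([y_lab, proba[confident].argmax(axis=1)])
    X_unlab = X_unlab[~confident]

acc = clf.score(X, y)
print(acc)
```

The confidence threshold is the key design choice: too low and noisy pseudo-labels contaminate training, too high and the unlabelled pool is never exploited.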
