Saudi Cultural Missions Theses & Dissertations

Permanent URI for this community: https://drepo.sdl.edu.sa/handle/20.500.14154/10

Search Results

Now showing 1 - 2 of 2
  • Item Restricted
    Early Prediction of Cancer Using Supervised Machine Learning: A Study of Electronic Health Records From The Ministry of National Guard Health Affairs
    (University College London (UCL), 2024-08) Alfayez, Asma; Lai, Alvina; Kunz, Holger
    Early detection and treatment of cancer can save lives; however, identifying those most at risk of developing cancer remains challenging. Electronic health records (EHR) provide a rich source of "big" data on large patient numbers. I hypothesised that in the period preceding a definitive cancer diagnosis, there exist healthcare events, such as a history of disease, captured within EHR data that characterise cancer progression and can be exploited to predict future cancer occurrence. Using longitudinal phenotype data from the EHR of the Ministry of National Guard Health Affairs, a large healthcare provider in Saudi Arabia, I aimed to discover health event patterns present in EHR data that predict cancer development in periods prior to diagnosis by developing predictive models using supervised machine learning (ML) algorithms. I used two different prediction periods: six months and one year prior to cancer diagnosis. Initially, the thesis focused on the prediction of both malignant and benign neoplasms, before moving on to predicting the future risk of malignant neoplasms (cancer), since predicting life-threatening illness remains the most important clinical challenge. To refine the approach for specific cancer types, predictive models were built for the top three malignancies in this population: breast, colon, and thyroid cancers. ML predictive models were developed using the following algorithms: (1) logistic regression; (2) penalised logistic regression; (3) decision trees; (4) random forests; (5) gradient boosting; (6) extreme gradient boosting; (7) k-nearest neighbours; and (8) support vector machine. Model performance was assessed using k-fold cross-validation and the area under the receiver operating characteristic curve (AUC-ROC). After developing different models, their performance was compared with and without hyperparameter tuning using the Tree-based Pipeline Optimization Tool (TPOT) and GridSearch.
This study provides novel proof-of-principle that ML algorithms can be applied to EHR data to develop models that can be used to predict future cancer occurrence.
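    The evaluation scheme named in the abstract (k-fold cross-validation scored by AUC-ROC) can be illustrated with a minimal pure-Python sketch. The function names `roc_auc` and `k_fold_indices` and the toy labels and scores are assumptions made for illustration; they are not taken from the thesis, which does not describe its implementation.

```python
from typing import List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds of (train, test) indices."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # first `extra` folds get one more
        stop = start + size
        test = list(range(start, stop))
        train = [j for j in range(n_samples) if j < start or j >= stop]
        folds.append((train, test))
        start = stop
    return folds


# Toy values (hypothetical): 1 = later diagnosed, 0 = not diagnosed.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
toy_folds = k_fold_indices(5, 2)
```

    In a real study each fold's model would be fitted on the training indices and scored on the held-out indices, and the per-fold AUC values averaged; in practice the records would usually be shuffled or stratified before splitting.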
  • Item Restricted
    Investigating Rule Induction Methods in Machine Learning for Improving Medical Dementia Prediction
    (2023-08-04) Albalawi, Hadeel; Lambrou, Tryphon
    Alzheimer’s Disease (AD) is a neurodegenerative disease related to dementia that predominantly affects the elderly population, with symptoms including, but not limited to, cognitive impairment and memory loss. Detecting AD and other conditions like Mild Cognitive Impairment (MCI) can lengthen the lifespan of patients and help them to access medical services. One approach to achieving a rapid and early diagnosis of AD is to use data mining (DM) techniques, which can search various characteristic traits related to Cognitively Normal (CN), AD, and MCI data observations to build classifiers that reveal contributors to the disease. Classifiers developed by DM techniques are used by medical professionals during dementia clinical processes to help make a correct diagnosis. In this research, we amend a process based on DM that evaluates characteristics related to dementia conditions, in particular AD. The novelty of the proposed process lies in the classification algorithm that we have named ‘Rules-based Uncertainty Reduction Algorithm’ (RURA). RURA develops classifiers with rules, which strengthen the decisions that can be invoked by the medical professional when evaluating patients with dementia. Empirical evaluations using several DM algorithms were conducted on biological markers (biomarkers) and behavioral characteristics of data subjects collected from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) data project to analyze the effectiveness of the proposed classification algorithm. The results show that RURA performs at over 90% in predicting AD when compared with classifiers developed by other DM algorithms. Furthermore, the results show that delayed word recall and orientation are effective cognitive factors that contribute to the ability to detect dementia early. For the biomarkers, ABETA and neurofibrillary tangles in the neurons showed some associations with AD, although fewer than those of the cognitive elements.
Moreover, the results obtained show that the classifiers developed by the RURA algorithm from psychological attributes (Subset 2) are more accurate than those of the other DM algorithms. The results from Subset 2 show that RURA’s classifiers are more accurate by 11.28%, 1.18%, 1.88%, 3.75%, 1.70%, and 0.27% than those of PRISM, Nnge, kNN, Naïve Bayes, ORule, and Ridor, respectively. RURA also developed more accurate classifiers than the remaining DM algorithms on Subset 12, which includes cognitive attributes and biomarker attributes. However, the performance of RURA did not improve when the general attributes were added to the psychological attributes, and the accuracy decreased by 1.34% when mining Subset 2 (psychological attributes and general attributes). More importantly, the results indicate that specific biomarkers in the ADNI-MERGE dataset, when used alone by the DM algorithms that we considered for dementia detection, often will not result in acceptable predictive systems. For example, when DM algorithms were trained on MRI and PET attributes (Subset 5), the classifiers created had classification accuracies of 42–55%, which are relatively low. Utilizing neuroimaging attributes in the ADNI-MERGE dataset to detect dementia is therefore not ideal with these DM techniques, especially when a baseline visit is used to represent each data subject. The biomarker attributes must therefore be accompanied by psychological elements to improve the detection rate of dementia.
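    The abstract does not give RURA's details, so the sketch below is not RURA. It shows only the general idea of rule induction with a generic one-rule learner: pick the single attribute whose "value -> majority class" rules make the fewest training errors. The function names, attribute values, and toy records are all hypothetical; only the attribute names echo the cognitive factors mentioned above.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, Sequence, Tuple

Row = Dict[str, Hashable]
Model = Tuple[str, Dict[Hashable, str], str]


def learn_one_rule(rows: Sequence[Row], labels: Sequence[str]) -> Model:
    """Return (attribute, value -> class rules, default class) with fewest training errors."""
    if not rows or len(rows) != len(labels):
        raise ValueError("rows and labels must be non-empty and equally long")
    default = Counter(labels).most_common(1)[0][0]  # fallback for unseen values
    best = None
    for attr in rows[0]:
        # Count class frequencies for each value of this attribute.
        counts = defaultdict(Counter)
        for row, label in zip(rows, labels):
            counts[row[attr]][label] += 1
        # One rule per value: predict that value's majority class.
        rules = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(rules[row[attr]] != label for row, label in zip(rows, labels))
        if best is None or errors < best[0]:
            best = (errors, attr, rules)
    _, attr, rules = best
    return attr, rules, default


def predict(model: Model, row: Row) -> str:
    """Apply the learned rule set to one record."""
    attr, rules, default = model
    return rules.get(row[attr], default)


# Invented toy records, not ADNI data.
toy_rows = [
    {"delayed_recall": "low", "orientation": "impaired"},
    {"delayed_recall": "low", "orientation": "intact"},
    {"delayed_recall": "high", "orientation": "intact"},
    {"delayed_recall": "high", "orientation": "impaired"},
]
toy_labels = ["AD", "AD", "CN", "CN"]
model = learn_one_rule(toy_rows, toy_labels)
```

    The output of such a learner is a short, human-readable rule set (here, one rule per value of the chosen attribute), which is the property that makes rule-based classifiers attractive for clinical review.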

Copyright owned by the Saudi Digital Library (SDL) © 2025