SACM - Australia
Permanent URI for this collection: https://drepo.sdl.edu.sa/handle/20.500.14154/9648
2 results
Search Results
Item Restricted
Reducing Type 1 Childhood Diabetes in Saudi Arabia by Identifying and Modelling Its Key Performance Indicators
(Royal Melbourne Institute of Technology, 2024-06) Alazwari, Ahood; Johnstone, Alice; Abdollahain, Mali; Tafakori, Laleh
The increasing incidence of type 1 diabetes (T1D) in children is a growing global health concern. Reducing the incidence of diabetes generally is one of the goals in the World Health Organisation’s (WHO) 2030 Agenda for Sustainable Development Goals. With an incidence rate of 31.4 cases per 100,000 children and an estimated 3,800 new cases per year, Saudi Arabia is ranked 8th in the world for number of T1D cases and 5th for incidence rate. Despite the remarkable increase in the incidence of childhood T1D in Saudi Arabia, there is a lack of meticulously conducted research on T1D in children compared with developed countries. In addition, it is crucial to recognise the critical gaps in current understanding of diabetes in children, adolescents, and young adults, with recent research indicating significant global and sub-national variations in disease incidence. Better knowledge of the development of T1D in children and its associated factors would aid medical practitioners in developing intervention plans to prevent complications and address the incidence of T1D. This study employed statistical, machine learning and classification approaches to analyse and model different aspects of childhood T1D using local case and control data. Secondary data from 1,142 individual medical records (359-377 cases and 765 controls) collected from three cities located in different regions of Saudi Arabia were used in the analysis to represent the country’s diverse population. Case and control data matched by birth year, gender and location were used to control confounders and create a more robust and clinically relevant model.
It is well documented that genetic and environmental factors contribute to childhood T1D, so a wide range of potential key performance indicators (KPIs) from the literature were included in this study. The collected data included information on socioeconomic status, potential genetic and environmental factors, and demographic data such as city of residence, gender and birth year. Several techniques, such as cross-validation, hyperparameter tuning and bootstrapping, were used in this study to develop models. Common statistical metrics (coefficient of determination (R-squared), root mean squared error and mean absolute error) were used to evaluate the regression models, while accuracy, sensitivity, precision, F score and area under the curve were used as performance measures for the classification models. Multiple linear regression (MLR), artificial neural network (ANN) and random forest (RF) models were developed to predict the age at onset of T1D for all children 0-14 years old, as well as for the most common age group for onset, the 5-9 year olds. To improve the performance of the MLR models, interactions between variables were considered. Additionally, risk factors associated with the age at onset of T1D were identified. The results showed that MLR and RF outperformed ANN. The logarithm of age at onset was the most suitable dependent variable. RF outperformed the other models for the 5-9 year age group. Birth weight, current weight and current height influenced the age at onset in both age groups. However, preterm birth was significant only in the 0-14 year cohort, while consanguineous parents and gender were significant in the 5-9 year age group. Logistic regression (LR), random forest (RF), support vector machine (SVM), Naive Bayes (NB) and artificial neural network (ANN) models were utilised with case and control data to model the development of childhood T1D and to identify its key performance indicators.
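The evaluation measures named in this paragraph can be sketched in plain Python. This is only an illustrative sketch and not the thesis code: the function names `regression_metrics` and `classification_metrics` are hypothetical, the F score is assumed to be the F1 variant, and area under the curve is left out because it needs predicted scores rather than labels.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R-squared, RMSE and MAE for paired observed/predicted values."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    return {
        "r2": 1 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
    }


def classification_metrics(y_true, y_pred):
    """Return accuracy, sensitivity, precision and F1 for 0/1 labels (1 = case)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": 2 * precision * sensitivity / denom if denom else 0.0,
    }
```

For a model fitted on the logarithm of age at onset, predictions would normally be exponentiated before calling `regression_metrics` if errors on the original age scale are wanted; whether the thesis reports metrics on the log or the original scale is not stated in the abstract.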
Full and reduced models were developed to determine the best model. The reduced models were built using the significant factors identified by the corresponding full model. The study found that full LR had the highest accuracy. Full RF and SVM with a linear kernel also performed well. Significant risk factors identified as being associated with developing childhood T1D include early exposure to cow’s milk, high birth weight, positive family history of T1D and maternal age over 25 years. Poisson regression (PR), RF, SVM and K-nearest neighbor (KNN) were then used to model the incidence of childhood T1D, using the identified significant risk factors as inputs. The interactions between variables were also considered to enhance the performance of the models. Both full and reduced models were created and compared to find the best models with the minimum number of variables. The full Poisson regression and machine learning models outperformed all other models, but reduced models with a combination of only two out of three independent variables (early exposure to cow’s milk, high birth weight and maternal age over 25 years) also performed relatively well. This study also deployed optimisation procedures with the reduced incidence models to develop upper and lower yearly profile limits for childhood T1D incidence to achieve the United Nations (UN) and Saudi recommended levels of 264 and 339 cases by 2030. The profile limits for childhood T1D then allowed us to model optimal yearly values for the number of children weighing more than 3.5 kg at birth, the number of deliveries by older mothers and the number of children introduced early to cow’s milk. The results presented in this thesis will guide healthcare providers to collect data to monitor the most influential KPIs. This would enable the initiation of suitable intervention strategies to reduce the disease burden and potentially slow the incidence rate of childhood T1D in Saudi Arabia.
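As a rough illustration of the Poisson regression idea mentioned above, the sketch below fits a log-linear count model with one predictor by Newton-Raphson. It is an assumption-laden toy and not the thesis model: the function name `fit_poisson` is hypothetical, the data are invented for the example, and the real models used several risk factors and interaction terms.

```python
import math


def fit_poisson(x, y, iterations=25):
    """Fit log(mu) = b0 + b1 * x by Newton-Raphson on the Poisson log-likelihood."""
    b0 = math.log(sum(y) / len(y))  # start at the intercept-only solution
    b1 = 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Fisher information matrix (negative Hessian), 2x2 and symmetric.
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system in closed form.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Invented example data: a single predictor and yearly case counts.
predictor = [0, 1, 2, 3, 4]
cases = [2, 3, 6, 7, 12]
b0, b1 = fit_poisson(predictor, cases)
fitted = [math.exp(b0 + b1 * xi) for xi in predictor]
```

Because the model includes an intercept, the fitted counts sum to the observed total at convergence, which gives a quick sanity check on the fit. A reduced model in the sense used above would simply drop predictors and compare the resulting fit against the full model.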
The research outcomes lead to recommendations to establish early intervention strategies, such as educational campaigns and healthy lifestyle programs for mothers along with child health mentoring during and after pregnancy to reduce the incidence of childhood T1D. This thesis has contributed to new knowledge on childhood T1D in Saudi Arabia by:
* developing a predictive model for age at onset of childhood T1D using statistical and machine learning models.
* predicting the development of T1D in children using matched case-control data and identifying its KPIs using statistical and machine learning approaches.
* modeling the incidence of childhood T1D using its associated significant KPIs.
* developing three optimal profile limits for monitoring the yearly incidence of childhood T1D and its associated significant KPIs.
* providing a list of recommendations to establish early intervention strategies to reduce the incidence of childhood T1D.

Item Restricted
Deep Discourse Analysis for Early Prediction of Multi-Type Dementia
(Saudi Digital Library, 2023-06-12) Alkenani, Ahmed Hassan A; Li, Yuefeng
Ageing populations are a worldwide phenomenon. Although it is not an inevitable consequence of biological ageing, dementia is strongly associated with increasing age, and is therefore anticipated to pose enormous future challenges to public health systems and aged care providers. While dementia affects its patients first and foremost, it also has negative associations with caregivers’ mental and physical health. Dementia is characterized by irreversible gradual impairment of nerve cells that control cognitive, behavioural, and language processes, causing speech and language deterioration, even in preclinical stages. Early prediction can significantly alleviate dementia symptoms and could even curtail the cognitive decline in some cases. However, the diagnostic procedure is currently challenging as it is usually initiated with clinical-based traditional screening tests.
Typically, such tests are manually interpreted and may therefore entail further tests and physical examinations, so they are considered time-consuming, expensive, and invasive. Therefore, many researchers have adopted speech and language analysis to facilitate and automate initial pre-screening for dementia. Although recent studies have proposed promising methods and models, there is still room for improvement, without which automated pre-screening remains impracticable. There is currently limited empirical literature on modelling discourse ability, defined as spoken and written conversations and communications, in people with prodromal dementia stages and types. Specifically, few researchers have investigated the nature of lexical and syntactic structures in spontaneous discourse generated by patients with dementia under different conditions for automated diagnostic modelling. In addition, most previous work has focused on modelling and improving the diagnosis of Alzheimer’s disease (AD), as the most common dementia pathology, and has neglected other types of dementia. Further, currently proposed models suffer from poor performance, a lack of generalizability, and low interpretability. Therefore, this research thesis explores lexical and syntactic presentations in written and spoken narratives of people with different dementia syndromes to develop high-performing diagnostic models using fusions of different lexical and syntactic (i.e., lexicosyntactic) features as well as language models. In this thesis, multiple novel diagnostic frameworks are proposed and developed based on the “wisdom of crowds” theory, in which different mathematical and statistical methods are investigated and properly integrated to establish ensemble approaches for an optimized overall performance and better inferences of the diagnostic models.
Firstly, syntactic- and lexical-level components are explored and extracted from the only two disparate data sources available for this study: spoken and written narratives retrieved from the well-known DementiaBank dataset, and a blog-based corpus collected as a part of this research, respectively. Due to their disparity, each data source was independently analysed and processed for exploratory data analysis and feature extraction. One of the most common problems in this context is how to ensure a proper feature space is generated for machine learning modelling. We solve this problem by proposing multiple innovative ensemble-based feature selection pipelines to reveal optimal lexicosyntactics. Secondly, we explore language vocabulary spaces (i.e., n-grams) given their proven ability to enhance the modelling performance, with an overall aim of establishing two-level feature fusions that combine optimal lexicosyntactics and vocabulary spaces. These fusions are then used with single and ensemble learning algorithms for individual diagnostic modelling of the dementia syndromes in question, including AD, Mild Cognitive Impairment (MCI), Possible AD (PoAD), Frontotemporal Dementia (FTD), Lewy Body Dementia (LBD), and Mixed Dementia (PwD). A comprehensive empirical study and series of experiments were conducted for each of the proposed approaches using these two real-world datasets to verify our frameworks. Evaluation was carried out using multiple classification metrics, returning results that not only show the effectiveness of the proposed frameworks but also outperform current “state-of-the-art” baselines. In summary, this research provides a substantial contribution to the underlying task of effective dementia classification needed for the development of automated initial pre-screenings of multiple dementia syndromes through language analysis.
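The two-level feature fusion described in this paragraph can be pictured with a small sketch that joins a few hand-written lexical measures with word n-gram counts in a single feature dictionary. Everything here is an assumption made for illustration: the function names, the two lexical measures and the whitespace tokeniser are hypothetical, syntactic features are omitted because they would need a parser, and the thesis's actual feature sets and selection pipelines are not given in the abstract.

```python
from collections import Counter


def word_ngrams(tokens, n):
    """Return all contiguous word n-grams of length n as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def lexical_features(tokens):
    """Compute two simple lexical measures: token count and type-token ratio."""
    return {
        "lex:token_count": len(tokens),
        "lex:type_token_ratio": len(set(tokens)) / len(tokens) if tokens else 0.0,
    }


def fused_features(text, max_n=2):
    """Fuse lexical measures with n-gram counts (n = 1..max_n) in one dict."""
    tokens = text.lower().split()  # naive whitespace tokenisation
    features = dict(lexical_features(tokens))
    for n in range(1, max_n + 1):
        for gram, count in Counter(word_ngrams(tokens, n)).items():
            features[f"ngram{n}:{gram}"] = count
    return features
```

Prefixing the keys (`lex:`, `ngram1:`, `ngram2:`) keeps the two feature families distinguishable after fusion, so a later feature selection step could score or prune each family separately.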
The lexicosyntactics presented and discussed across dementia syndromes may contribute substantially to our understanding of language processing in these pathologies. Given the current scarcity of related datasets, it is also hoped that the collected writing-based blog corpus will facilitate future analytical and diagnostic studies. Furthermore, since this study deals with associated problems that have been commonly faced in this research area and that are frequently discussed in the academic literature, its outcomes could potentially assist in the development of better classification models, not only for dementia but also for other linguistic pathologies.