SACM - United Kingdom

Permanent URI for this collection: https://drepo.sdl.edu.sa/handle/20.500.14154/9667

Search Results

Now showing 1 - 10 of 18
  • ItemRestricted
    Enhancing DDoS attack Detection using Machine Learning and Deep Learning Models
    (University of Warwick, 2023-09-26) AlObaidan, Fatimah; Raza, Hassan
    Technology has become an essential part of our daily lives, indispensable for both individuals and enterprises. It facilitates the exchange of an extensive range of information across different spaces. However, Internet security is a critical challenge in today's digital age, given the growing dependence on IT services. Network environments can be vulnerable to attacks that deplete resources and hinder support for legitimate users. One such attack is the Distributed Denial of Service (DDoS) attack, which by its nature impacts the availability of the system. The impact on confidentiality arises primarily because threat actors use DDoS as a method to create chaos whilst launching cyber-attacks on other parts of the infrastructure. DDoS attacks therefore require sharper focus from a research perspective. Network intrusion detection systems (NIDSs) are important tools for monitoring the network environment and detecting DDoS attacks. However, NIDS tools suffer from several limitations, such as difficulty detecting new attacks and misclassification of attacks. Machine Learning (ML) and Deep Learning (DL) models are therefore increasingly used for automated detection of DDoS attacks. While several related works have deployed ML for NIDS, most of these approaches neglect appropriate pre-processing and the overfitting problem when implementing ML algorithms. This can reduce the robustness of the anomaly detection system and lead to poor model performance on zero-day attacks. This research study proposes a new ML and DL approach, based on hybrid feature selection and appropriate pre-processing, to classify network flows as normal or DDoS attacks.
The experimental results suggest that the proposed lightweight models are efficient and reliable, achieving a high detection rate while minimising detection time with fewer features. This project complies with the following two CyBOK Skills areas: Network Security: The project evaluates network security and introduces efficient, lightweight models for DDoS attack detection. Security Operations and Incident Management: The project enhances incident management capabilities by crafting ML models that monitor network flows within NIDS.
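As a rough illustration of the hybrid feature selection idea mentioned in the abstract above, the sketch below chains a filter step (ranking features by absolute correlation with the label) with a wrapper step (greedy forward selection scored by a simple classifier). The correlation filter, the nearest-centroid classifier, the function names, and the toy flow values are all assumptions made for this example; the abstract does not state which techniques the thesis used.

```python
from statistics import mean, pstdev

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (sx * sy)

def filter_step(rows, labels, keep):
    """Filter: rank features by |correlation with the label|, keep the top `keep`."""
    n_features = len(rows[0])
    scores = [abs(pearson([r[j] for r in rows], labels)) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:keep]

def centroid_accuracy(rows, labels, features):
    """Training-set accuracy of a nearest-centroid classifier on `features`.
    A real study would score on held-out data instead."""
    centroids = {}
    for c in set(labels):
        members = [r for r, y in zip(rows, labels) if y == c]
        centroids[c] = [mean(r[j] for r in members) for j in features]
    correct = 0
    for r, y in zip(rows, labels):
        pred = min(
            centroids,
            key=lambda c: sum((r[j] - m) ** 2 for j, m in zip(features, centroids[c])),
        )
        correct += pred == y
    return correct / len(rows)

def wrapper_step(rows, labels, candidates):
    """Wrapper: greedy forward selection; add a feature only if accuracy improves."""
    selected, best_acc = [], 0.0
    while True:
        best_j = None
        for j in candidates:
            if j in selected:
                continue
            acc = centroid_accuracy(rows, labels, selected + [j])
            if acc > best_acc:
                best_acc, best_j = acc, j
        if best_j is None:
            return selected, best_acc
        selected.append(best_j)

# Toy flows (made-up values): four numeric features per flow, label 1 = DDoS.
rows = [
    [900, 5, 1, 0.2],
    [950, 3, 1, 0.9],
    [870, 4, 1, 0.4],
    [20, 5, 1, 0.3],
    [35, 3, 1, 0.8],
    [15, 4, 1, 0.5],
]
labels = [1, 1, 1, 0, 0, 0]

kept = filter_step(rows, labels, keep=2)               # cheap pre-screening
selected, best_acc = wrapper_step(rows, labels, kept)  # costlier refinement
print(kept, selected, best_acc)
```

The filter step is cheap and prunes most features before the wrapper step, which retrains a classifier for every candidate subset; that division of labour is the usual motivation for a hybrid scheme.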
  • ItemRestricted
    Type 2 Diabetes Diagnoses and Tracking Mobile Application
    (University of Sheffield, 2023-09-13) Albalwi, Khawlah; Lanfranchi, Vitaveska
    Diabetes is becoming more common worldwide and demands careful management to prevent health hazards. This project uses Machine Learning to build a complete software application that categorises people as diabetic or non-diabetic from a range of health markers, with the aim of enhancing diabetics’ well-being and supporting preventative treatment. The study has several objectives. First, a robust Machine Learning classifier is proposed to distinguish diabetics from non-diabetics based on health parameters; precise health categorisation may improve diabetes care. Second, the application integrates a Step Count Tracker, a component that is crucial for diabetes therapy because it tracks and records physical activity. Third, a Calorie Calculator provides calorie information to help users manage their nutritional intake and eat healthily. Finally, the research will develop a Hybrid Method that combines algorithms to improve the application's efficiency and precision. This strategy may raise the bar for health categorisation applications by enhancing performance and accuracy. Together, these features support proactive diabetes control and combine technology and healthcare to make society healthier.
  • ItemRestricted
    Predicting Customer Attrition in B2B SaaS Using Machine Learning Classification
    (Saudi Digital Library, 2023-09-15) Alalawi, Zainab; Fiaschetti, Maurizio
    Customer retention and customer loss are crucial metrics in subscription-based industries such as SaaS. Customer attrition is a significant concern for this type of business, as clients have the flexibility to terminate the service at any time, which can adversely affect the company’s revenue stream. If SaaS businesses can accurately predict how many customers will cancel their subscriptions and how many will continue using their services within a specific timeframe, they can forecast revenue and cash flow more effectively and plan future growth accordingly. Predicting subscription renewals and cancellations remains a challenging problem for any SaaS company. However, with ongoing advances in machine learning and artificial intelligence, the potential for accurate forecasting has significantly improved. The study examines customer attrition and retention prediction quantitatively, using several machine learning algorithms in Python, namely logistic regression, Naïve Bayes, and random forest. Data was collected from the case company’s database and prepared to fit the algorithms. The dataset includes the customers' business data such as spend, platform usage data, customer service history data, and the date of the next payment. To identify the best hyperparameters for each machine learning algorithm, a tuning technique, Grid Search, was employed. The models were then trained and assessed using the optimized hyperparameters on the fitted data. Once trained, the models were applied to test data to obtain the analysis results. Model performance was measured using quantitative metrics, including F1-score, Area Under the Curve (AUC), and accuracy.
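The three evaluation metrics named in the abstract above can be computed by hand, which may help when reading the reported results. The following is a minimal sketch in plain Python; the labels and scores are invented for illustration (1 = customer cancelled), the 0.5 decision threshold is an assumption, and none of the numbers come from the study.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def auc(y_true, scores, positive=1):
    """Area under the ROC curve, computed as the probability that a random
    positive example is scored above a random negative one (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented example: true churn labels and model scores for eight customers.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
area = auc(y_true, scores)
print(acc, f1, area)
```

Accuracy and F1 depend on the chosen threshold, whereas AUC is computed from the raw scores, which is one reason the three are often reported together.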
  • ItemRestricted
    A Novel Machine Learning Approach For File Fragments Classification
    (University of East Anglia, 2022-10-26) Algurashi, Alia; Wang, Wnjia
    Identifying the types of manipulated or corrupted file fragments in isolation from their context is an essential task in digital forensics. Traditional file type identification uses metadata, such as file extensions and header and footer signatures. These metadata-based approaches do not work where metadata is missing or altered, so alternative strategies need to be applied or developed. One approach is to apply statistical techniques to extract features from the binary contents of file fragments and then use them as inputs for classification algorithms. This results in high dimensionality, making learning and classification time-consuming. Another approach is deep learning neural networks, which extract features automatically. File fragment classification is further complicated by the high number of possible file classes. In addition, some container file types, such as PowerPoint (PPT), include data belonging to other file types, such as JPEG, which can confuse the classification algorithms. In this thesis, we develop a hybrid method to address high feature dimensionality, using filters and wrappers to reduce the number of features. We explore the possible hierarchical relationships between file classes and represent them with a hierarchy tree to help narrow the uncertainties for challenging file types. We propose a novel hybrid approach that combines hierarchical models with feature selection to improve the accuracy of file fragment classification. We also explore the use of deep learning techniques for this task. We test our methods using a benchmark dataset, GovDocs. The results from hybrid feature selection show a reduction in the number of features from 66,313 to 11–32 and improved accuracy compared to methods using all features. Accuracy increased from 69% using random forest to 75% using the DAG tree.
We incorporate the hybrid feature selection into hierarchical modelling to generate trees that use only the most discriminative features. We find that these models outperform classical machine-learning approaches. Finally, deep learning provided the highest accuracy of all techniques explored for file fragment classification, reaching 86%.
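To make the "statistical features from binary contents" idea in the abstract above more concrete, here is a small sketch of two commonly used fragment statistics: the byte-frequency histogram and the Shannon entropy. These are assumptions chosen for illustration; the abstract does not list the thesis's actual feature set, and the fragments below are synthetic.

```python
import math
from collections import Counter

def byte_histogram(fragment: bytes):
    """Relative frequency of each of the 256 possible byte values."""
    if not fragment:
        raise ValueError("fragment must not be empty")
    counts = Counter(fragment)
    n = len(fragment)
    return [counts.get(b, 0) / n for b in range(256)]

def shannon_entropy(fragment: bytes):
    """Entropy in bits per byte: 0 for a constant fragment, 8 for uniform bytes."""
    if not fragment:
        raise ValueError("fragment must not be empty")
    n = len(fragment)
    return -sum((c / n) * math.log2(c / n) for c in Counter(fragment).values())

# Synthetic fragments standing in for real file data.
zero_padding = b"\x00" * 512          # e.g. padding inside a container file
uniform_bytes = bytes(range(256)) * 2  # every byte value equally often
text_like = b"the quick brown fox jumps over the lazy dog " * 10

for name, frag in [("zeros", zero_padding), ("uniform", uniform_bytes), ("text", text_like)]:
    features = byte_histogram(frag) + [shannon_entropy(frag)]
    print(name, len(features), round(features[-1], 3))
```

Each fragment yields 257 features here. Adding byte pairs or longer n-grams makes the feature count grow quickly, which is the kind of dimensionality problem that feature selection is meant to address.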
  • ItemRestricted
    CAUSAL LEARNING IN UNMANNED/AUTONOMOUS VEHICLE DYNAMICS
    (Saudi Digital Library, 2023-11-23) Alwalan, Abdulaziz Abdulmohsen; Arana-Catania, Miguel
    With the ascent of unmanned aerial vehicles (UAVs) in commercial and research sectors, addressing their susceptibility to wind disturbances becomes paramount. Current methodologies, though effective, often hinge on specialized sensors, thereby adding to the UAV's weight and compromising its functionality. This thesis explores a recently proposed approach called "causal curiosity", using machine learning (ML) to identify varying wind conditions solely from a UAV's position trajectory, circumventing the need for dedicated wind speed sensors. By combining time series classification with the intrinsic "causal curiosity" reward system, the research investigates how to discern three distinct wind environments: constant wind, shear wind, and turbulence. Ultimately, autonomous UAVs can employ these findings to design optimal trajectories in challenging weather conditions.
  • ItemRestricted
    Medical Screening Assistant: A Chatbot to Help Nurses
    (Saudi Digital Library, 2023-11-08) Al Rabeyah, Abdullah Saleh; Da Silva, Rogerio E; Goes, Fabricio
    Over the last several years, Machine Learning has emerged as a key player in the healthcare industry. The use of chatbots is a notable application of artificial intelligence within the field of healthcare. The advent of the ChatGPT revolution represents a significant breakthrough in the realm of natural language processing, a fundamental aspect of chatbot programming. This development has simplified the implementation of GPT to engage in user communication and fulfill the objectives of the application. The objective of this project is to reduce the excessive workloads faced by healthcare professionals and enhance the efficiency of decision-making processes. This will be achieved via the development of an intelligent medical chatbot as a mobile application, specifically designed to support nurses in conducting early patient diagnoses by analyzing symptoms. The chatbot uses Swift programming language for the iOS front-end and Python with Flask for the backend. It incorporates the ChatGPT API and machine learning models to effectively comprehend and interpret user inquiries. This project uses a Kaggle dataset of 41 distinct diseases along with their corresponding symptoms. The model is trained using Logistic Regression to predict the prognosis. The responsibility of managing the dialogue between the user and the chatbot, leading up to the compilation of the definitive list of symptoms shown by the patient, lies with ChatGPT. The use of a Flask RESTful API facilitates direct interaction between the iOS application and the server-side infrastructure. Finally, the application will provide the nurse with the five most probable prognoses, along with the prediction confidence scores, depending on the symptoms supplied. Additionally, the application will offer a description of the disease and provide precautionary measures for the patient.
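The final step described in the abstract above, returning the five most probable prognoses with confidence scores, can be sketched as follows. A multinomial logistic regression turns per-class scores into probabilities with a softmax; the class labels and score values below are hypothetical placeholders, and the function names are illustrative rather than taken from the project.

```python
import math

def softmax(scores):
    """Convert raw per-class scores into probabilities that sum to 1."""
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - peak) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

def top_k_prognoses(scores, k=5):
    """Return the k most probable labels with their confidence scores."""
    probs = softmax(scores)
    return sorted(probs.items(), key=lambda item: item[1], reverse=True)[:k]

# Hypothetical raw scores for seven placeholder conditions.
raw_scores = {
    "condition_a": 2.0,
    "condition_b": 0.5,
    "condition_c": 3.1,
    "condition_d": -1.0,
    "condition_e": 1.2,
    "condition_f": 0.0,
    "condition_g": 2.4,
}

top5 = top_k_prognoses(raw_scores, k=5)
for label, confidence in top5:
    print(f"{label}: {confidence:.1%}")
```

In a deployment like the one described, a list of this shape would presumably be serialised by the server and rendered by the mobile client, but that wiring is not specified in the abstract.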
  • ItemRestricted
    Pattern Recognition & Predictive Analysis of Cardiovascular Diseases: A Machine Learning Approach
    (Saudi Digital Library, 2023-11-23) Alseraihi, Faisal Fahad; Naich, Ammar
    Cardiovascular disease (CVD) is a predominant global health concern, with its impact becoming increasingly pronounced in low- and middle-income countries due to challenges like limited healthcare access, inadequate public awareness, and lifestyle-related risks. Addressing CVD's multifactorial origins, which span genetic, environmental, and behavioral domains, requires advanced diagnostic techniques. This research leverages the UCI Heart Disease dataset to develop a deep learning predictive model for CVD, incorporating 14 vital heart health parameters. The model's performance is critically assessed against conventional machine learning approaches, shedding light on its efficiency and areas for refinement. Utilizing sophisticated Neural Network structures, this study strives to enhance predictive health analytics, aiming for timely CVD identification and intervention. As the integration of machine learning into healthcare deepens, it's crucial to ensure that these tools are robust, thoroughly evaluated, and augment clinical insights to reduce misdiagnosis risks.
  • ItemRestricted
    Crisis Detection from Arabic Social Media
    (University of Birmingham, 2023-09-12) Alharbi, Alaa; Lee, Mark
    Social media (SM) streams such as Twitter provide large quantities of real-time information about emergency events from which valuable information can be extracted to enhance situational awareness and support humanitarian response efforts. The timely extraction of crisis-related SM messages is challenging as it involves processing large quantities of noisy data in real time. Supervised machine learning classifiers are challenged by out-of-distribution learning when classifying unseen (new) crises due to data variations across events. Moreover, it is impractical to label training data from each novel and emerging crisis since obtaining sufficient labelled data is time-consuming and labour-intensive. This thesis addresses the problem of Twitter crisis classification using supervised learning methods to identify crisis-related data and categorise it into different information types in the multi-source (training data from multiple events) setting. Due to Twitter’s ubiquity during emergency events in the Arab world, the current research focuses on Arabic Twitter content. We have created and published a large-scale Arabic Twitter corpus of crisis events. The corpus has been analysed and manually labelled. The analysis includes investigating the main information categories of conversations posted during a range of crisis events using natural language processing techniques. Building these resources is one of this thesis’s contributions. The thesis also investigates the generalisation performance of different supervised classical machine learning and deep learning approaches trained on out-of-crisis data to classify unseen crises. We find that deep neural networks such as LSTM and CNN outperform classical machine learning classifiers such as support vector machines and decision trees.
We also evaluate different architectures of deep neural networks and several pre-trained text representations (embeddings) learnt from vast amounts of unlabelled text. Results show that BERT-based models are more robust to out-of-distribution target events and remarkably outperform other models on the information classification task. Experiments show that the performance of BERT-based classifiers can be enhanced when training on similar data. Thus, the last contribution of the present study is to propose an instance distance-based data selection approach for adaptation to improve classifiers’ performance under a domain shift. Using the BERT embeddings, the method selects a subset of multi-event training data that is most similar to the target event. Results show that fine-tuning a BERT model on a selected subset of data to classify crisis tweets outperforms a model that has been fine-tuned on all available source data.
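The instance distance-based data selection described in the abstract above can be illustrated with a short sketch: embed every tweet, then keep the source-event tweets whose embeddings lie closest to the target event. The version below ranks source instances by cosine similarity to the centroid of the target embeddings. The centroid-plus-cosine choice, the function names, and the three-dimensional toy vectors (standing in for BERT embeddings) are assumptions for illustration; the abstract does not specify the exact distance measure.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def select_similar(source, target_vectors, k):
    """Return the ids of the k source instances most similar to the target event."""
    target_centre = centroid(target_vectors)
    ranked = sorted(source, key=lambda item: cosine(item[1], target_centre), reverse=True)
    return [instance_id for instance_id, _ in ranked[:k]]

# Toy "embeddings" (hypothetical ids and values) from several source events.
source_pool = [
    ("flood_tweet_1", [0.9, 0.1, 0.0]),
    ("flood_tweet_2", [0.8, 0.2, 0.1]),
    ("fire_tweet_1", [0.1, 0.9, 0.2]),
    ("quake_tweet_1", [0.0, 0.2, 0.9]),
]
# Unlabelled tweets from the new (target) event.
target_embeddings = [[1.0, 0.0, 0.1], [0.9, 0.1, 0.0]]

chosen = select_similar(source_pool, target_embeddings, k=2)
print(chosen)
```

The selected subset would then serve as the fine-tuning set in place of the full multi-event pool, which matches the comparison reported in the abstract.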
  • ItemRestricted
    Supervised Machine Learning Assessment of Dementia Using Feature Selection Filter Methods
    (Springer Nature, 2023-10-30) Rajab, Mohammed Dabash; Wang, Dennis
    The prevalence of dementia is increasing globally. Due to the massive resources required, this issue is pressuring governments and private healthcare systems. Accurate diagnosis by clinicians of the cause of dementia, such as Alzheimer’s disease (AD), is difficult because of the time and assessments required, such as neuropathological assessment. The issue becomes more challenging when considering whether various brain lesions contribute to the pathological assessment of dementia, the relationship of these lesions to the various dementia conditions, how they interact, and how to quantify them. Therefore, systematically assessing neuropathological measures by their degree of association with dementia, especially AD, may lead to better diagnostic systems and treatment targets. One promising approach to these challenges is to develop data-driven solutions with core functions of feature evaluation and automatic subject classification based on machine learning (ML). Recent research studies in medical diagnosis, including dementia research, reveal that ML techniques, when used with feature selection, can identify critical features of Alzheimer-related pathologies and their association with the disease’s diagnosis and prognosis. Feature selection removes noisy features from the dementia data to increase predictive performance and improve interpretability while reducing dimensionality and computational complexity. However, filter-based feature selection methods can generate dissimilar feature rankings and may be sensitive to correlations among the features. This thesis investigates dementia with a focus on AD neuropathological assessments from a data-driven perspective to develop mechanisms to assist pathologists during these clinical assessments. The thesis investigation comprises phases such as feature ranking, feature-feature correlation, and classification.
The work determines the impact of neuropathological feature-feature correlations on the feature ranking for better biomarker identification. The investigation assesses real datasets related to dementia, the Cognitive Function and Aging Studies (CFAS) and the Alzheimer’s Disease Neuroimaging Initiative (ADNI), using filter methods and classification techniques. The results showed that classification models generated from the CFAS and ADNI sets of chosen neuropathological features were strong in terms of sensitivity, accuracy, and other measures when mined by different classification techniques. In the ADNI dataset results, the significant neuropathological features contributing to AD included neocortical neuritic plaques, Braak stage, Thal phase, diffuse plaques, and cerebral amyloid angiopathy (CAA), all of which showed a high correlation with AD’s diagnostic label. In the CFAS dataset, the results were consistent with those derived from the ADNI dataset. Moreover, among the filter methods considered, reliefF had the strongest correlation with feature-feature correlations in both ADNI and CFAS datasets, less sensitive to feature-feature correlations. However, no filter method had clear dominance over ADNI results. More importantly, the results indicated limited consistency in feature rankings between ADNI and CFAS. However, reliefF had the most agreement, while the Gain Ratio method had less consistency in ranking the features in both datasets. In summary, this thesis provided valuable insights into the application of filter methods and neuropathology data for developing classification models for the diagnosis of dementia conditions. The study demonstrated the significance of considering feature-feature correlations when selecting influential features and the impact of different filter methods on feature ranking and classification performance.
These findings suggest that the proposed approach could effectively minimise the discrepancy in feature ranking and generate an impactful set of features for classification algorithms. These results have practical implications for pathologists in improving the understanding of AD pathology. Furthermore, the study highlights the potential for future research to leverage diverse filter methods to identify more reliable biomarkers and enhance the detection of dementia, particularly for AD.
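Gain Ratio is one of the filter methods named in the abstract above. As a hedged illustration of how such a filter ranks features, the sketch below computes the gain ratio (information gain divided by the split information of the feature) for discrete features and sorts them. The feature names and values are generic placeholders invented for the example, not data from CFAS or ADNI.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def gain_ratio(feature, labels):
    """Information gain of `feature` about `labels`, normalised by the
    feature's own entropy (its split information)."""
    n = len(labels)
    conditional = 0.0
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        conditional += len(subset) / n * entropy(subset)
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature)
    return info_gain / split_info if split_info > 0 else 0.0

# Placeholder data: eight subjects, binary diagnostic label.
diagnosis = [1, 1, 1, 1, 0, 0, 0, 0]
candidate_features = {
    "feature_a": [1, 1, 1, 1, 0, 0, 0, 0],  # perfectly aligned with the label
    "feature_b": [1, 0, 1, 0, 1, 0, 1, 0],  # unrelated to the label
    "feature_c": [1, 1, 1, 0, 1, 0, 0, 0],  # partially informative
}

ratios = {name: gain_ratio(col, diagnosis) for name, col in candidate_features.items()}
ranking = sorted(ratios, key=ratios.get, reverse=True)
print(ranking)
```

Because each feature is scored on its own, a filter of this kind takes no account of correlations between features, which is the sensitivity the thesis sets out to examine.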
  • ItemRestricted
    Predicting Employee Attrition: A Machine Learning Approach with Interactive Dashboards
    (Saudi Digital Library, 2023-11-01) Alsulami, Faisal Sitr; Al-Den, Mohammed Bader
    Organisations increasingly use data-driven insights to guide their decision-making in the modern digital world. This study investigates employee attrition, a problem that affects businesses everywhere and can result in everything from higher hiring expenses to service interruptions. The study aimed to forecast staff attrition and to provide visual dashboards that help HR departments with strategic planning, using the "IBM HR Analytics Employee Attrition & Performance" dataset from Kaggle. The dataset was analysed using machine learning techniques, notably decision trees and random forests. A dynamic dashboard emphasising user interaction was created with Tableau. This dashboard displayed the outcomes of the prediction models and gave users access to information about the many factors that affect employee attrition. Legal, social, and ethical issues, particularly in data processing, were addressed to protect individual rights and build trust in the analytical results. According to the results, the random forest algorithm predicted staff attrition better than decision trees. The interactive dashboard's capabilities, which include filters for age group, income comparison, job role, and department, improved the user experience and gave users a more detailed understanding of attrition trends. This study highlights the potential of data analytics in HR management by providing tools and insights that can significantly impact organisational strategy and decision-making.

Copyright owned by the Saudi Digital Library (SDL) © 2025