SACM - United Kingdom

Permanent URI for this collectionhttps://drepo.sdl.edu.sa/handle/20.500.14154/9667

Browse

Search Results

Now showing 1 - 10 of 25

Restricted
Human Action Recognition Based on Convolutional Neural Networks and Vision Transformers
(University of Southampton, 2025-05) Alomar, Khaled Abdulaziz; Xiaohao, Cai
This thesis explores the impact of deep learning on human action recognition (HAR), addressing challenges in feature extraction and model optimization through three interconnected studies. The second chapter surveys data augmentation techniques in classification and segmentation, emphasizing their role in improving HAR by mitigating dataset limitations and class imbalance. The third chapter introduces TransNet, a transfer learning-based model, and its enhanced version, TransNet+, which utilizes autoencoders for improved feature extraction, demonstrating superior performance over existing models. The fourth chapter reviews CNNs, RNNs, and Vision Transformers, proposing a novel CNN-ViT hybrid model and comparing its effectiveness against state-of-the-art HAR methods, while also discussing future research directions.
23 0
Restricted
Rasm: Arabic Handwritten Character Recognition: A Data Quality Approach
(University of Essex, 2024) Alghamdi, Tawfeeq; Doctor, Faiyaz
The problem of AHCR is a challenging one due to the complexities of the Arabic script, and the variability in handwriting (especially for children). In this context, we present ‘Rasm’, a data quality approach that can significantly improve the result of AHCR problem, through a combination of preprocessing, augmentation, and filtering techniques. We use the Hijja dataset, which consists of samples from children from age 7 to age 12, and by applying advanced preprocessing steps and label-specific targeted augmentation, we achieve a significant improvement of a CNN performance from 85% to 96%. The key contribution of this work is to shed light on the importance of data quality for handwriting recognition. Despite the recent advances in deep learning, our result reveals the critical role of data quality in this task. The data-centric approach proposed in this work can be useful for other recognition tasks, and other languages in the future. We believe that this work has an important implication on improving AHCR systems for an educational context, where the variability in handwriting is high. Future work can extend the proposed techniques to other scripts and recognition tasks, to further improve the optical character recognition field.
50 0
Restricted
Enhance Deep Learning for Cybersecurity Challenges in Software-Defined Networks
(University of Warwick, 2024-09) Alsaadi, Sami; Leeson, Mark and Lakshminarayana, Subhash
Traditional network devices, such as a router or switch, incorporate the control plane and the data plane. IT operators independently set traffic policies on each device. Nonetheless, this architectural setup raises operational expenses and complicates the dynamic adaptation and maintenance of secure network configurations. Software-defined Networking (SDN) represents a revolutionary approach to network management, offering enhanced flexibility. SDN promotes rapid innovation in networking by centralizing control and making it programmable. However, security concerns pose significant barriers to the broader adoption of SDN, as this new architecture potentially opens novel attack vectors previously non-existent or more challenging to exploit. Machine Learning (ML) strategies for SDN security rely heavily on feature engineering, requiring expert knowledge and causing delays. Therefore, enhancing intrusion detection is essential for protecting SDN architectures against diverse threats. The thesis develops techniques for detecting malicious activities in SDN using Deep Learning DL. It starts by evaluating CNNs on an SDN dataset, leading to a new CNN-based detection approach that employs a novel regularization method to reduce kernel weights and address overfitting, improving effectiveness against unrecognized attacks. Additionally, a semi-supervised learning method using an LSTM autoencoder combined with One Class SVM is introduced, specifically designed to detect DDoS attacks. This approach enhances the detection capabilities within SDN environments, showcasing the potential of DL in advancing network security.
20 0
Restricted
Enhancing Breast Cancer Diagnosis with ResNet50 Models: A Comparative Study of Dropout Regularization and Early Stopping Techniques
(University of Exeter, 2024-09-20) Basager, Raghed Tariq Ahmed; Kelson, Mark; Rowland, Sareh
Early detection and treatment of breast cancer depend on accurate image analysis. Deep learning models, particularly Convolutional Neural Networks (CNNs), have proven highly effective in automating this critical diagnostic process. While prior studies have explored CNN architectures [1, 2], there is a growing need to understand the role of dropout regularization and fine-tuning strategies in optimizing these models. This research seeks to improve breast cancer diagnosis by evaluating ResNet50 models trained from scratch and fine-tuned, with and without dropout regularization, using both original and augmented datasets. Assumptions and Limitations: This research assumes that the Kaggle Histopathologic Cancer Detection dataset is representative of real-world clinical images. Limitations include dataset diversity and computational resources, which may affect generalization to broader clinical applications. ResNet50 models were trained on the Kaggle Histopathologic Cancer Detection dataset with various configurations of dropout, early stopping, and data augmentation [3–6]. Performance was assessed using accuracy, precision, recall, F1-score, and AUC-ROC metrics [7, 8]. The best-performing model was a ResNet50 trained from scratch without dropout regularization, achieving a validation accuracy of 97.19%, precision of 96.20%, recall of 96.90%, F1-score of 96.55%, and an AUC-ROC of 0.97. Grad-CAM visualizations offered insights into the model’s decision-making process, enhancing interpretability crucial for clinical use [9,10]. Misclassification analysis showed that data augmentation notably improved classification accuracy, particularly by correcting previously misclassified images [11]. These findings highlight that training ResNet50 without dropout, combined with data augmentation, significantly enhances diagnostic accuracy from histopathological images. Original Contributions: This research offers novel insights by demonstrating that a ResNet50 model without dropout regularization, trained from scratch and with advanced data augmentation techniques, can achieve high diagnostic accuracy and interpretability, paving the way for more reliable AI-powered diagnostics.
17 0
Open Access
Parking Occupancy Classification: Deep learning model compression for edge device classification
(Queen Mary University of London, 2024) Tamim, Ziad; Ansari, Tayyab Ahmed
Urban areas face severe traffic congestion due to poorly managed parking systems. Advanced parking management, like automated and smart parking guidance systems, offers a feasible solution requiring real-tim occupancy data. Traditional sensor-based methods are costly and inefficient for large scale parking, whereas video-based sensing is more effective. Deep learning methods improve accuracy but have high computational costs, affecting real-time performance. Central servers or cloud platforms are often used but can be impractical due to high resource demands. Instead, utilising edge devices with model compression techniques—such as quantisation and knowledge distillation enhances efficiency. This study aims to boost the inference speed of parking classification algorithms by developing a custom model called QCustom based on the MobileNet Depthwise Separable Convolution blocks and using compression techniques to reduce the inference time further. Contributions include developing an edge-based real-time occupancy system, setting performance benchmarks, optimising models for edge devices, and testing on a prototype parking lot. The goal is efficient and accurate parking management for smart cities. Results of the paper include accuracy of 98.8% on the CNRPark-EXT dataset, real world implementation accuracy of 97.44%, and an inference speed for one parking slot of 0.746ms on the Raspberry Pi 5.
44 0
Restricted
LIGHTREFINENET-SFMLEARNER: SEMI-SUPERVISED VISUAL DEPTH, EGO-MOTION AND SEMANTIC MAPPING
(Newcastle University, 2024) Alshadadi, Abdullah Turki; Holder, Chris
The advancement of autonomous vehicles has garnered significant attention, particularly in the development of complex software stacks that enable navigation, decision-making, and planning. Among these, the Perception [1] component is critical, allowing vehicles to understand their surroundings and maintain localisation. Simultaneous Localisation and Mapping (SLAM) plays a key role by enabling vehicles to map unknown environments while tracking their positions. Historically, SLAM has relied on heuristic techniques, but with the advent of the "Perception Age," [2] research has shifted towards more robust, high-level environmental awareness driven by advancements in computer vision and deep learning. In this context, MLRefineNet [3] has demonstrated superior robustness and faster convergence in supervised learning tasks. However, despite its improvements, MLRefineNet struggled to fully converge within 200 epochs when integrated into SfmLearner. Nevertheless, clear improvements were observed with each epoch, indicating its potential for enhancing performance. SfmLearner [4] is a state-of-the-art deep learning model for visual odometry, known for its competitive depth and pose estimation. However, it lacks high-level understanding of the environment, which is essential for comprehensive perception in autonomous systems. This paper addresses this limitation by introducing a multi-modal shared encoder-decoder architecture that integrates both semantic segmentation and depth estimation. The inclusion of high-level environmental understanding not only enhances scene interpretation—such as identifying roads, vehicles, and pedestrians—but also improves the depth estimation of SfmLearner. This multi-task learning approach strengthens the model’s overall robustness, marking a significant step forward in the development of autonomous vehicle perception systems.
43 0
Restricted
A Peer-to-Peer Federated Learning Framework for Intrusion Detection in Autonomous Vehicles
(Lancaster University, 2024-09) Alotaibi, Bassam; Bradbury, Matthew
As autonomous vehicles (AVs) increasingly rely on interconnected systems for enhanced functionality, they also face heightened cyberattack vulnerability. This study introduces a decentralized peer-to-peer federated learning framework to improve intrusion detection in AV environments while preserving data privacy. A novel soft-reordering one-dimensional Convolutional Neural Network (SR-1CNN) is proposed as the detection engine, capable of identifying known and unknown threats with high accuracy. The framework allows vehicles to communicate directly in a mesh topology, sharing model parameters asynchronously, thus eliminating dependency on centralized servers and mitigating single points of failure. The SR-1CNN model was tested on two datasets: NSL-KDD and Car Hacking, under both independent and non-independent data distribution scenarios. The results demonstrate the model’s robustness, achieving detection accuracies of 94.39% on the NSL-KDD dataset and 99.97% on the Car Hacking dataset in independent settings while maintaining strong performance in non-independent configurations. These findings underline the framework’s potential to enhance cybersecurity in AV networks by addressing data heterogeneity and preserving user privacy. This research contributes to the field of AV security by offering a scalable, privacy-conscious intrusion detection solution. Future work will focus on optimizing the SR-1CNN architecture, exploring vertical federated learning approaches, and validating the framework in real-world autonomous vehicle environments to ensure its practical applicability and scalability.
49 0
Restricted
Enhancing a Hyper-parameter Tuning of Convolutional Neural Network Model for Brain Tumor Classification using Whale Optimization and Grey Wolf Optimizer
(Newcastle University, 2024) Alkhudair, Haifa; Freitas, Leo
Brain tumors represent a global health issue, with about 11 new cases per 100,000 people annually. Therefore, it is crucial to develop faster and more accurate diagnostic solutions. This study develops and evaluates a convolutional neural network (CNN) model optimized using the Whale Optimization Algorithm (WOA) and Grey Wolf Optimizer (GWO) for classifying brain tumors. To achieve that, this work involved collecting and preprocessing an MRI brain tumor dataset, followed by building and training CNN models. Hyperparameters were optimized using WOA and GWO, and the performance of these optimized mod- els was compared against a non-optimized CNN. The WOA-optimized CNN outperformed both the non-optimized and GWO-optimized mod- els, achieving an accuracy of 93.4% and demonstrating superior general- ization across different classes. This study underscores the effectiveness of WOA in enhancing CNN models for medical image classification, of- fering promising approaches to enhancing the accuracy and reliability of brain tumor classification
28 0
Restricted
Generative AI for Mitosis Synthesis in Histopathology Images
(University of Surrey, 2024-09) Alkhadra, Rahaf; Rai, Taran; Wells, Kevin
Identifying mitotic figures has been established as an effective method of fighting cancer at its most vulnerable stage. Traditional methods rely on manual, slow, and invasive detection methods obtained from sectioned tissue samples to acquire histopathological images. Currently, Artificial Intelligence (AI) in oncology has produced a paradigm shift in the fight against cancer, also known as computational oncology. This is heavily reliant on the availability of mitotic figure datasets to train models; however, such datasets are limited in size, type, and may infringe on patient privacy. It is hypothesised that the potential of computational oncology can be realised by synthesising realistic and diverse histopathological datasets using Generative Artificial Intelligence (GenAI). This report demonstrates a comparison of Denoising Probabilistic Diffusion Models (DDPM) and StyleGAN3 in generating synthetic histopathology images, with mitotic figures. The MIDOG++ dataset containing human and canine samples with 7 types of tumours was used to train the models. Quality and similarity of generated and real images was evaluated using as Frechet Inception Distance (FID), Mean Square Error (MSE), Structural Similarity Index (SSIM), and Area Under the Curve (AUC) as a part of Receiver Operating Characteristic (ROC) study were incorporated. Our results suggests that the DDPM model is superior in terms of structural accuracy, however, StyleGAN3 capture the colour scheme better.
31 0
Restricted
Automated Pain Assessment Through Facial Expression Using Deep Learning and Image Processing
(University of Reading, 2024-09-13) Alsama, Morady; Patino, Luis
As pain is an unavoidable part of life, this study examines the use of facial expression tech nology in assisting individuals with pain. Accurate pain assessment in health care is essential, especially for non-verbal patients, since conventional methods largely fail because of the in herent subjectivity and self-reporting. Therefore, the present study develops and evaluates an automated pain assessment system through advanced analysis of facial expressions driven by contemporary deep learning techniques. It aims to generate a reliable and unbiased system for detecting and classifying pain intensity. A CNN-based system was developed using base models that apply ResNet-18 and ResNext-50 architectures. A custom-designed final layer was added to optimize classification accuracy, tailored explicitly for pain detection. Comprehensive data preprocessing strategies were used in the model to make it robust; it involved downsam pling and augmentation of the data. It was trained and validated on the UNBC-McMaster Shoulder Pain Expression Archive Database and the Radboud Faces Database, showing an impressive accuracy of over 90% on the training data. However, generalizing the models to unseen validation and test data proved challenging. These findings further articulate the crit ical imperative of enhancing generalisability across diverse patient populations for the system to perform effectively in real-world settings. The results underline the huge potential for deep learning in the automation of pain assessment, while future research remains on better mod eling generalization, promoting integration in clinical settings for a more objective, reliable, and consistent approach to pain management in health care settings.
16 0

SACM - United Kingdom

Browse

Filters

Settings

Sort By

Results per page

Search Results