AI for Fraud Detection

Albaqami, Abdullah

dc.contributor.advisor	Muneeb, Ahmad
dc.contributor.author	Albaqami, Abdullah
dc.date.accessioned	2026-03-17T22:58:30Z
dc.date.issued	2026
dc.description.abstract	Financial fraud is growing rapidly in digital payment systems, exposing the shortcomings of fixed rule-based controls and of machine learning models that perform well in testing but fail in real operational environments. This study develops and evaluates a complete fraud detection pipeline designed to address three persistent challenges: severe class imbalance, model instability under shifting data distributions, and the need for transparent decision outputs required by regulators and financial institutions. The pipeline integrates systematic data preprocessing, an optimised LightGBM model, and SHAP-based interpretability, using the IEEE-CIS dataset of 590,540 transactions. The methodology includes memory optimisation, structured missing-value treatment, outlier handling through winsorization, label encoding for high-cardinality categorical fields, temporal feature engineering, and correlation-based feature reduction. LightGBM hyperparameters are tuned with Optuna, a Bayesian optimisation framework, using ROC-AUC as the objective function. Model performance is measured with ROC-AUC, PR-AUC, precision, recall, F1-score, and a confusion matrix, following best practice for imbalanced classification. SHAP analysis produces both global and local explanations of model behaviour. The final model achieves strong discriminative performance, with a ROC-AUC of 0.9606 and a PR-AUC of 0.8042. The accuracy (0.7335) and recall (0.7491) indicate balanced detection, and the confusion matrix shows good fraud detection with a manageable number of false positives. SHAP analysis identifies count-based features, transaction amount, card identifiers, geographic features, and temporal patterns as the key predictive features, consistent with established fraud behaviours reported in the recent literature.
The results demonstrate that the performance gains stem not only from the choice of model but also from the complementary effects of data engineering, hyperparameter optimisation, and interpretability. The study concludes that an end-to-end pipeline improves detection accuracy, increases transparency, and overcomes fundamental limitations identified in previous fraud studies. Limitations include the anonymisation of the dataset, the lack of drift analysis, and the possible loss of fraud indicators during preprocessing.
dc.format.extent	58
dc.identifier.citation	APA
dc.identifier.uri	https://hdl.handle.net/20.500.14154/78488
dc.language.iso	en
dc.publisher	Saudi Digital Library
dc.subject	AI
dc.subject	Fraud
dc.subject	financial fraud detection
dc.subject	LightGBM
dc.subject	Optimization
dc.subject	IEEE
dc.title	AI for Fraud Detection
dc.type	Thesis
sdl.degree.department	Department of Computer Science
sdl.degree.discipline	Data Science
sdl.degree.grantor	Swansea University
sdl.degree.name	Master

Files

Original bundle

Name: SACM-Dissertation.pdf
Size: 1.06 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.61 KB
Format: Item-specific license agreed to upon submission

Collections

SACM - United Kingdom