AI for Fraud Detection

dc.contributor.advisor: Muneeb, Ahmad
dc.contributor.author: Albaqami, Abdullah
dc.date.accessioned: 2026-03-17T22:58:30Z
dc.date.issued: 2026
dc.description.abstract: Financial fraud is growing rapidly in digital payment systems, exposing the shortcomings of fixed rule-based controls and of machine learning models that perform well in testing but fail in live operational environments. This study develops and evaluates a complete fraud detection pipeline designed to address three persistent challenges: severe class imbalance, model instability under shifting data distributions, and the need for transparent decision outputs required by regulators and financial institutions. The pipeline integrates systematic data preprocessing, an optimised LightGBM model, and SHAP-based interpretability, using the IEEE-CIS dataset of 590,540 transactions. The methodology includes memory optimisation, structured missing-value treatment, outlier handling through winsorisation, label encoding for high-cardinality categorical fields, temporal feature engineering, and correlation-based feature reduction. LightGBM hyperparameters are tuned with Optuna, a Bayesian optimisation framework, using ROC-AUC as the objective function. Model performance is measured with ROC-AUC, PR-AUC, precision, recall, F1-score, and a confusion matrix, following best practice for imbalanced classification. SHAP analysis produces both global and local explanations of model behaviour. The final model achieves strong discriminative performance, with a ROC-AUC of 0.9606 and a PR-AUC of 0.8042. The accuracy (0.7335) and recall (0.7491) indicate balanced detection, and the confusion matrix shows good fraud detection with a manageable number of false positives. SHAP analysis identifies count-based features, transaction amount, card identifiers, geographic features, and temporal patterns as the key predictive features, consistent with established fraud behaviours reported in the recent literature.
The results demonstrate that the performance gains stem not only from the choice of model but also from the complementary effects of data engineering, hyperparameter optimisation, and interpretability. The study concludes that an end-to-end pipeline improves detection accuracy, increases transparency, and overcomes fundamental limitations found in previous fraud studies. Limitations include the anonymisation of the dataset, the lack of drift analysis, and the possible loss of fraud indicators during preprocessing.
dc.format.extent: 58
dc.identifier.citation: APA
dc.identifier.uri: https://hdl.handle.net/20.500.14154/78488
dc.language.iso: en
dc.publisher: Saudi Digital Library
dc.subject: AI
dc.subject: Fraud
dc.subject: financial fraud detection
dc.subject: LightGBM
dc.subject: Optimization
dc.subject: IEEE
dc.title: AI for Fraud Detection
dc.type: Thesis
sdl.degree.department: Department of Computer Science
sdl.degree.discipline: Data Science
sdl.degree.grantor: Swansea University
sdl.degree.name: Master
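
Illustrative sketch (not taken from the thesis): the abstract names winsorisation, label encoding, and precision/recall/F1 among its preprocessing and evaluation steps. The standard-library Python below shows one plausible way each step might look. The function names, the nearest-rank percentile rule, the 10th/90th percentile cut-offs, the sample values, and the confusion-matrix counts are all assumptions made for illustration; the thesis may implement these steps differently.

```python
import math


def winsorize(values, lower_pct=1.0, upper_pct=99.0):
    """Clip values to the given percentiles (nearest-rank rule, assumed here)."""
    ordered = sorted(values)
    n = len(ordered)

    def percentile(p):
        # Nearest-rank percentile: smallest value with at least p% of data at or below it.
        rank = max(1, math.ceil(p * n / 100))
        return ordered[rank - 1]

    low, high = percentile(lower_pct), percentile(upper_pct)
    return [min(max(v, low), high) for v in values]


def label_encode(categories):
    """Map each distinct category to an integer code, in order of first appearance."""
    codes = {}
    encoded = [codes.setdefault(c, len(codes)) for c in categories]
    return encoded, codes


def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical transaction amounts with one extreme outlier.
amounts = [5.0, 12.5, 20.0, 18.0, 9.5, 15.0, 11.0, 14.0, 13.0, 5000.0]
clipped = winsorize(amounts, lower_pct=10, upper_pct=90)

# Hypothetical card-network field.
encoded, mapping = label_encode(["visa", "mastercard", "visa", "amex"])

# Hypothetical confusion-matrix counts, not the thesis's results.
precision, recall, f1 = precision_recall_f1(tp=75, fp=25, fn=25)
```

With these sample values the outlier of 5000.0 is clipped to the 90th-percentile value, so it no longer dominates the scale of the amount feature, while integer codes keep a categorical field compact for a tree-based model.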

Files

Original bundle

Name: SACM-Dissertation.pdf
Size: 1.06 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.61 KB
Format: Item-specific license agreed to upon submission

Copyright owned by the Saudi Digital Library (SDL) © 2026