Do, Anh
Alqarzaai, Abdulmohsen Abdullah
2026-01-13
2025
Alqarzaai, A.A. (2025). PREDICTING CREDIT CARD DEFAULT RISK USING MACHINE LEARNING: A COMPARATIVE STUDY OF TREE-BASED MODELS ON BANK CUSTOMER DATA (MSc dissertation, Swansea University).
Swansea University
https://hdl.handle.net/20.500.14154/77859
This study examined whether modern tree-based models can predict credit-card default better than the usual logistic model while keeping decisions clear, fair, and linked to profit. The background is rising delinquency and charge-offs, which increase the value of accurate and transparent tools. Prior studies rarely tested trees with time-split validation, profit-based cut-offs, and fairness checks in one design, so four hypotheses were set: boosted trees beat logistic regression out of time; imbalance methods raise minority recall with a small loss in AUC; profit-tuned thresholds improve expected profit under risk limits; and SHAP explanations make drivers and group outcomes easy to see. Two public datasets were used: the Taiwan file (~30k rows) and Home Credit (~307k rows with joined tables). The data were cleaned and leakage was avoided; logistic regression, decision trees, Random Forest, XGBoost, LightGBM, and CatBoost were compared; under-sampling, SMOTE, and focal loss were tested; forward-in-time splits were used for Home Credit; and thresholds were selected by expected profit with a guardrail on approved defaults. Boosted trees led on Home Credit with AUC near 0.75, while on Taiwan the logistic baseline stayed competitive. A profit cut-off around 0.21 increased expected profit by over 4.9 billion while staying within the guardrail. Imbalance methods gave modest recall gains. SHAP showed external scores, affordability, and timing as the top drivers, with small approval gaps by gender. The findings imply that trees are preferred for complex data, that profit and risk should guide thresholds, and that time-split testing matters.
Recommendations include boosted trees with profit guardrails, simple fairness monitors, forward-in-time validation, and future work using richer data and stress tests.
66
en
Finance
XGBoost
LightGBM
Data Analysis
Predicting Credit Card Default Risk Using Machine Learning: A Comparative Study of Tree-Based Models on Bank Customer Data
Thesis