Auditing Machine Learning Models Beyond Predefined Groups: A Multi-Level Framework for Systematic Analysis of Performance Disparities

dc.contributor.advisorWojtusiak, Janusz
dc.contributor.authorAljoudi, Salman Mohammed
dc.date.accessioned2026-04-28T07:17:07Z
dc.date.issued2026
dc.description.abstractThis dissertation develops and validates a multilevel framework for detecting performance disparities in machine learning models used to predict substance use disorder (SUD) treatment outcomes. Aim One builds the first component of the framework, using permutation-based feature importance to construct subgroups from the most influential predictors. All three predictive models (logistic regression, random forest, and gradient boosting) exhibit substantial marginal performance differences across these feature-defined subgroups, particularly in recall and fairness-related metrics, showing that individual influential features can induce unequal model performance. Aim Two extends disparity detection by applying hierarchical clustering to the same influential features, uncovering latent subgroups defined by multivariate structure. Cluster-level analyses reveal heterogeneous performance patterns, including clusters with systematically lower recall, calibration inconsistencies, and elevated fairness-metric deviations. Interaction-level evaluation shows that many disparities are context-dependent and appear only within specific cluster-feature combinations. Aim Three statistically evaluates the stability and sources of these disparities across repeated model runs. Feature-level disparities are consistently significant and largely unaffected by subgroup size imbalance, whereas cluster-level disparities are only selectively detectable and, in some cases, depend on representation imbalance. Interaction-level tests isolate the most persistent, context-stable disparities, those that remain significant after accounting for latent subgroup structure, demonstrating the necessity of multilayer evaluation. The dissertation concludes that a unified, multilevel disparity-detection framework spanning marginal, latent, and conditional subgroup definitions is essential for identifying reliable and actionable performance gaps in healthcare predictive models, and that this approach offers a scalable, reproducible path toward more equitable machine-learning-based evaluation of SUD treatment.
dc.format.extent260
dc.identifier.urihttps://hdl.handle.net/20.500.14154/78782
dc.language.isoen_US
dc.publisherSaudi Digital Library
dc.subjectDisparity detection
dc.subjectMultilevel evaluation framework
dc.subjectMachine learning fairness
dc.subjectPermutation-based feature importance
dc.subjectFeature-defined subgroups
dc.subjectLatent subgroup analysis
dc.subjectContext-dependent performance gaps
dc.titleAuditing Machine Learning Models Beyond Predefined Groups: A Multi-Level Framework for Systematic Analysis of Performance Disparities
dc.typeThesis
sdl.degree.departmentHealth Administration and Policy
sdl.degree.disciplineHealth Informatics
sdl.degree.grantorGeorge Mason University
sdl.degree.nameDoctor of Philosophy in Health Services Research
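
To make the framework described in the abstract concrete, the sketch below walks through the three evaluation levels on synthetic data. It is a minimal, hypothetical illustration: the synthetic dataset, the scikit-learn and SciPy tooling, the median split, the three-cluster cut, and the 20-seed stability loop are assumptions chosen for exposition, not the dissertation's actual pipeline.

# Minimal sketch of a three-level disparity audit (all data and
# parameter choices here are illustrative assumptions).
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for SUD treatment-outcome data (assumption).
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
y_pred = model.predict(X_te)

# Level 1 (Aim One): marginal subgroups from the top permutation-important
# feature, here via a simple median split (assumption).
imp = permutation_importance(model, X_te, y_te, n_repeats=10, random_state=0)
top = np.argsort(imp.importances_mean)[::-1][:3]  # three most influential features
f0 = top[0]
high = X_te[:, f0] > np.median(X_te[:, f0])
for name, mask in [("low", ~high), ("high", high)]:
    print(f"feature {f0} {name}: recall = {recall_score(y_te[mask], y_pred[mask]):.3f}")

# Level 2 (Aim Two): latent subgroups via hierarchical (Ward) clustering
# on the same influential features; the three-cluster cut is arbitrary.
Z = linkage(X_te[:, top], method="ward")
clusters = fcluster(Z, t=3, criterion="maxclust")
for c in np.unique(clusters):
    mask = clusters == c
    print(f"cluster {c} (n = {mask.sum()}): recall = {recall_score(y_te[mask], y_pred[mask]):.3f}")

# Level 3 (Aim Three): crude stability check, re-fitting across random
# seeds to see whether the feature-level recall gap persists. The
# dissertation applies formal statistical tests; this loop only reports
# the gap's mean and spread across runs.
gaps = []
for seed in range(20):
    Xa, Xb, ya, yb = train_test_split(X, y, random_state=seed)
    m = RandomForestClassifier(random_state=seed).fit(Xa, ya)
    p = m.predict(Xb)
    s = Xb[:, f0] > np.median(Xb[:, f0])
    gaps.append(recall_score(yb[s], p[s]) - recall_score(yb[~s], p[~s]))
print(f"recall gap across 20 runs: mean {np.mean(gaps):+.3f}, sd {np.std(gaps):.3f}")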

Files

Original bundle

Name: SACM-Dissertation.pdf
Size: 15.46 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.61 KB
Description: Item-specific license agreed to upon submission

Copyright owned by the Saudi Digital Library (SDL) © 2026