Auditing Machine Learning Models Beyond Predefined Groups: A Multi-Level Framework for Systematic Analysis of Performance Disparities

dc.contributor.advisorWojtusiak, Janusz
dc.contributor.authorAljoudi, Salman Mohammed
dc.date.accessioned2026-04-28T07:17:07Z
dc.date.issued2026
dc.description.abstractThis dissertation develops and validates a multilevel framework for detecting performance disparities in machine learning models used to predict substance use disorder (SUD) treatment outcomes. Aim One builds the first component of the framework, using permutation-based feature importance to construct subgroups from the most influential predictors. All three predictive models (logistic regression, random forest, and gradient boosting) exhibit substantial marginal performance differences across these feature-defined subgroups, particularly in recall and fairness-related metrics, showing that individual influential features can induce unequal model performance. Aim Two extends disparity detection by applying hierarchical clustering to the same influential features, uncovering latent subgroups defined by multivariate structure. Cluster-level analyses reveal heterogeneous performance patterns, including clusters with systematically lower recall, calibration inconsistencies, and elevated fairness-metric deviations. Interaction-level evaluation shows that many disparities are context-dependent and appear only within specific cluster-feature combinations. Aim Three statistically evaluates the stability and sources of these disparities across repeated model runs. Feature-level disparities are consistently significant and largely unaffected by subgroup size imbalance, whereas cluster-level disparities are only selectively detectable and, in some cases, depend on representation imbalance. Interaction-level tests isolate the most persistent, context-stable disparities, those that remain significant after accounting for latent subgroup structure, demonstrating the necessity of multilayer evaluation. The dissertation concludes that a unified, multilevel disparity-detection framework spanning marginal, latent, and conditional subgroup definitions is essential for identifying reliable and actionable performance gaps in healthcare predictive models, and that this approach offers a scalable, reproducible path toward more equitable machine-learning-based evaluation of SUD treatment.
dc.format.extent260
dc.identifier.urihttps://hdl.handle.net/20.500.14154/78782
dc.language.isoen_US
dc.publisherSaudi Digital Library
dc.subjectDisparity detection
dc.subjectMultilevel evaluation framework
dc.subjectMachine learning fairness
dc.subjectPermutation-based feature importance
dc.subjectFeature-defined subgroups
dc.subjectLatent subgroup analysis
dc.subjectContext-dependent performance gaps
dc.titleAuditing Machine Learning Models Beyond Predefined Groups: A Multi-Level Framework for Systematic Analysis of Performance Disparities
dc.typeThesis
sdl.degree.departmentHealth Administration and Policy
sdl.degree.disciplineHealth Informatics
sdl.degree.grantorGeorge Mason University
sdl.degree.nameDoctor of Philosophy in Health Services Research
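
To make the framework described in the abstract concrete, the sketch below walks through the three evaluation levels on synthetic data. It is a minimal, hypothetical illustration: the synthetic dataset, the scikit-learn and SciPy tooling, the median split, the three-cluster cut, and the 20-seed stability loop are assumptions chosen for exposition, not the dissertation's actual pipeline.

# Minimal sketch of a three-level disparity audit (all data and
# parameter choices here are illustrative assumptions).
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for SUD treatment-outcome data (assumption).
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
y_pred = model.predict(X_te)

# Level 1 (Aim One): marginal subgroups from the top permutation-important
# feature, here via a simple median split (assumption).
imp = permutation_importance(model, X_te, y_te, n_repeats=10, random_state=0)
top = np.argsort(imp.importances_mean)[::-1][:3]  # three most influential features
f0 = top[0]
high = X_te[:, f0] > np.median(X_te[:, f0])
for name, mask in [("low", ~high), ("high", high)]:
    print(f"feature {f0} {name}: recall = {recall_score(y_te[mask], y_pred[mask]):.3f}")

# Level 2 (Aim Two): latent subgroups via hierarchical (Ward) clustering
# on the same influential features; the three-cluster cut is arbitrary.
Z = linkage(X_te[:, top], method="ward")
clusters = fcluster(Z, t=3, criterion="maxclust")
for c in np.unique(clusters):
    mask = clusters == c
    print(f"cluster {c} (n = {mask.sum()}): recall = {recall_score(y_te[mask], y_pred[mask]):.3f}")

# Level 3 (Aim Three): crude stability check, re-fitting across random
# seeds to see whether the feature-level recall gap persists. The
# dissertation applies formal statistical tests; this loop only reports
# the gap's mean and spread across runs.
gaps = []
for seed in range(20):
    Xa, Xb, ya, yb = train_test_split(X, y, random_state=seed)
    m = RandomForestClassifier(random_state=seed).fit(Xa, ya)
    p = m.predict(Xb)
    s = Xb[:, f0] > np.median(Xb[:, f0])
    gaps.append(recall_score(yb[s], p[s]) - recall_score(yb[~s], p[~s]))
print(f"recall gap across 20 runs: mean {np.mean(gaps):+.3f}, sd {np.std(gaps):.3f}")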

Files

Original bundle

Name: SACM-Dissertation.pdf
Size: 15.46 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.61 KB
Description: Item-specific license agreed to upon submission

Copyright owned by the Saudi Digital Library (SDL) © 2026