Fair and Accurate Machine Learning in Dynamic and Multi-domain Settings

dc.contributor.advisor: Singh, Vivek
dc.contributor.advisor: Pennock, David
dc.contributor.author: Almuzaini, Abdulaziz
dc.date.accessioned: 2024-05-07T09:19:43Z
dc.date.available: 2024-05-07T09:19:43Z
dc.date.issued: 2024-05-01
dc.description.abstract: A multitude of decision-making tasks, such as content moderation, medical diagnosis, misinformation detection, and recidivism prediction, are increasingly being automated due to recent developments in machine learning (ML). While ML models demonstrate superior capabilities in large-scale data processing and complex pattern recognition compared to humans, the decisions they make can profoundly impact individuals' opportunities and lives, so their accuracy and fairness must be assured. Beyond controlled lab environments, ML models are often deployed in real-world settings where the stationarity assumption (i.e., that data are independent and identically distributed, or i.i.d.) is frequently violated, leading to a notable decrease in model effectiveness. Specifically, real-world ML models can be trained on particular domains and deployed in dissimilar ones. These domains encompass diverse time points, heterogeneous population groups, or disparate tasks, and they demand careful, dynamic, ethical, and knowledge-transferring model development techniques. Because many machine learning tasks are dynamic and evolve continuously, a previously trained model may become unfair or erroneous over time. Additionally, machine learning applications can be particularly challenging when data or computational resources are limited, which often requires developers to leverage knowledge from other domains. In this dissertation, we investigate these issues and suggest ways to mitigate the challenges of maintaining fairness and accuracy in dynamic and multi-domain settings. In particular, to mitigate the impact of temporal dynamics, we present a pair of anticipatory bias correction techniques that target fairness and accuracy simultaneously in temporally shifting and delayed labeling contexts, supporting the goals of timely and safe model adaptation.
Furthermore, we leverage transfer learning to study ML performance on a dermatological image processing task for skin cancer diagnosis, targeting both fairness and accuracy, using datasets gathered from various domains (i.e., locations) and models trained in different contexts (i.e., pre-trained image recognition models). Lastly, we explore the feasibility of combining diverse commercial pre-trained black-box models developed in various domains to jointly enhance fairness and accuracy for a sentiment analysis task. We present an overview of the observed results for each work, discuss the identified limitations, and propose future research directions. These results represent significant progress toward developing fair and accurate ML algorithms in dynamic and multi-domain settings.
dc.format.extent: 139
dc.identifier.uri: https://hdl.handle.net/20.500.14154/71947
dc.language.iso: en_US
dc.publisher: Rutgers, The State University of New Jersey
dc.subject: Machine Learning
dc.subject: Artificial Intelligence
dc.subject: Fairness
dc.subject: Bias in Machine Learning
dc.subject: Distribution Shift
dc.title: Fair and Accurate Machine Learning in Dynamic and Multi-domain Settings
dc.type: Thesis
sdl.degree.department: Computer Science
sdl.degree.discipline: Ethical Artificial Intelligence
sdl.degree.grantor: Rutgers, The State University of New Jersey
sdl.degree.name: Doctor of Philosophy

Copyright owned by the Saudi Digital Library (SDL) © 2024