Fair and Accurate Machine Learning in Dynamic and Multi-domain Settings
dc.contributor.advisor | Singh, Vivek | |
dc.contributor.advisor | Pennock, David | |
dc.contributor.author | Almuzaini, Abdulaziz | |
dc.date.accessioned | 2024-05-07T09:19:43Z | |
dc.date.available | 2024-05-07T09:19:43Z | |
dc.date.issued | 2024-05-01 | |
dc.description.abstract | A multitude of decision-making tasks, such as content moderation, medical diagnosis, misinformation detection, and recidivism prediction, are increasingly being automated due to recent developments in machine learning (ML). While ML models demonstrate superior capabilities in large-scale data processing and complex pattern recognition compared to humans, the decisions they make can profoundly impact individuals' opportunities and lives, so their accuracy and fairness must be assured. Beyond controlled lab environments, ML models are often deployed in real-world settings where the stationarity assumption (i.e., that data are independent and identically distributed, or i.i.d.) is frequently violated, leading to a notable decrease in model effectiveness. Specifically, real-world ML models can be trained on particular domains and deployed in dissimilar domains. These domains encompass diverse time points, heterogeneous population groups, or disparate tasks, and they demand careful, dynamic, ethical, and knowledge-transferring model development techniques. Because many machine learning tasks are dynamic and evolve continuously, a previously trained model may become unfair or erroneous over time. Additionally, machine learning applications can be particularly challenging when data or computational resources are limited, which often requires developers to leverage knowledge from other domains. In this dissertation, we investigate these issues and suggest ways to mitigate the challenges of maintaining fairness and accuracy in dynamic and multi-domain settings. In particular, to address the dynamic setting, we present a pair of anticipatory bias correction techniques that target fairness and accuracy simultaneously in temporally shifting and delayed labeling contexts, supporting the goals of timely and safe model adaptation.
Furthermore, we use transfer learning to study ML performance in developing a fair and accurate dermatological image processing model for skin cancer diagnosis, using datasets gathered from various domains (i.e., locations) and models trained in different contexts (i.e., pre-trained image recognition models). Lastly, we explore the feasibility of combining diverse commercial pre-trained black-box models developed in various domains to jointly enhance fairness and accuracy for a sentiment analysis task. We present an overview of the observed results for each work, discuss the identified limitations, and propose future research directions. These results represent significant progress toward developing fair and accurate ML algorithms in dynamic and multi-domain settings. | |
dc.format.extent | 139 | |
dc.identifier.uri | https://hdl.handle.net/20.500.14154/71947 | |
dc.language.iso | en_US | |
dc.publisher | Rutgers, The State University of New Jersey | |
dc.subject | Machine Learning | |
dc.subject | Artificial Intelligence | |
dc.subject | Fairness | |
dc.subject | Bias in Machine Learning | |
dc.subject | Distribution Shift | |
dc.title | Fair and Accurate Machine Learning in Dynamic and Multi-domain Settings | |
dc.type | Thesis | |
sdl.degree.department | Computer Science | |
sdl.degree.discipline | Ethical Artificial Intelligence | |
sdl.degree.grantor | Rutgers, The State University of New Jersey | |
sdl.degree.name | Doctor of Philosophy |