Fair and Accurate Machine Learning in Dynamic and Multi-domain Settings

Date

2024-05-01

Publisher

Rutgers, The State University of New Jersey

Abstract

A multitude of decision-making tasks, such as content moderation, medical diagnosis, misinformation detection, and recidivism prediction, are increasingly being automated due to recent developments in machine learning (ML). While ML models demonstrate superior capabilities in large-scale data processing and complex pattern recognition compared to humans, the decisions they make can profoundly impact individuals' opportunities and lives, so their accuracy and fairness must be assured. Beyond controlled lab environments, ML models are often deployed in real-world settings where stationarity (i.e., the assumption that data are independent and identically distributed, or i.i.d.) is frequently violated, leading to a notable decrease in model effectiveness. Specifically, real-world ML models can be trained on particular domains and deployed in dissimilar ones. These domains encompass diverse time points, heterogeneous population groups, or disparate tasks, and they demand careful, dynamic, ethical, and knowledge-transferring model development techniques. Because many ML tasks are dynamic and evolve continuously, a previously trained model may become unfair or erroneous over time. Additionally, ML applications can be particularly challenging when data or computational resources are limited, which often requires developers to leverage knowledge from other domains. In this dissertation, we investigate these issues and propose ways to maintain fairness and accuracy in dynamic and multi-domain settings. In particular, to mitigate the impact of temporal dynamics, we present a pair of anticipatory bias correction techniques that target fairness and accuracy simultaneously in temporally shifting and delayed-labeling contexts, supporting timely and safe model adaptation.
Furthermore, we leverage transfer learning to study ML performance when developing a fair and accurate dermatological image processing model for skin cancer diagnosis, using datasets gathered from various domains (i.e., locations) and models trained in different contexts (i.e., pre-trained image recognition models). Lastly, we explore the feasibility of combining diverse commercial pre-trained black-box models developed in various domains to jointly enhance fairness and accuracy for a sentiment analysis task. We present an overview of the observed results for each work, discuss the identified limitations, and propose future research directions. These results represent significant progress toward developing fair and accurate ML algorithms in dynamic and multi-domain settings.

Keywords

Machine Learning, Artificial Intelligence, Fairness, Bias in Machine Learning, Distribution Shift

Copyright owned by the Saudi Digital Library (SDL) © 2024