Towards Cost-Effective Noise-Resilient Machine Learning Solutions

Date

2026-06-04

Publisher

University of Georgia

Abstract

Machine learning models have achieved exceptional performance across many applications, driven largely by the emergence of large labeled datasets. Although many datasets are available, acquiring high-quality labels remains challenging because it requires extensive human supervision or expert annotation, both of which are labor-intensive and time-consuming. The problem is magnified by the considerable label noise present in real-world datasets, which significantly degrades model accuracy. Obtaining clean labels is therefore a critical problem, yet substantially reducing label noise in real-world datasets is difficult without hiring expensive expert annotators. Drawing on extensive testing and research, this dissertation examines how different levels of label noise affect the accuracy of machine learning models. It also investigates ways to reduce labeling costs without sacrificing the required accuracy. Finally, to improve the robustness of machine learning models and mitigate the pervasive problem of label noise, this dissertation presents a novel, cost-effective approach called Self Enhanced Supervised Training (SEST).

Keywords

Class Label Noise, Deep Learning, Ensemble Learning, Labeling, Cost Optimization, Machine Learning, Mislabeled Data

Copyright owned by the Saudi Digital Library (SDL) © 2024