Addressing the Cold-Start Problem in Active Learning for Improved model Performance

dc.contributor.advisor: Herrera, Luis Carlos
dc.contributor.author: Alshamrani, Rahaf
dc.date.accessioned: 2024-12-15T09:27:52Z
dc.date.issued: 2024-08
dc.description.abstract: This study addresses the cold-start issue in active learning with an approach that improves model performance through optimized selection of the data to annotate. The cold-start problem, in which a model has minimal labeled data to start training with, is particularly challenging in active learning. To maximize annotation efficiency and enhance overall model performance, we propose training a model to determine which subset of unlabeled data points is the most informative for annotation. By carefully choosing these first data points from a large pool of unlabeled samples, we aim to reduce the human effort needed for annotation while ensuring the model receives the most effective training data. The project therefore has three goals: to enhance the performance of machine learning models, to minimize the burden of human annotation, and to provide an approach for choosing informative instances. Through comprehensive experimentation and analysis, we demonstrate that TypiClust significantly enhances model accuracy and robustness. We compare the proposed approach with random sampling and find that TypiClust performs better and provides a valuable framework for addressing the cold-start issue in various active learning applications.
dc.format.extent: 57
dc.identifier.uri: https://hdl.handle.net/20.500.14154/74185
dc.language.iso: en
dc.publisher: King's College London
dc.subject: Active Learning
dc.subject: Cold-Start problem
dc.subject: TypiClust
dc.subject: Random Sampling
dc.subject: Labeled
dc.subject: Unlabeled
dc.title: Addressing the Cold-Start Problem in Active Learning for Improved model Performance
dc.type: Thesis
sdl.degree.department: Department of Informatics
sdl.degree.discipline: Artificial Intelligence
sdl.degree.grantor: King's College London
sdl.degree.name: Master of Science

Files

Original bundle

Name: SACM-Dissertation.pdf
Size: 761.21 KB
Format: Adobe Portable Document Format

Copyright owned by the Saudi Digital Library (SDL) © 2025