Title: Leveraging Intermediate Training for Effective Fine-Tuning in Few-Labels Text Classification
Author: Alsuhaibani, Abdullah Mohammed
Advisors: Xu, Guandong; Wang, Xianzhi; Razzak, Imran
Date: 2025-12-22 (2025)
URI: https://hdl.handle.net/20.500.14154/77609
Type: Thesis
Pages: 162
Language: en
Keywords: Text Classification; Few Labels; Fine-Tuning

Abstract:
Collecting and classifying large-scale datasets typically requires extensive human annotation, making the process both time-consuming and expensive. Moreover, managing the vast amount of textual data from diverse sources, such as social media platforms, demands efficient architectures for scalable text classification. The introduction of the Transformer architecture in 2017 revolutionized natural language processing by establishing the paradigm of pre-trained models, such as BERT, which are initially trained on large corpora and adapted to downstream tasks through fine-tuning. This thesis addresses the core problem of fine-tuning pre-trained language models under limited-label conditions. Specifically, it highlights three key challenges: performance degradation with few labels, class imbalance in short text, and the limited effectiveness of clustering-based methods in merging semantically meaningful groups. While prior solutions, such as few-shot and semi-supervised learning, attempt to mitigate these issues, they often rely on large datasets or complex architectures. This thesis aims to leverage intermediate training to enhance the fine-tuning of language models for downstream text classification tasks under limited-label scenarios. To address these limitations, we propose a novel model that employs dual clustering algorithms to enhance fine-tuning performance under limited-label conditions, using a small portion of unlabeled data for correction. We further introduce a two-stage framework that addresses class imbalance in few-label short text classification by leveraging contrastive learning as an indicator together with generated representations. Finally, we present an intermediate training framework that preserves cluster quality while reducing cluster quantity, thereby achieving better alignment with the true class distributions of the datasets. The empirical results and findings presented in this thesis demonstrate the effectiveness of the proposed frameworks, which outperform both baseline and state-of-the-art models, with an accuracy improvement of at least 5.3%.
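To make the pre-train, intermediate-train, then fine-tune pipeline summarized in the abstract concrete, the minimal sketch below shows the general idea with a BERT-style encoder. It is an illustrative assumption only, not the implementation described in the thesis: the backbone name (bert-base-uncased), the pseudo-labels, the toy texts, and the hyperparameters are placeholders for demonstration.

```python
# Illustrative sketch only: a generic "pre-train -> intermediate-train -> fine-tune"
# pipeline with a BERT-style encoder. The backbone name, pseudo-labels, toy texts,
# and hyperparameters are assumptions, not the thesis's actual method.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # assumed backbone; any pre-trained encoder works

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=3)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)


def train_on(texts, labels, epochs=1):
    """Run a simple supervised pass; reused for both training stages below."""
    model.train()
    for _ in range(epochs):
        for text, label in zip(texts, labels):
            batch = tokenizer(text, return_tensors="pt", truncation=True)
            out = model(**batch, labels=torch.tensor([label]))
            out.loss.backward()
            optimizer.step()
            optimizer.zero_grad()


# Stage 1 (intermediate training): adapt the pre-trained encoder on an auxiliary
# signal, e.g. pseudo-labels derived from clustering unlabeled texts (hypothetical data).
train_on(["unlabeled short text A", "unlabeled short text B"], [0, 1])

# Stage 2 (fine-tuning): train on the small set of gold-labeled examples.
train_on(["labeled short text C", "labeled short text D"], [2, 0])
```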