An Exploration of Methodologies to Improve Semi-supervised Hierarchical Clustering with Knowledge-Based Constraints

dc.contributor.advisor: Prof. Eran Edirisinghe, Dr. Christian Dawson and Dr. Daphne Teck Ching Lai
dc.contributor.author: ABEER AHMED HAMAD ALJOHANI
dc.date: 2019
dc.date.accessioned: 2022-05-28T18:51:45Z
dc.date.available: 2022-05-28T18:51:45Z
dc.degree.department: Computer sciences
dc.degree.grantor: Loughborough University
dc.description.abstract: Clustering algorithms with constraints (also known as semi-supervised clustering algorithms) have been introduced to the field of machine learning as a significant variant of conventional unsupervised clustering algorithms. They have been shown to achieve better performance by integrating prior knowledge into the clustering process, which helps uncover relevant, useful information in the data being clustered. However, the development of semi-supervised hierarchical clustering techniques remains an open and active area of investigation. The majority of current semi-supervised clustering algorithms are developed as partitional clustering (PC) methods, and only a few research efforts have addressed semi-supervised hierarchical clustering. The aim of this research is to enhance hierarchical clustering (HC) algorithms using prior knowledge by adopting novel methodologies. Such prior knowledge is translated into triple-wise relative constraints, which can be applied effectively in hierarchical clustering.
The research presented in this thesis makes four contributions: (i) a novel clustering algorithm that supports six agglomerative linkage measures with triple-wise relative constraints, together with a critical investigation of its performance under various parameters, including distance metrics, linkage methods and different levels of constraints; (ii) improvements to the effectiveness of the Constrained Ward's Hierarchical Agglomerative Clustering (CWHAC) algorithm, by addressing constraint violation and redundancy, and to its efficiency, by reducing the time-consuming process of generating constraints; (iii) a novel hybrid clustering approach for the Constrained Ward's Hierarchical algorithm, underpinned by the intelligent k-means clustering algorithm (CWHC-IKM) for cluster initialization, which addresses the challenges of typical agglomerative clustering approaches; and (iv) a novel framework for handling noisy or irrelevant features, the Constrained Weighted Ward Hierarchical Clustering algorithm based on the intelligent k-means algorithm (CWWHCIKM), which combines a feature weighting approach with semi-supervised clustering. The thesis presents a rigorous performance analysis of the proposed Semi-Supervised Hierarchical Clustering (ssHC) algorithms, demonstrating their superiority in data clustering.
dc.identifier.uri: https://drepo.sdl.edu.sa/handle/20.500.14154/39160
dc.language.iso: en
dc.title: An Exploration of Methodologies to Improve Semi-supervised Hierarchical Clustering with Knowledge-Based Constraints
sdl.thesis.level: Doctoral
sdl.thesis.source: SACM - United Kingdom
