PHONOCARDIOGRAM CLASSIFICATION USING MOTIF DISCOVERY

Abstract

The research presented in this thesis is directed at diagnosing heart disease using Phonocardiograms (PCGs), which are plots of heartbeat sound recordings. The central motivation was to provide point-of-care diagnosis using machine learning software shipped with digital stethoscopes, which might then be referred to as intelligent digital stethoscopes. The aim was thus to classify PCGs using time series analysis techniques, more specifically motif-based time series analysis. The main challenge was the size of the PCG time series to be considered. The main contributions of the thesis are four approaches to PCG classification: (i) the MK Benchmark approach, (ii) the PCGseg Classification approach, (iii) the SGR-FMD approach and (iv) the CE-FCS approach. The MK Benchmark approach investigated, for the first time, the application of motif discovery to PCG data. Its fundamental rationale was to provide a baseline against which to compare the alternative approaches presented in the thesis. The PCGseg Classification approach provided a novel bespoke segmentation technique, based on “shapes”, which reduced the processing time by a factor of more than eleven compared with the MK Benchmark approach. The fundamental rationale underpinning the SGR-FMD approach was to prune the time series data by removing sub-sequences that were unlikely to be representative of any class, in order to reduce the complexity of the motif discovery process. In more detail, the rationale was to remove the “silent gaps” from the PCG data. The SGR-FMD approach also featured a novel clustering technique. The runtime was improved by a factor of more than 278 compared with the PCGseg Classification approach. The rationale of the CE-FCS approach was to generate meaningful motifs while at the same time reducing the number of computations. This was applied to the PCG recordings by extracting the heart cycles that represented potential motifs and by considering the statistical distribution of these motifs. The CE-FCS approach produced the best results of the four approaches proposed in this thesis: it recorded the highest accuracy and the shortest runtime. The evaluation data set was a canine PCG data set obtained from the School of Veterinary Science, Small Animal Teaching Hospital, the University of Liverpool, who collaborated on the research. This data set featured four classes and was labelled by domain experts.
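As a rough illustration of two ideas named in the abstract (pruning “silent gaps” and then searching the remaining signal for motifs), the following is a minimal sketch and not the thesis implementation. The function names, the amplitude threshold, the minimum gap length, and the brute-force closest-pair search with z-normalised Euclidean distance are all assumptions made for illustration; the abstract does not specify how SGR-FMD detects gaps or how motifs are scored.

```python
import math


def remove_silent_gaps(signal, threshold, min_gap):
    """Drop runs of at least `min_gap` consecutive samples whose absolute
    amplitude is below `threshold`. Both parameters are hypothetical."""
    kept = []
    run = []  # current run of low-amplitude samples
    for x in signal:
        if abs(x) < threshold:
            run.append(x)
            continue
        if len(run) < min_gap:
            kept.extend(run)  # short quiet run: keep it
        run = []
        kept.append(x)
    if len(run) < min_gap:
        kept.extend(run)  # trailing quiet run shorter than a gap
    return kept


def znorm(seq):
    """Z-normalise a subsequence; a constant subsequence maps to zeros."""
    mean = sum(seq) / len(seq)
    std = math.sqrt(sum((v - mean) ** 2 for v in seq) / len(seq))
    if std == 0:
        return [0.0] * len(seq)
    return [(v - mean) / std for v in seq]


def motif_pair(series, m):
    """Brute-force search for the closest pair of non-overlapping
    subsequences of length `m`. Returns (distance, index_a, index_b).
    This is quadratic in the series length, which is why pruning the
    series first reduces the work."""
    subs = [znorm(series[i:i + m]) for i in range(len(series) - m + 1)]
    best = (math.inf, None, None)
    for i in range(len(subs)):
        for j in range(i + m, len(subs)):  # skip overlapping windows
            d = math.dist(subs[i], subs[j])
            if d < best[0]:
                best = (d, i, j)
    return best
```

Because the pair search compares every window with every other window, shortening the series before the search reduces the number of distance computations roughly quadratically, which is consistent with the motivation the abstract gives for pruning.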


Copyright owned by the Saudi Digital Library (SDL) © 2025