Multi-Class Part Parsing based on Deep Learning
dc.contributor.advisor | Wu, Jing | |
dc.contributor.advisor | Lai, Yu-Kun | |
dc.contributor.advisor | Ji, Ze | |
dc.contributor.author | Alsudays, Njuod | |
dc.date.accessioned | 2025-03-11T08:27:58Z | |
dc.date.issued | 2024 | |
dc.description.abstract | Multi-class part parsing is a dense prediction task that seeks to simultaneously detect multiple objects and the semantic parts within these objects in a scene. This problem is important for detailed object understanding but is challenging because of both class-level and part-level ambiguities. This thesis investigates recent advances in deep learning to tackle the challenges in multi-class part parsing. First, the AFPSNet network is proposed, which integrates scaled attention and feature fusion to tackle part-level ambiguity and thereby improve part prediction accuracy. The attention enhances feature representations by focusing on important features, while the feature fusion improves how features at different scales are combined. An object-to-part training strategy is also used to address class-level ambiguity, improving the localisation of parts by exploiting prior knowledge of objects. Building on this foundation, the GRPSNet framework is introduced to further enhance the performance of multi-class part parsing. This framework integrates graph reasoning to capture relationships between parts, which help the recognition and localisation of parts and thereby improve part segmentation. Moreover, the relationships of part boundaries are exploited to further enhance the accuracy of part segmentation. To further refine part segmentation, Multi-Class Boundaries are integrated into the AFPSNet network. This integration aims to accurately identify and focus on the spatial boundaries of part classes, thereby enhancing the overall segmentation quality. Experimental results demonstrate the effectiveness of the proposed networks. Various evaluations, including ablation studies and comparisons with existing methods, were conducted on the widely used PASCAL-Part benchmark dataset and the large-scale ADE20K-Part benchmark dataset.
These experiments validate the research hypotheses, showing notable improvements in part localisation and segmentation accuracy. | |
dc.format.extent | 154 | |
dc.identifier.uri | https://hdl.handle.net/20.500.14154/75012 | |
dc.language.iso | en | |
dc.publisher | Cardiff University | |
dc.subject | Part parsing | |
dc.subject | Semantic segmentation | |
dc.subject | Scaled attention | |
dc.subject | Feature fusion | |
dc.subject | Graph reasoning | |
dc.subject | Multi-class boundaries | |
dc.subject | Deep learning | |
dc.title | Multi-Class Part Parsing based on Deep Learning | |
dc.type | Thesis | |
sdl.degree.department | School of Computer Science & Informatics | |
sdl.degree.discipline | Computer Vision | |
sdl.degree.grantor | Cardiff University | |
sdl.degree.name | Doctor of Philosophy |