TOPOLOGICAL DATA ANALYSIS (TDA) AS A FEATURE EXTRACTION TOOL FOR EEG SIGNAL ANALYSIS IN SLEEP STAGING

Date

2025

Publisher

Saudi Digital Library

Abstract

Sleep is a biological process essential to all living organisms. In humans, it plays a fundamental role in emotional regulation, memory consolidation, cognitive function, and overall physical health. Despite its importance, many individuals remain unaware of chronic sleep deficiencies until they are diagnosed, often after years of suffering. Accurate diagnosis of sleep disorders requires reliable tools and methods, particularly in clinical settings. Electroencephalography (EEG) remains a widely used technique in sleep research because it captures brain signals that contain rich physiological information. However, EEG data are inherently high-dimensional and complex, which makes them difficult to analyze and interpret. To address this, this dissertation develops an explainable dual hierarchical feature selection and dimensionality reduction framework aimed at improving sleep stage classification. The proposed framework consists of two stages. The first stage is feature construction and selection: we apply Topological Data Analysis (TDA) to explore the intrinsic structure of the data, extract both traditional statistical features and TDA-based features for model training, and then use Recursive Feature Elimination with Cross-Validation (RFECV) to select an optimal feature subset. The second stage further reduces the dimensionality of the feature space. Four dimensionality reduction techniques are considered: Principal Component Analysis (PCA), t-distributed Stochastic Neighbor Embedding (t-SNE), Uniform Manifold Approximation and Projection (UMAP), and Kernel Principal Component Analysis (KPCA). Our results indicate that manifold learning algorithms generally outperform PCA; among them, t-SNE achieves the highest classification accuracy, 78.9%. This improvement arises because the TDA-based features can extract global structural patterns from EEG signals that traditional spectral–temporal metrics cannot capture.
Thus, this study demonstrates that a structured, theory-driven approach can enhance both the performance and interpretability of machine learning models in sleep-stage classification. It also provides a practical framework for processing complex biomedical signals, with potential implications for real-world clinical applications.
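
To illustrate what a TDA-based feature of a one-dimensional signal can look like, the following is a minimal sketch only. The abstract does not state which topological construction the dissertation uses, so the choice here (0-dimensional sublevel-set persistence of a single signal epoch, summarized by count, total persistence, maximum lifetime, and persistence entropy) is an assumption, and the function names `sublevel_persistence` and `persistence_features` are hypothetical.

```python
import math


def sublevel_persistence(signal):
    """Finite (birth, death) pairs of 0-dimensional sublevel-set persistence.

    Assumption: this is one common TDA construction for 1-D signals,
    not necessarily the one used in the dissertation.
    """
    n = len(signal)
    parent = [-1] * n   # -1 marks samples not yet added to the filtration
    birth = [0.0] * n   # birth value, meaningful at each component's root

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = []
    # Sweep a threshold upward: add samples in order of increasing value.
    for i in sorted(range(n), key=lambda k: signal[k]):
        parent[i] = i
        birth[i] = signal[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] != -1:
                a, b = find(i), find(j)
                if a == b:
                    continue
                # Elder rule: the component born later dies at this level.
                young, old = (a, b) if birth[a] > birth[b] else (b, a)
                if signal[i] > birth[young]:  # skip zero-lifetime pairs
                    pairs.append((birth[young], signal[i]))
                parent[young] = old
    # The component of the global minimum never dies and is omitted.
    return pairs


def persistence_features(pairs):
    """Summarize a persistence diagram as a small feature dictionary."""
    lifetimes = [death - b for b, death in pairs]
    total = sum(lifetimes)
    if total == 0:
        return {"count": 0, "total": 0.0, "max": 0.0, "entropy": 0.0}
    probs = [life / total for life in lifetimes]
    entropy = -sum(p * math.log(p) for p in probs)
    return {
        "count": len(lifetimes),
        "total": total,
        "max": max(lifetimes),
        "entropy": entropy,
    }


if __name__ == "__main__":
    toy_epoch = [0, 2, 1, 3, 0.5, 4]  # toy values, not real EEG data
    print(persistence_features(sublevel_persistence(toy_epoch)))
```

In a pipeline of the kind the abstract describes, summaries like these would be computed per epoch and concatenated with the traditional statistical features before feature selection; that wiring is likewise an assumption here.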

Keywords

Topological Data Analysis (TDA), Electroencephalography (EEG), Sleep Stage Classification, Recursive Feature Elimination with Cross-Validation (RFECV), Principal Component Analysis (PCA), t-distributed Stochastic Neighbor Embedding (t-SNE), Uniform Manifold Approximation and Projection (UMAP), Kernel Principal Component Analysis (KPCA)

Copyright owned by the Saudi Digital Library (SDL) © 2026