On Dimension Reduction and Variable Selection for High Dimensional Data

Date

2024-06-06

Publisher

The University of Alabama

Abstract

Due to recent advances in technology and computing power, collecting and storing data with thousands of features per observation has become commonplace, resulting in what is known as high-dimensional data. One major consequence of high dimensionality is the phenomenon known as the 'curse of dimensionality.' Sufficient dimension reduction (SDR) techniques and variable selection have become crucial tools in parametric and non-parametric modeling in recent years. Extracting useful information from high-dimensional data is a challenging task; SDR and variable selection methods aim to reduce the complexity of the data to facilitate decision-making tools such as visualization, statistical modeling, and inference. In this dissertation, we propose two methods, one for variable selection and one for sufficient dimension reduction.

First, we develop a shrinkage estimator for varying coefficient models for panel data with separable and non-separable fixed effects. The Kernel Least Absolute Shrinkage and Selection Operator (KLASSO) [53] has been modified so that our proposed method selects the relevant features together with their gradients while simultaneously identifying the correct non-separable individual fixed effects alongside the separable fixed effects. The proposed estimation method demonstrates a high accuracy rate in simulation studies.

Second, motivated by the work of [54] and the novel work of [27], which paved the way for new SDR methods, we embed the elastic net penalty in the Principal Support Vector Machine (PSVM) for dimension reduction. The proposed method selects and shrinks the coefficients along the principal axes and then finds a projection onto a lower-dimensional subspace while retaining the useful information about the central subspace S_{y|x} of the regression model. Finite-sample studies show significant improvements over the PSVM.
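The PSVM idea described above can be illustrated with a minimal sketch: dichotomize the response at several slice points, fit a penalized linear SVM to each dichotomized problem, and recover the reduction directions from the span of the fitted normal vectors. The sketch below uses scikit-learn's `SGDClassifier` with hinge loss and an elastic net penalty as a stand-in solver; the function name `psvm_directions` and all tuning values are illustrative assumptions, not the dissertation's implementation.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

def psvm_directions(X, y, n_slices=5, n_directions=2,
                    alpha=1e-3, l1_ratio=0.5, random_state=0):
    """Sketch of elastic-net PSVM: estimate central-subspace directions
    by fitting elastic-net linear SVMs to sliced responses and taking
    the leading singular vectors of the stacked normal vectors."""
    Xc = X - X.mean(axis=0)                      # center predictors
    cuts = np.quantile(y, np.linspace(0, 1, n_slices + 1)[1:-1])
    normals = []
    for c in cuts:
        labels = (y > c).astype(int)             # dichotomize response at each cut
        clf = SGDClassifier(loss="hinge", penalty="elasticnet",
                            alpha=alpha, l1_ratio=l1_ratio,
                            max_iter=2000, random_state=random_state)
        clf.fit(Xc, labels)
        normals.append(clf.coef_.ravel())        # SVM normal vector
    M = np.array(normals)                        # n_cuts x p
    # Leading right singular vectors of M span the estimated subspace.
    _, _, Vt = np.linalg.svd(M, full_matrices=False)
    return Vt[:n_directions].T                   # p x d basis matrix

# Toy example: y depends on X only through the first coordinate,
# so the recovered direction should load on feature 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
y = X[:, 0] + 0.1 * rng.normal(size=500)
B = psvm_directions(X, y, n_directions=1)
print(B.shape)  # (10, 1)
```

The elastic net penalty (`alpha`, `l1_ratio`) shrinks irrelevant coefficients toward zero in each sliced SVM, which is what gives the method its variable-selection behavior along the principal axes.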

Keywords

Sufficient Dimension Reduction (SDR), Variable Selection (VS), Support Vector Machine (SVM)

Copyright owned by the Saudi Digital Library (SDL) © 2025