Optimal 0−1 Loss Classification In Linear, Overlapping And Interpolation Settings

dc.contributor.advisorLittle, Max
dc.contributor.authorAlanazi, Reem
dc.date.accessioned2023-06-13T12:00:33Z
dc.date.available2023-06-13T12:00:33Z
dc.date.issued2022-09-07
dc.descriptionIn this thesis, optimizing the 0–1 loss directly is examined from three perspectives. First, the theoretical consequences of replacing the 0–1 loss function with surrogate functions are outlined for linearly separable data, showing that hinge loss is an exact minimizer of the 0–1 loss in this setting. Second, for overlapping classes, it is shown that surrogate functions do not provide an exact solution. We therefore compared the actual time complexity of existing 0–1 loss optimization algorithms with respect to two second-order factors: the dimensionality of the data and the amount of overlap between the classes. In addition, a new heuristic 0–1 loss optimization algorithm was introduced that optimizes the 0–1 risk exactly in a practical way, with polynomial complexity O(√D log(1/ε) N³). This algorithm, called testing all pairs of points (T-PP), was introduced together with a prioritized version, prioritizing the search by support vectors (PS-SVs), whose actual time complexity is lower than that of T-PP. T-PP and PS-SVs can terminate before the time limit, whereas the time taken by other direct 0–1 loss optimization algorithms depends on the second-order factors when the data size is fixed. Third, a new direction for ERM is discussed for a simple classification algorithm, the canonical K nearest neighbor (KNN) algorithm, which is still dominated by conventional ERM. Our new algorithm, the interpolated local KNN, adopts the modern interpolation view of ERM. It fits the training points locally using a set of local K parameters determined by a pre-defined confidence score. This mechanism identifies noise and overlap points and isolates them from the prediction of unseen points, while the isolated points retain their own decision boundary that can be used to predict them correctly.
Thus, we can achieve zero empirical risk and lower generalization risk than the canonical KNN. These contributions show that the 0–1 loss can be optimized directly in polynomial time: for linearly separable data via a surrogate loss function, and for overlapping data via our new algorithms T-PP and PS-SVs. Finally, we introduce a first interpolated version of the canonical KNN algorithm. This thesis focuses on the binary classification problem; our main future direction is to extend this work to multiclass classification.
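The thesis's exact T-PP procedure is not reproduced in this record; as a purely illustrative sketch of the pairs-of-points idea in two dimensions (all names and details below are assumptions, not the thesis's algorithm), one can enumerate candidate linear boundaries through every pair of training points and keep the one with the lowest empirical 0–1 risk:

```python
import numpy as np
from itertools import combinations

def zero_one_risk(w, b, X, y):
    # Empirical 0-1 risk of the linear rule sign(w @ x + b); labels y in {-1, +1}.
    return float(np.mean(y * (X @ w + b) <= 0))

def pairs_of_points_search(X, y, eps=1e-6):
    # Illustrative brute force (not the thesis's T-PP): the line through each
    # pair of points, nudged slightly to either side, is a candidate decision
    # boundary; both orientations of its normal vector are tried.
    best_w, best_b = np.array([1.0, 0.0]), 0.0
    best_risk = zero_one_risk(best_w, best_b, X, y)
    for i, j in combinations(range(len(X)), 2):
        d = X[j] - X[i]
        norm = np.linalg.norm(d)
        if norm == 0.0:
            continue                          # coincident points define no line
        w = np.array([-d[1], d[0]]) / norm    # unit normal to the segment
        for s in (1.0, -1.0):
            for shift in (eps, -eps):
                b = -w @ X[i] + shift
                risk = zero_one_risk(s * w, s * b, X, y)
                if risk < best_risk:
                    best_w, best_b, best_risk = s * w, s * b, risk
    return best_w, best_b, best_risk

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([-1, -1, 1, 1])
w, b, risk = pairs_of_points_search(X, y)
print(risk)  # 0.0 on this separable toy set
```

Enumerating the O(N²) pairs with an O(N) risk evaluation per candidate costs O(N³), which is one way to see where an N³ factor of the kind quoted above can arise.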
dc.description.abstractClassification problems are a major subject in machine learning, and addressing them requires solving underlying optimization problems. Optimizing the 0–1 loss function is the natural pathway to addressing the classification problem. However, because the 0–1 loss is non-convex and non-differentiable, it is mathematically intractable to optimize, and the problem is non-deterministic polynomial-time hard (NP-hard). Consequently, the 0–1 loss function has been replaced by surrogate loss functions that can be optimized more efficiently, with optimality guaranteed only with respect to those surrogates; they may not yield the same optimal solution as the 0–1 loss function. Indeed, the loss function used during empirical risk minimization (ERM) is then not the loss function used in evaluation; this mismatch leads to an approximate solution that may not be ideal for the 0–1 loss, introducing an additional source of error.
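The mismatch described in the abstract can be made concrete with a small numerical sketch (illustrative only; the variable names are ours): the convex hinge loss upper-bounds the 0–1 loss, so a classifier can attain zero 0–1 loss while still incurring positive hinge loss on small-margin points.

```python
import numpy as np

def zero_one_loss(w, b, X, y):
    # Fraction of points misclassified by sign(w @ x + b); labels y in {-1, +1}.
    return float(np.mean(y * (X @ w + b) <= 0))

def hinge_loss(w, b, X, y):
    # Convex surrogate: mean of max(0, 1 - y * (w @ x + b)).
    return float(np.mean(np.maximum(0.0, 1.0 - y * (X @ w + b))))

X = np.array([[0.2, 0.1], [1.0, 2.0], [-0.3, -0.1], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w, b = np.array([1.0, 1.0]), 0.0
print(zero_one_loss(w, b, X, y))  # 0.0: every point is on the correct side
print(hinge_loss(w, b, X, y))     # 0.325: small-margin points are still penalized
```

Minimizing the hinge loss would therefore push this boundary further from the small-margin points even though, as measured by the 0–1 loss, nothing is left to improve: an instance of the evaluation/optimization mismatch.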
dc.format.extent163
dc.identifier.urihttps://hdl.handle.net/20.500.14154/68360
dc.language.isoen
dc.publisherUniversity of Birmingham
dc.subjectClassification
dc.subject0–1 loss function
dc.subjectempirical risk minimization (ERM)
dc.subjectNP-hard
dc.subjectsurrogate loss function
dc.titleOptimal 0−1 Loss Classification In Linear, Overlapping And Interpolation Settings
dc.typeThesis
sdl.degree.departmentComputer Science
sdl.degree.disciplineMachine learning
sdl.degree.grantorUniversity of Birmingham
sdl.degree.nameDoctor of Philosophy

Copyright owned by the Saudi Digital Library (SDL) © 2024