Optimal 0−1 Loss Classification In Linear, Overlapping And Interpolation Settings

dc.contributor.advisorLittle, Max
dc.contributor.authorAlanazi, Reem
dc.date.accessioned2023-06-13T12:00:33Z
dc.date.available2023-06-13T12:00:33Z
dc.date.issued2022-09-07
dc.descriptionIn this thesis, optimizing the 0–1 loss directly is examined from three perspectives. First, the theoretical consequences of replacing the 0–1 loss function with surrogate functions are outlined for linearly separable data, showing that hinge loss is an exact minimizer of the 0–1 loss in this setting. Second, for overlapping classes, it is shown that surrogate functions do not provide an exact solution. We therefore compared the actual time complexity of existing 0–1 loss optimization algorithms with respect to two second-order factors: the dimensionality of the data and the amount of overlap between the classes. In addition, a new heuristic 0–1 loss optimization algorithm was introduced that optimizes the 0–1 risk exactly in a practical way, with polynomial complexity O(√D log(1/ε) N³). This algorithm, called testing all pairs of points (T-PP), was introduced together with a prioritized version, prioritizing the search by support vectors (PS-SVs), whose actual time complexity is lower than that of T-PP. T-PP and PS-SVs can terminate before the time limit, whereas the time taken by other direct 0–1 loss optimization algorithms depends on the second-order factors when the data size is fixed. Third, a new direction for ERM is discussed for a simple classification algorithm, the canonical K nearest neighbor (KNN) algorithm, which is still dominated by conventional ERM. Our new algorithm, the interpolated local KNN, adopts the modern interpolation view of ERM. It fits the training points locally using a set of local K parameters determined by a pre-defined confidence score. This mechanism identifies noise and overlap points and isolates them from the prediction of unseen points, while the isolated points retain their own decision boundary that can be used to predict them correctly.
Thus, we can achieve zero empirical risk and lower generalization risk than the canonical KNN. These contributions show that the 0–1 loss can be optimized directly in polynomial time: for linearly separable data via a surrogate loss function, and for overlapping data via our new algorithms T-PP and PS-SVs. Finally, we introduce a first interpolated version of the canonical KNN algorithm. This thesis focuses on the binary classification problem; our main future direction is to extend this work to multiclass classification.
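The thesis's exact T-PP procedure is not reproduced in this record; as a purely illustrative sketch of the pairs-of-points idea in two dimensions (all names and details below are assumptions, not the thesis's algorithm), one can enumerate candidate linear boundaries through every pair of training points and keep the one with the lowest empirical 0–1 risk:

```python
import numpy as np
from itertools import combinations

def zero_one_risk(w, b, X, y):
    # Empirical 0-1 risk of the linear rule sign(w @ x + b); labels y in {-1, +1}.
    return float(np.mean(y * (X @ w + b) <= 0))

def pairs_of_points_search(X, y, eps=1e-6):
    # Illustrative brute force (not the thesis's T-PP): the line through each
    # pair of points, nudged slightly to either side, is a candidate decision
    # boundary; both orientations of its normal vector are tried.
    best_w, best_b = np.array([1.0, 0.0]), 0.0
    best_risk = zero_one_risk(best_w, best_b, X, y)
    for i, j in combinations(range(len(X)), 2):
        d = X[j] - X[i]
        norm = np.linalg.norm(d)
        if norm == 0.0:
            continue                          # coincident points define no line
        w = np.array([-d[1], d[0]]) / norm    # unit normal to the segment
        for s in (1.0, -1.0):
            for shift in (eps, -eps):
                b = -w @ X[i] + shift
                risk = zero_one_risk(s * w, s * b, X, y)
                if risk < best_risk:
                    best_w, best_b, best_risk = s * w, s * b, risk
    return best_w, best_b, best_risk

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([-1, -1, 1, 1])
w, b, risk = pairs_of_points_search(X, y)
print(risk)  # 0.0 on this separable toy set
```

Enumerating the O(N²) pairs with an O(N) risk evaluation per candidate costs O(N³), which is one way to see where an N³ factor of the kind quoted above can arise.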
dc.description.abstractClassification problems are a major subject in machine learning, and addressing them requires solving underlying optimization problems. Optimizing the 0–1 loss function is the natural pathway to addressing the classification problem. However, because the 0–1 loss is non-convex and non-differentiable, it is mathematically intractable to optimize, and the problem is non-deterministic polynomial-time hard (NP-hard). Consequently, the 0–1 loss function has been replaced by surrogate loss functions that can be optimized more efficiently, with optimality guaranteed only with respect to those surrogates; they may not yield the same optimal solution as the 0–1 loss function. Indeed, the loss function used during empirical risk minimization (ERM) is then not the loss function used in evaluation; this mismatch leads to an approximate solution that may not be ideal for the 0–1 loss, introducing an additional source of error.
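The mismatch described in the abstract can be made concrete with a small numerical sketch (illustrative only; the variable names are ours): the convex hinge loss upper-bounds the 0–1 loss, so a classifier can attain zero 0–1 loss while still incurring positive hinge loss on small-margin points.

```python
import numpy as np

def zero_one_loss(w, b, X, y):
    # Fraction of points misclassified by sign(w @ x + b); labels y in {-1, +1}.
    return float(np.mean(y * (X @ w + b) <= 0))

def hinge_loss(w, b, X, y):
    # Convex surrogate: mean of max(0, 1 - y * (w @ x + b)).
    return float(np.mean(np.maximum(0.0, 1.0 - y * (X @ w + b))))

X = np.array([[0.2, 0.1], [1.0, 2.0], [-0.3, -0.1], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w, b = np.array([1.0, 1.0]), 0.0
print(zero_one_loss(w, b, X, y))  # 0.0: every point is on the correct side
print(hinge_loss(w, b, X, y))     # 0.325: small-margin points are still penalized
```

Minimizing the hinge loss would therefore push this boundary further from the small-margin points even though, as measured by the 0–1 loss, nothing is left to improve: an instance of the evaluation/optimization mismatch.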
dc.format.extent163
dc.identifier.urihttps://hdl.handle.net/20.500.14154/68360
dc.language.isoen
dc.publisherUniversity of Birmingham
dc.subjectClassification
dc.subject0–1 loss function
dc.subjectempirical risk minimization (ERM)
dc.subjectNP-hard
dc.subjectsurrogate loss function
dc.titleOptimal 0−1 Loss Classification In Linear, Overlapping And Interpolation Settings
dc.typeThesis
sdl.degree.departmentComputer Science
sdl.degree.disciplineMachine learning
sdl.degree.grantorUniversity of Birmingham
sdl.degree.nameDoctor of Philosophy

Copyright owned by the Saudi Digital Library (SDL) © 2024