A STUDY ON MANAGING AND EXPLAINING CONCEPT DRIFT

Abstract
Dynamic environments make it challenging to build machine learning (ML) models because the online streaming data they produce is vast and changes rapidly. Models need to maintain performance as the underlying data distributions change over time. With many ML techniques available, the question is which of them can produce the best models fastest for an ever-changing flux of data. The purpose of this investigation is to study and evaluate ML techniques that can automatically detect concept drift in online streaming data. The investigation evaluates these techniques in terms of training speed and accuracy in the presence of concept drift. It also provides a method to detect and explain the ability of a classifier to deal with the different types of drift that occur in a data stream. The research validates the training speed and testing accuracy results on six datasets. The techniques evaluated include the Extreme Learning Machine (ELM) algorithm, online ELM-based algorithms, deep learning algorithms such as Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN), and hybrid models. The hybrid models were faster than the original base models at similar accuracy rates. The research also demonstrates, using four datasets, a method to measure the ability of learning models to withstand different types of concept drift in online data streams. The key contributions of this research are (i) a revised hybrid model that uses ELM and RNN for classification, (ii) a comparative analysis of algorithms for dealing with three types of concept drift, and (iii) a method to visualize and measure the ability of a model to deal with concept drift.