Cleaning Big Data Streams: A Systematic Literature Review
Date
2023
Publisher
MDPI
Abstract
In today’s big data era, cleaning big data streams is a challenging task because the data arrive
in many different formats and in massive volumes.
Many studies have proposed techniques to overcome these challenges, such as cleaning
big data in real time. This systematic literature review presents recently developed techniques that
have been used in the cleaning process and to address each data cleaning issue. Following the PRISMA
framework, we search four databases, namely IEEE Xplore, ACM Digital Library, Scopus, and
ScienceDirect, to select relevant studies. From the selected studies, we identify the techniques that
have been used to clean big data streams and the evaluation methods that have been used to
examine their efficiency. We also define the cleaning issues that may appear during the cleaning
process, namely missing values, duplicated data, outliers, and irrelevant data. Based on our study,
we identify future directions for cleaning big data streams.
Citation
Alotaibi, O., Pardede, E., & Tomy, S. (2023). Cleaning big data streams: A systematic literature review. Technologies, 11(4), 101.
