Real-Time IoT Data Cleaning and Anomaly Detection Using Context-Aware Frameworks and Large Language Models

dc.contributor.advisorEric, Pardede
dc.contributor.advisorSarath, Tomy
dc.contributor.authorAlotaibi, Obaid Haylan B
dc.date.accessioned2026-03-29T05:46:20Z
dc.date.issued2026
dc.description.abstractThe Internet of Things (IoT) has delivered significant benefits to various domains such as healthcare, business, and industry by generating vast amounts of data in real time. However, IoT-generated data often suffers from low quality due to issues that can significantly affect data analysis results and lead to inaccurate decision making. Enhancing the quality of real-time data streams is difficult because the characteristics of IoT data complicate anomaly detection, which is crucial for informed decisions. Traditional IoT data cleaning techniques primarily rely on batch processing methods, which introduce latency and fail to effectively handle real-time streaming IoT data. Many studies have proposed different techniques to overcome these challenges, such as cleaning data in real time; however, no comprehensive data cleaning framework has been proposed. This thesis proposes a comprehensive streaming data cleaning framework aimed at improving the quality of real-time data streams. Central to this framework is a real-time anomaly detection model for structured IoT data streams. The model detects multiple types of anomalies and classifies them as either significant events or errors. Additionally, the proposed method incorporates context-awareness to further enhance detection reliability. Building upon this detection capability, the framework includes an automated repair system that addresses detected anomalies via multiple repair techniques: delete, replace, or keep, using statistical measurements and machine learning based on anomaly classification. To enhance user decision-making, the framework integrates large language models for data stream cleaning, providing context-aware recommendations and sensitivity assessments. Large language models operate locally, assisting users to dynamically refine contexts and sensitivity levels based on real-time interaction streams across diverse applications.
Overall, this thesis highlights the proposed framework’s effectiveness in guiding users by providing a clear picture, thereby enhancing decision-making accuracy in real-time environments and enabling confident, real-time responses to genuine anomalies.
dc.format.extent158
dc.identifier.urihttps://hdl.handle.net/20.500.14154/78505
dc.language.isoen
dc.publisherSaudi Digital Library
dc.subjectreal-time
dc.subjectdata streams
dc.subjectanomaly detection
dc.subjectcontext-awareness
dc.subjectrule based
dc.subjectdata cleaning
dc.subjectdata repairing
dc.subjectdata anomaly
dc.subjectmachine learning
dc.subjecthealthcare
dc.subjectNear real-time
dc.subjectContext awareness recommendation
dc.subjectData sensitivity level
dc.subjectLarge language model (LLM)
dc.titleReal-Time IoT Data Cleaning and Anomaly Detection Using Context-Aware Frameworks and Large Language Models
dc.typeThesis
sdl.degree.departmentSchool of Computing, Engineering and Mathematical Sciences
sdl.degree.disciplineComputer Science
sdl.degree.grantorLa Trobe University
sdl.degree.nameDoctor of Philosophy
sdl.thesis.sourceSACM - Australia

Files

Original bundle

Name: SACM-Dissertation.pdf
Size: 2.36 MB
Format: Adobe Portable Document Format

Copyright owned by the Saudi Digital Library (SDL) © 2026