AN UNSUPERVISED FRAMEWORK FOR ANALYSING HETEROGENEOUS LOG-FILES TO IDENTIFY MULTI-STAGE ATTACKS

AHMED ABDULRAHMAN ALGHAMDI

dc.contributor.advisor	Giles Reger
dc.contributor.author	AHMED ABDULRAHMAN ALGHAMDI
dc.date	2021
dc.date.accessioned	2022-05-26T16:18:02Z
dc.date.available	2022-05-26T16:18:02Z
dc.degree.department	Cyber Security
dc.degree.grantor	Computer Science
dc.description.abstract	Cyberattacks have become increasingly advanced and prevalent on a global scale. One of the most detrimental types of cyberattack is the multi-stage attack, often referred to as an Advanced Persistent Threat (APT), which combines espionage and sabotage, often over long time periods. Detecting these attacks is extremely challenging because of their deceptive approaches: the sequential events of an attack might appear benign when performed individually or from different sources. Furthermore, existing tools often restrict their attention to single sources or rely on known patterns of behaviour. Thus, there is a need for approaches that employ empirical behaviour analysis to overcome the limitations of existing tools and enhance existing multi-layered defence strategies. This research develops a novel framework to identify patterns and correlations between malicious behaviours of multi-stage attacks such as APTs. The framework applies unsupervised learning to heterogeneous logs and is therefore called the Unsupervised Analysis for Heterogeneous Log-files (UAHL) framework. It investigates multi-origin heterogeneous log files, using machine learning, in three main phases to extract the inner-behaviours of log files and construct patterns of those behaviours across the analysed files. Finally, an Action Centre is developed to present the sequential behaviours of attacks, utilising a custom visualisation method. In addition, the Action Centre allows administrators to browse and filter attack profiles and shows similarity rates between those profiles in terms of their contained behaviours. The framework utilises a dynamic method to eliminate the need for manually pre-defining the clustering parameters, a task that requires substantial domain knowledge and significantly affects results.
Moreover, to evaluate the framework, we have produced a publicly available labelled version of the SotM43 dataset and used a second dataset for the evaluation. Our results demonstrate that the framework can (i) efficiently cluster inner-behaviours of security-related logs with high accuracy, (ii) extract patterns of malicious behaviour, and correlations between those patterns, from real-world data, and (iii) present results in a meaningful format while effectively measuring similarities between the attack profiles.
dc.identifier.uri	https://drepo.sdl.edu.sa/handle/20.500.14154/29341
dc.language.iso	en
dc.title	AN UNSUPERVISED FRAMEWORK FOR ANALYSING HETEROGENEOUS LOG-FILES TO IDENTIFY MULTI-STAGE ATTACKS
sdl.thesis.level	Doctoral
sdl.thesis.source	SACM - United Kingdom
