The Detection of Advanced Persistent Threats in Software Defined Networks using Machine Learning
Abstract
A Software-Defined Network (SDN) is a new type of network architecture that separates
the control and data planes. The centralised controller can programmatically
manage the underlying network devices. Although SDN provides many advantages,
it raises new security challenges. A stealth attack is a particularly dangerous
kind of attack adopted by adversaries who aim to avoid detection, typically by
generating less traffic during their activities than would arouse suspicion.
Advanced Persistent Threats (APTs) are sophisticated attacks that implement stealth
behaviour during their campaigns. They present major challenges to the security of
systems. Little research has been carried out on detecting APTs in the context of
SDNs. This is the focus of this thesis.
Initially, an enhancement of scanning capabilities in SDN is introduced and an
open-source scanner tool is adapted to operate more stealthily (allowing extended
periods of time between the operations it carries out). The adapted tool has been made
publicly and freely available to researchers. In this thesis, it is used to generate datasets (using
Mininet) to train and evaluate detection models. Existing datasets do not adequately
represent the presence of APTs, or do not do so in the context of SDNs. Thus, generating
our own datasets was essential for the work in this thesis. However, we still
make use of existing datasets in our evaluations, e.g. to show our approaches may
still work effectively against non-APT threats. Of particular interest in this thesis is
the use of stealth techniques as part of ‘flow rule reconstruction’ attacks, where attackers
seek to infer aspects of packet handling policies that apply at targeted nodes.
Inferring such information facilitates further attacks.
The most common Machine Learning (ML) techniques for signature-based detection
(such as Decision Tree, K-Nearest Neighbour, Random Forest, XGBoost and
Support Vector Machine) and for anomaly-based detection (such as Local Outlier
Factor, Isolation Forest and One-class SVM) are evaluated. Based on this evaluation, XGBoost
is proposed as a signature-based model to detect known stealth attacks in SDN and
is shown to be highly effective.
Subsequently, a hybrid detection model is constructed by combining XGBoost
(as a signature-based detection module) and a One-class SVM (as an anomaly-based
detection module) leveraging the complementary aspects of these techniques to allow
known and unknown attacks to be detected. This is the first demonstration of
the effectiveness of a hybrid approach for APT detection in SDNs.
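
The abstract does not state how the two modules are combined. One plausible arrangement, shown here purely as a sketch, is a cascade: the signature-based module labels each flow first, and any flow it passes as benign is then checked by the anomaly-based module. The function name `hybrid_verdict`, the two toy stand-in models and the feature layout (packets per minute, distinct ports probed) are all hypothetical; in practice the stand-ins would be replaced by a trained XGBoost classifier and a trained One-class SVM.

```python
from typing import Callable, Sequence

Flow = Sequence[float]  # hypothetical layout: (packets_per_minute, distinct_ports)


def hybrid_verdict(
    flow: Flow,
    signature_model: Callable[[Flow], str],
    anomaly_model: Callable[[Flow], bool],
) -> str:
    """Cascade: known-attack label first, anomaly check on the remainder."""
    label = signature_model(flow)
    if label != "benign":
        return label             # matched a known attack class
    if anomaly_model(flow):
        return "unknown-attack"  # not a known class, but deviates from normal
    return "benign"


# Toy stand-ins (assumptions, not the thesis models).
def toy_signature_model(flow: Flow) -> str:
    rate, ports = flow
    # A slow scan: very little traffic spread over many ports.
    return "slow-scan" if rate < 2.0 and ports > 50 else "benign"


def toy_anomaly_model(flow: Flow) -> bool:
    rate, ports = flow
    # Flags anything outside a crude "normal" region.
    return ports > 20 or rate > 1000.0
```

The cascade reflects the complementary roles described above: the signature stage gives a specific label for attacks it was trained on, while the anomaly stage catches behaviour that matches no known class.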
As systems evolve, the effectiveness of an ML-based classifier degrades because
the distribution of the data it needs to handle increasingly deviates from that over
which it was trained. This is known as concept drift. One cause of such drift is attackers
changing their behaviour. A hybrid system (signature-based detection using
an Adaptive Random Forest and anomaly-based detection using an Adaptive One-
Class SVM) is presented that uses concept drift detection to instigate appropriate
run-time model retraining. The approach can detect known and unknown attacks
and adapt itself incrementally when concept drift happens. This is the first time
concept drift has been considered in the context of intrusion detection for SDNs.
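
To illustrate how drift detection can trigger retraining, the sketch below monitors a stream of per-sample classification errors (1 for a misclassification, 0 otherwise) with the Page-Hinkley test, a standard change detector. The abstract does not name the drift detector used in the thesis, so this choice, the class and function names, and the parameter values `delta` and `threshold` are assumptions made for illustration only.

```python
from typing import Iterable, Optional


class PageHinkley:
    """Page-Hinkley test for an upward shift in the mean of a stream."""

    def __init__(self, delta: float = 0.005, threshold: float = 5.0) -> None:
        self.delta = delta          # tolerated slack around the mean (assumed value)
        self.threshold = threshold  # alarm level (assumed value)
        self.reset()

    def reset(self) -> None:
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0      # cumulative deviation from the running mean
        self.min_cum = 0.0  # smallest cumulative deviation seen so far

    def update(self, x: float) -> bool:
        """Consume one observation; return True if drift is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        return self.cum - self.min_cum > self.threshold


def first_drift_index(errors: Iterable[float], detector: PageHinkley) -> Optional[int]:
    """Return the index at which drift is first signalled, or None."""
    for i, err in enumerate(errors):
        if detector.update(err):
            return i
    return None
```

In a run-time loop, a signalled drift would be the point at which the adaptive models are retrained or updated on recent data and the detector is reset.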
The validity of our intrusion detection system (IDS) schemes is assessed using various datasets with different
attacks and network sizes. ML-pipeline techniques commonly ignored in the IDS literature
are employed as part of the work: hyperparameter tuning is used to help the
model generalise, imbalanced datasets are resampled to prevent bias in predictions,
and feature reduction is employed to focus modelling on a small number of highly
informative features. Our proposed models are compared with available benchmark
results in the field and also with competing approaches as part of our comprehensive
empirical evaluation. Performance metrics such as Accuracy, Recall, Precision, and
F1-score are used in the evaluation. These steps collectively ensure that our schemes
are robust, accurate, and capable of generalising to new attack scenarios.
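
For reference, the four metrics named above can be computed from binary labels as in the following minimal sketch, where 1 denotes an attack and 0 denotes benign traffic. The function name `binary_metrics` and the convention of returning 0.0 when a denominator is zero are assumptions for illustration, not details taken from the thesis.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = attack)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Zero denominators fall back to 0.0 (a convention assumed here).
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

On imbalanced data, accuracy alone can mislead: a model that labels every flow as benign scores high accuracy when attacks are rare, yet its recall is zero. This is one reason resampling and the full set of metrics matter when stealthy attacks make up a small fraction of traffic.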
Overall, we show how machine learning can effectively detect APT stealth attacks
under constant contextual conditions and under change. We address the detection
of both known and unseen attacks. This is the first thesis to comprehensively
address the effective detection of APTs in an SDN context and demonstrates that
machine learning has a critical part to play in addressing the challenges APTs pose
to SDNs.
Keywords
Intrusion Detection System, Advanced Persistent Threats, Software Defined Network, Machine Learning, Concept Drift