A Generative Adversarial Network-Based Approach for Imbalanced Network Traffic Classification Datasets

Date

2026

Publisher

Saudi Digital Library

Abstract

Class imbalance is common in datasets and remains a critical challenge in network traffic classification (NTC), particularly as the number of applications and protocols continues to expand. It biases machine learning models toward majority classes at the expense of critical minority classes, reducing overall classification accuracy. Traditional approaches such as oversampling and undersampling struggle to generate diverse data while preserving essential traffic patterns. Consequently, there is a pressing need for solutions that improve class representation without compromising the integrity of the original dataset. To address this issue, this study introduces Equal-GAN, a novel Generative Adversarial Network (GAN)-based approach designed to mitigate class imbalance in the Unicauca NTC Dataset. The study focuses on identifying the limitations of current imbalance-handling methods in NTC, developing the Equal-GAN algorithm to generate synthetic data, and testing its effectiveness with a classifier. The methodology involves pre-processing the dataset, designing a tailored GAN architecture, and employing synthetic data generation to balance class distribution while retaining essential traffic characteristics. The effectiveness of the Equal-GAN approach is assessed using statistical metrics and classification performance comparisons. Experimental results demonstrate that Equal-GAN significantly improves class representation and classification accuracy: with a Random Forest classifier, the Equal-GAN approach attained an F1-score of 0.99 and an accuracy of 99.55%, outperforming traditional methods. These findings underscore the potential of GAN-based solutions to enhance NTC, paving the way for more reliable and fair network traffic analysis.
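
The balancing step described in the abstract can be illustrated with a minimal sketch of the bookkeeping involved: counting how many synthetic samples each class would need to reach the size of the largest class. This is not the Equal-GAN algorithm itself, whose details the abstract does not give; the function name `synthetic_counts_needed` and the example labels are hypothetical, and the choice of the majority class size as the target is an assumption.

```python
from collections import Counter


def synthetic_counts_needed(labels):
    """Return, per class, how many synthetic samples would raise it
    to the size of the largest class (assumed balancing target)."""
    counts = Counter(labels)
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items()}


# Hypothetical application labels, not taken from the Unicauca dataset.
labels = ["HTTP"] * 6 + ["DNS"] * 3 + ["SSH"] * 1
needed = synthetic_counts_needed(labels)
print(needed)
```

In a pipeline of the kind the abstract outlines, a generator would presumably be asked for this many samples per minority class before the balanced data is passed to the classifier.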

Keywords

Class Imbalance, Network Traffic Classification, Generative Adversarial Network, Synthetic Data Generation

Copyright owned by the Saudi Digital Library (SDL) © 2026