Please use this identifier to cite or link to this item: https://uvadoc.uva.es/handle/10324/54159
Title
Exploratory study on class imbalance and solutions for network traffic classification
Author
Document Year
2019
Publisher
Elsevier
Description
Scientific Output
Source Document
Neurocomputing, 2019, vol. 343, p. 100-119
Abstract
Network Traffic Classification is a fundamental component of network management, and the fast-paced advances in Machine Learning have motivated the application of learning techniques to identify network traffic. The intrinsic features of Internet networks lead to imbalanced class distributions when datasets are assembled, a phenomenon known as Class Imbalance that is attracting increasing attention in many research fields. Despite the performance losses caused by Class Imbalance, this issue has not been thoroughly studied in Network Traffic Classification, and some previous works are limited to a few solutions and/or adopted misleading methodological approaches. In this article, we deal with Class Imbalance in Network Traffic Classification, studying the presence of this phenomenon and analyzing a wide range of solutions in two different Internet environments: a lab network and a high-speed backbone. Namely, we experimented with 21 data-level algorithms, six ensemble methods and one cost-level approach. Throughout the experiments, we applied the most recent methodological practices for imbalanced problems, such as the DOB-SCV validation approach and the choice of performance metrics. Last but not least, the strategies used to tune parameters and our algorithm implementations that adapt binary methods to multiclass problems are presented and shared with the research community, including two ensemble techniques used for the first time in Machine Learning, to the best of our knowledge. Our experimental results reveal that some techniques mitigated Class Imbalance with interesting benefits for traffic classification models. More specifically, some algorithms reached increases greater than 8% in overall accuracy and greater than 4% in AUC-ROC for the most challenging network scenario.
UNESCO Subjects
33 Technological Sciences
3325 Telecommunications Technology
Keywords
Machine learning
Network management
Class Imbalance
Network traffic classification
ISSN
0925-2312
Peer Reviewed
Yes
Sponsor
Ministry of Economy and Competitiveness and the Regional Development Fund (FEDER), within the project "Distributed intelligence for the control and adaptation of dynamic software-defined networks" (ref: TIN2014-57991-C3-2-P)
Language
eng
Version Type
info:eu-repo/semantics/submittedVersion
Rights
openAccess
Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International