RT info:eu-repo/semantics/article
T1 Variational data generative model for intrusion detection
A1 López Martín, Manuel
A1 Carro Martínez, Belén
A1 Sánchez Esguevillas, Antonio Javier
K1 Intrusion detection
K1 Data networks
K1 3325 Telecommunications Technology
AB A network intrusion detection system detects intrusive or malicious activities, or policy violations, in a host or a network of hosts. Access to balanced and diverse training data is very important for any detection system. Intrusion data rarely have these characteristics: samples of network traffic are strongly biased toward normal traffic, and traffic associated with intrusion events is difficult to obtain. Therefore, it is important to have a method to synthesize intrusion data with a probabilistic and behavioral structure similar to that of the original data. In this work, we provide such a method. Intrusion data have continuous and categorical features, with a strongly unbalanced distribution of intrusion labels. For this reason, we generate synthetic samples conditioned on the distribution of labels. That is, from a particular set of labels, we generate training samples associated with that set of labels, replicating the probabilistic structure of the original data that comes from those labels. We use a generative model based on a customized variational autoencoder that takes the intrusion class labels as an additional input to the network. This modification provides an advantage: we can readily generate new data using only the labels, without having to rely on training samples as canonical representatives for each label, which makes the generation process more reliable, less complex and faster. We show that the synthetic data are similar to the real data, and that the new synthesized data can be used to improve the performance scores of common machine learning classifiers.
PB Springer
SN 0219-1377
YR 2018
FD 2018
LK https://uvadoc.uva.es/handle/10324/54306
UL https://uvadoc.uva.es/handle/10324/54306
LA eng
NO Knowledge and Information Systems volume 60, 2019, pages 569–590
NO Scientific Production
DS UVaDOC
RD 28-Dec-2024