Variational data generative model for intrusion detection

López Martín, Manuel; Carro Martínez, Belén; Sánchez Esguevillas, Antonio Javier

doi:10.1007/s10115-018-1306-7

dc.contributor.author	López Martín, Manuel
dc.contributor.author	Carro Martínez, Belén
dc.contributor.author	Sánchez Esguevillas, Antonio Javier
dc.date.accessioned	2022-07-27T11:02:34Z
dc.date.available	2022-07-27T11:02:34Z
dc.date.issued	2018
dc.identifier.citation	Knowledge and Information Systems volume 60, 2019, pages 569–590	es
dc.identifier.issn	0219-1377	es
dc.identifier.uri	https://uvadoc.uva.es/handle/10324/54306
dc.description	Scientific Production	es
dc.description.abstract	A Network Intrusion Detection System detects intrusive or malicious activities, or policy violations, in a host or a network of hosts. Access to balanced and diversified training data is very important for any detection system. Intrusion data rarely have these characteristics: samples of network traffic are strongly biased toward normal traffic, and traffic associated with intrusion events is difficult to obtain. Therefore, it is important to have a method to synthesize intrusion data with a probabilistic and behavioral structure similar to that of the original data. In this work, we provide such a method. Intrusion data have continuous and categorical features, with a strongly unbalanced distribution of intrusion labels. For this reason, we generate synthetic samples conditioned on the distribution of labels. That is, from a particular set of labels, we generate training samples associated with that set of labels, replicating the probabilistic structure of the original data that come from those labels. We use a generative model based on a customized variational autoencoder, using the labels of the intrusion class as an additional input to the network. This modification provides an advantage, as we can readily generate new data using only the labels, without having to rely on training samples as canonical representatives for each label, which makes the generation process more reliable, less complex and faster. We show that the synthetic data are similar to the real data, and that the new synthesized data can be used to improve the performance scores of common machine learning classifiers.	es
dc.format.mimetype	application/pdf	es
dc.language.iso	eng	es
dc.publisher	Springer	es
dc.rights.accessRights	info:eu-repo/semantics/openAccess	es
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/	*
dc.subject.classification	Intrusion detection	es
dc.subject.classification	Detección de intrusos	es
dc.subject.classification	Redes de datos	es
dc.subject.classification	Data networks	es
dc.title	Variational data generative model for intrusion detection	es
dc.type	info:eu-repo/semantics/article	es
dc.rights.holder	© 2018 The Author(s)
dc.identifier.doi	10.1007/s10115-018-1306-7	es
dc.relation.publisherversion	https://link.springer.com/article/10.1007/s10115-018-1306-7	es
dc.identifier.publicationfirstpage	569	es
dc.identifier.publicationissue	1	es
dc.identifier.publicationlastpage	590	es
dc.identifier.publicationtitle	Knowledge and Information Systems	es
dc.identifier.publicationvolume	60	es
dc.peerreviewed	Yes	es
dc.description.project	Ministerio de Economía y Competitividad (Project TIN2014-57991-C3-2-P)	es
dc.identifier.essn	0219-3116	es
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International	*
dc.type.hasVersion	info:eu-repo/semantics/submittedVersion	es
dc.subject.unesco	3325 Telecommunications Technology	es
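
The abstract in this record describes generating synthetic samples from intrusion labels alone, with the labels supplied as an additional input to a variational autoencoder. A minimal sketch of that generation step follows. It assumes the one-hot label is concatenated with a latent vector drawn from a standard normal prior at the decoder input, which the abstract does not spell out. The decoder weights are random stand-ins for a trained model, and every name, dimension and class frequency here is hypothetical.

```python
import numpy as np

def sample_labels(label_probs, n, rng):
    """Draw n class indices from a categorical label distribution."""
    return rng.choice(len(label_probs), size=n, p=label_probs)

def one_hot(indices, num_classes):
    """Encode integer class indices as one-hot rows."""
    return np.eye(num_classes)[indices]

def init_decoder(latent_dim, num_classes, hidden_dim, feature_dim, rng):
    """Random weights standing in for a trained decoder (hypothetical)."""
    return {
        "W1": rng.normal(0.0, 0.1, (latent_dim + num_classes, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "W2": rng.normal(0.0, 0.1, (hidden_dim, feature_dim)),
        "b2": np.zeros(feature_dim),
    }

def generate(decoder, labels_onehot, latent_dim, rng):
    """Decode z ~ N(0, I), concatenated with the label vector, into features."""
    z = rng.standard_normal((labels_onehot.shape[0], latent_dim))
    decoder_input = np.concatenate([z, labels_onehot], axis=1)
    hidden = np.maximum(0.0, decoder_input @ decoder["W1"] + decoder["b1"])  # ReLU
    return hidden @ decoder["W2"] + decoder["b2"]

rng = np.random.default_rng(0)
label_probs = np.array([0.7, 0.2, 0.1])  # hypothetical class frequencies
labels = sample_labels(label_probs, n=5, rng=rng)
decoder = init_decoder(latent_dim=4, num_classes=3, hidden_dim=16,
                       feature_dim=8, rng=rng)
synthetic = generate(decoder, one_hot(labels, 3), latent_dim=4, rng=rng)
print(synthetic.shape)  # (5, 8)
```

No real training sample is needed at generation time in this sketch: only a label vector and a draw from the prior. To oversample a rare class, one would pass a label distribution weighted toward that class instead of the empirical one.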

Files in this item

Name: Variational-data-generative-mo ...
Size: 1.598 MB
Format: PDF

This item appears in the following collection(s)

DEP71 - Artículos de revista [394]

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International