dc.contributor.author: López Martín, Manuel
dc.contributor.author: Carro Martínez, Belén
dc.contributor.author: Sánchez Esguevillas, Antonio Javier
dc.date.accessioned: 2022-07-27T11:02:34Z
dc.date.available: 2022-07-27T11:02:34Z
dc.date.issued: 2018
dc.identifier.citation: Knowledge and Information Systems volume 60, 2019, pages 569–590 [es]
dc.identifier.issn: 0219-1377 [es]
dc.identifier.uri: https://uvadoc.uva.es/handle/10324/54306
dc.description: Scientific Production [es]
dc.description.abstract: A Network Intrusion Detection System detects intrusive or malicious activities and policy violations on a host or a network of hosts. Access to balanced and diversified training data is very important for any detection system. Intrusion data rarely have these characteristics: samples of network traffic are strongly biased toward normal traffic, and traffic associated with intrusion events is difficult to obtain. It is therefore important to have a method to synthesize intrusion data with a probabilistic and behavioral structure similar to that of the original data. In this work, we provide such a method. Intrusion data have continuous and categorical features, with a strongly unbalanced distribution of intrusion labels. For this reason, we generate synthetic samples conditioned on the distribution of labels. That is, from a particular set of labels, we generate training samples associated with that set of labels, replicating the probabilistic structure of the original data that come from those labels. We use a generative model based on a customized variational autoencoder, using the labels of the intrusion class as an additional input to the network. This modification provides an advantage, as we can readily generate new data using only the labels, without having to rely on training samples as canonical representatives for each label, which makes the generation process more reliable, less complex and faster. We show that the synthetic data are similar to the real data, and that the new synthesized data can be used to improve the performance scores of common machine learning classifiers. [es]
dc.format.mimetype: application/pdf [es]
dc.language.iso: eng [es]
dc.publisher: Springer [es]
dc.rights.accessRights: info:eu-repo/semantics/openAccess [es]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ [*]
dc.subject.classification: Intrusion detection [es]
dc.subject.classification: Detección de intrusos [es]
dc.subject.classification: Redes de datos [es]
dc.subject.classification: Data networks [es]
dc.title: Variational data generative model for intrusion detection [es]
dc.type: info:eu-repo/semantics/article [es]
dc.rights.holder: © 2018 The Author(s)
dc.identifier.doi: 10.1007/s10115-018-1306-7 [es]
dc.relation.publisherversion: https://link.springer.com/article/10.1007/s10115-018-1306-7 [es]
dc.identifier.publicationfirstpage: 569 [es]
dc.identifier.publicationissue: 1 [es]
dc.identifier.publicationlastpage: 590 [es]
dc.identifier.publicationtitle: Knowledge and Information Systems [es]
dc.identifier.publicationvolume: 60 [es]
dc.peerreviewed: Yes [es]
dc.description.project: Ministerio de Economía y Competitividad (Project TIN2014-57991-C3-2-P) [es]
dc.identifier.essn: 0219-3116 [es]
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International [*]
dc.type.hasVersion: info:eu-repo/semantics/submittedVersion [es]
dc.subject.unesco: 3325 Telecommunications Technology [es]
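
The abstract describes generating synthetic samples from intrusion labels alone, with the label supplied as an additional input to a variational autoencoder. The sketch below illustrates only that generation step. It rests on several assumptions not stated in this record: the label is one-hot encoded and concatenated with a latent sample drawn from a standard normal prior, the decoder is a single hidden layer with a sigmoid output, and all names and sizes (init_decoder, generate, latent_dim, hidden_dim) are hypothetical. The weights here are random placeholders rather than trained parameters, so this is not the authors' model.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Encode integer class labels as one-hot rows."""
    out = np.zeros((len(labels), num_classes))
    out[np.arange(len(labels)), labels] = 1.0
    return out

def init_decoder(rng, latent_dim, num_classes, hidden_dim, num_features):
    """Random placeholder weights; a real model would learn these in training."""
    return {
        "W1": rng.normal(0.0, 0.1, (latent_dim + num_classes, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "W2": rng.normal(0.0, 0.1, (hidden_dim, num_features)),
        "b2": np.zeros(num_features),
    }

def generate(params, labels, num_classes, latent_dim, rng):
    """Generate one synthetic feature vector per requested label."""
    # Latent sample from the assumed standard normal prior.
    z = rng.standard_normal((len(labels), latent_dim))
    # Condition the decoder by appending the one-hot label to z.
    x = np.concatenate([z, one_hot(labels, num_classes)], axis=1)
    h = np.maximum(0.0, x @ params["W1"] + params["b1"])  # ReLU hidden layer
    logits = h @ params["W2"] + params["b2"]
    return 1.0 / (1.0 + np.exp(-logits))  # features scaled to (0, 1)

rng = np.random.default_rng(0)
decoder = init_decoder(rng, latent_dim=4, num_classes=3, hidden_dim=8, num_features=10)
# Request more samples of a rare class simply by repeating its label.
synthetic = generate(decoder, [0, 2, 2, 2], num_classes=3, latent_dim=4, rng=rng)
```

Because only labels are needed at generation time, oversampling a rare intrusion class reduces to repeating its label in the request, with no need to pick representative training samples for that class.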

