Supervised contrastive learning over prototype-label embeddings for network intrusion detection

López Martín, Manuel; Sánchez Esguevillas, Antonio Javier; Arribas Sánchez, Juan Ignacio; Carro Martínez, Belén

doi:10.1016/j.inffus.2021.09.014

dc.contributor.author	López Martín, Manuel
dc.contributor.author	Sánchez Esguevillas, Antonio Javier
dc.contributor.author	Arribas Sánchez, Juan Ignacio
dc.contributor.author	Carro Martínez, Belén
dc.date.accessioned	2021-10-08T11:54:48Z
dc.date.available	2021-10-08T11:54:48Z
dc.date.issued	2022
dc.identifier.citation	Information Fusion, 2022, vol. 79, p. 200-228	es
dc.identifier.issn	1566-2535	es
dc.identifier.uri	https://uvadoc.uva.es/handle/10324/48972
dc.description	Scientific output	es
dc.description.abstract	Contrastive learning makes it possible to establish similarities between samples by comparing their distances in an intermediate representation space (embedding space) and using loss functions designed to attract similar samples and repel dissimilar ones. In this standard setting, the distance comparison is based exclusively on the sample features. We propose a novel contrastive learning scheme that includes the labels in the same embedding space as the features and performs the distance comparison between features and labels in this shared embedding space. Following this idea, the features of each sample should be close to its ground-truth (positive) label and far from the other labels (negative labels). This scheme makes it possible to implement supervised classification based on contrastive learning. Each embedded label assumes the role of a class prototype in embedding space, with the sample features that share the label gathering around it. The aim is to separate the label prototypes while minimizing the distance between each prototype and its same-class samples. A novel set of loss functions is proposed with this objective. Loss minimization drives the allocation of sample features and labels in embedding space. The loss functions and their associated training and prediction architectures are analyzed in detail, along with different strategies for label separation. The proposed scheme drastically reduces the number of pair-wise comparisons, thus improving model performance. To further reduce the number of pair-wise comparisons, this initial scheme is extended by replacing the set of negative labels with its best single representative: either the negative label nearest to the sample features or the centroid of the cluster of negative labels. This idea creates a new subset of models, which are analyzed in detail. The outputs of the proposed models are the distances (in embedding space) between each sample and the label prototypes. These distances can be used to perform classification (minimum-distance label), feature dimensionality reduction (using the distances and the embeddings instead of the original features) and data visualization (with 2D or 3D embeddings). Although the proposed models are generic, their application and performance evaluation are carried out here for network intrusion detection, which is characterized by noisy and unbalanced labels and a challenging classification of the various types of attacks. Empirical results of the model applied to intrusion detection are presented in detail for two well-known intrusion detection datasets, and a thorough set of classification and clustering performance evaluation metrics is included.	es
dc.format.mimetype	application/pdf	es
dc.language.iso	eng	es
dc.publisher	Elsevier	es
dc.rights.accessRights	info:eu-repo/semantics/openAccess	es
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/	*
dc.subject.classification	Label embedding	es
dc.subject.classification	Incrustación de etiquetas	es
dc.subject.classification	Contrastive learning	es
dc.subject.classification	Aprendizaje contrastivo	es
dc.subject.classification	Network intrusion detection	es
dc.subject.classification	Detección de intrusos en red	es
dc.title	Supervised contrastive learning over prototype-label embeddings for network intrusion detection	es
dc.type	info:eu-repo/semantics/article	es
dc.rights.holder	© 2021 Elsevier	es
dc.identifier.doi	10.1016/j.inffus.2021.09.014	es
dc.relation.publisherversion	https://www.sciencedirect.com/science/article/pii/S1566253521001913?via%3Dihub	es
dc.peerreviewed	Yes	es
dc.description.project	Ministerio de Ciencia, Innovación y Universidades - Agencia Estatal de Investigación - Fondo Europeo de Desarrollo Regional (grant RTI2018-098958-B-I00)	es
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International	*
dc.type.hasVersion	info:eu-repo/semantics/publishedVersion	es
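
The abstract above describes label prototypes that share an embedding space with the sample features, minimum-distance classification, and a single negative representative (nearest negative label or centroid of the negative labels). The following is a minimal sketch of those ideas, not the authors' implementation. It assumes already-computed embeddings, Euclidean distance, and a margin-based hinge loss; the abstract does not specify the paper's actual loss functions, and every function name and the `margin` parameter here are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict(sample, prototypes):
    """Classify by the minimum-distance label prototype; returns its index."""
    distances = [euclidean(sample, p) for p in prototypes]
    return min(range(len(prototypes)), key=distances.__getitem__)


def negative_representative(sample, prototypes, positive, mode="nearest"):
    """Reduce the negative labels to one representative.

    mode="nearest": the negative prototype closest to the sample.
    mode="centroid": the mean of all negative prototypes.
    """
    negatives = [p for i, p in enumerate(prototypes) if i != positive]
    if mode == "nearest":
        return min(negatives, key=lambda p: euclidean(sample, p))
    return [sum(p[d] for p in negatives) / len(negatives)
            for d in range(len(sample))]


def contrastive_loss(sample, prototypes, positive, margin=1.0, mode="nearest"):
    """Hypothetical hinge loss: pull the sample toward its positive prototype
    and push it at least `margin` further from the negative representative."""
    d_pos = euclidean(sample, prototypes[positive])
    d_neg = euclidean(
        sample, negative_representative(sample, prototypes, positive, mode)
    )
    return max(0.0, d_pos - d_neg + margin)


# Toy 2D embeddings: three label prototypes and one embedded sample.
prototypes = [[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]]
sample = [1.0, 0.0]

print(predict(sample, prototypes))              # nearest prototype index
print(contrastive_loss(sample, prototypes, 0))  # correct label: zero loss
print(contrastive_loss(sample, prototypes, 1))  # wrong label: positive loss
```

In this sketch each sample is compared against one positive and one negative vector rather than against every other sample, which is the kind of reduction in pair-wise comparisons the abstract refers to. In a trained model the prototypes and the feature embedding would be learned jointly by minimizing the loss.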

Files in this item

Name: Supervised-contrastive-learnin ...
Size: 16.67Mb
Format: PDF

This item appears in the following collections

DEP71 - Artículos de revista [366]

The license of this item is described as Attribution-NonCommercial-NoDerivatives 4.0 International