Network intrusion detection with a novel hierarchy of distances between embeddings of hash IP addresses

López Martín, Manuel; Carro Martínez, Belén; Arribas Sánchez, Juan Ignacio; Sánchez Esguevillas, Antonio Javier

doi:10.1016/j.knosys.2021.106887

dc.contributor.author	López Martín, Manuel
dc.contributor.author	Carro Martínez, Belén
dc.contributor.author	Arribas Sánchez, Juan Ignacio
dc.contributor.author	Sánchez Esguevillas, Antonio Javier
dc.date.accessioned	2022-07-25T07:41:21Z
dc.date.available	2022-07-25T07:41:21Z
dc.date.issued	2021
dc.identifier.citation	Knowledge-Based Systems, 2021, vol. 219, p. 106887	es
dc.identifier.issn	0950-7051	es
dc.identifier.uri	https://uvadoc.uva.es/handle/10324/54206
dc.description	Scientific production	es
dc.description.abstract	Including high-dimensional categorical predictors in a machine learning model is a major challenge. This is particularly relevant for the IP and Port addresses of network connections when they are considered as predictors (features) in machine learning models. These features are particularly important for network intrusion detection, as many attacks exploit information about IP/Port addresses. The sparsity and high dimensionality of these features make them difficult to include in the models, so they are discarded in many cases despite the useful information they carry. This work proposes replacing the original network addresses with new features based on a set of distances defined between different components of the source and destination IP and Port addresses. These distances incorporate information on the probability of co-occurrence of source and destination addresses. The distances are calculated using a dense, low-dimensional vector representation (embedding) of the different network address components. The embeddings are obtained with a neural network that requires few computational resources, plus an additional hash function that collapses the extremely large range of IP and Port values, making the model implementation feasible. A self-supervised learning framework under a hierarchical model is used to train the encoding network. The novel features can be used to predict future co-occurrence of source and destination network addresses and, when applied as features in a supervised model, they significantly increase the prediction performance of most classifiers for the detection of network intrusions. We demonstrate this prediction improvement on two modern network intrusion datasets: CICIDS2017 and CICDDoS2019.	es
dc.format.mimetype	application/pdf	es
dc.language.iso	eng	es
dc.publisher	Elsevier	es
dc.rights.accessRights	info:eu-repo/semantics/openAccess	es
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/	*
dc.subject.classification	Hash function	es
dc.subject.classification	Self-supervised learning	es
dc.subject.classification	Neural network	es
dc.subject.classification	Network address embedding	es
dc.subject.classification	Network intrusion detection	es
dc.title	Network intrusion detection with a novel hierarchy of distances between embeddings of hash IP addresses	es
dc.type	info:eu-repo/semantics/article	es
dc.identifier.doi	10.1016/j.knosys.2021.106887	es
dc.identifier.publicationfirstpage	106887	es
dc.identifier.publicationtitle	Knowledge-Based Systems	es
dc.identifier.publicationvolume	219	es
dc.peerreviewed	Yes	es
dc.description.project	Ministerio de Ciencia, Innovación y Universidades, Proyectos de I+D+i “Retos investigación” (grant RTI2018-098958-B-I00)	es
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International	*
dc.type.hasVersion	info:eu-repo/semantics/submittedVersion	es
dc.subject.unesco	33 Technological Sciences	es
dc.subject.unesco	3325 Telecommunications Technology	es
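
The pipeline summarized in dc.description.abstract (hash each address component into a small range, look up an embedding, then use source-to-destination distances as features) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the bucket count, the embedding size, the choice of address components (16-bit prefix, 24-bit prefix, full IP, IP plus port), the SHA-256 hash, the Euclidean distance, and all function names. The embedding table is filled with random vectors as a stand-in for the embeddings that the abstract says are learned by a self-supervised neural network.

```python
import hashlib
import math
import random

NUM_BUCKETS = 1024  # hypothetical hash range; the abstract gives no value
EMBED_DIM = 8       # hypothetical embedding size


def hash_bucket(component: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Collapse an arbitrary address component into a fixed bucket range."""
    digest = hashlib.sha256(component.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


# Stand-in embedding table: random vectors, NOT trained embeddings.
_rng = random.Random(0)
EMBEDDINGS = [
    [_rng.gauss(0.0, 1.0) for _ in range(EMBED_DIM)]
    for _ in range(NUM_BUCKETS)
]


def embed(component: str) -> list:
    """Look up the dense vector assigned to a component's hash bucket."""
    return EMBEDDINGS[hash_bucket(component)]


def distance(a: list, b: list) -> float:
    """Euclidean distance between two vectors (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def address_components(ip: str, port: int) -> dict:
    """Assumed hierarchy of components for one IPv4 endpoint."""
    octets = ip.split(".")
    return {
        "prefix16": ".".join(octets[:2]),
        "prefix24": ".".join(octets[:3]),
        "ip": ip,
        "ip_port": f"{ip}:{port}",
    }


def distance_features(src_ip: str, src_port: int,
                      dst_ip: str, dst_port: int) -> dict:
    """One source-to-destination distance per hierarchy level."""
    src = address_components(src_ip, src_port)
    dst = address_components(dst_ip, dst_port)
    return {
        f"dist_{level}": distance(embed(src[level]), embed(dst[level]))
        for level in src
    }


features = distance_features("192.168.1.10", 51234, "10.0.0.5", 443)
```

In a setup like this, the resulting dictionary of distances would stand in for the raw source and destination addresses in the input of a supervised classifier. Hashing keeps the embedding table at a fixed size regardless of how many distinct addresses appear, at the cost of collisions between unrelated components.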

Files in this item

Name: Network-intrusion-detection.pdf
Size: 1.070Mb
Format: PDF

This item appears in the following collection(s)

DEP71 - Journal articles [400]

The license for this item is described as Attribution-NonCommercial-NoDerivatives 4.0 International