RT info:eu-repo/semantics/doctoralThesis
T1 Application of advanced machine learning techniques to early network traffic classification
A1 Egea Gómez, Santiago
A2 Universidad de Valladolid. Escuela Técnica Superior de Ingenieros de Telecomunicación
K1 Machine learning
K1 Information Systems Applications
K1 3325 Telecommunications Technology
AB The fast-paced evolution of the Internet is creating a complex context that imposes demanding requirements to assure end-to-end Quality of Service. The development of advanced intelligent approaches in networking envisions features such as autonomous resource allocation and fast reaction to unexpected network events. Internet Network Traffic Classification constitutes a crucial source of information for Network Management and is decisive in assisting the emerging network control paradigms. Monitoring the traffic flowing through network devices supports tasks such as network orchestration, traffic prioritization, network arbitration and cyberthreat detection, amongst others. Traditional traffic classifiers have become obsolete owing to the rapid evolution of the Internet. Port-based classifiers suffer significant accuracy losses due to port masking, while Deep Packet Inspection approaches have severe user-privacy limitations. The advent of Machine Learning has propelled the application of advanced algorithms in diverse research areas, and some learning approaches have proved to be an interesting alternative to the classic traffic classification approaches. Addressing Network Traffic Classification from a Machine Learning perspective implies numerous challenges that demand research efforts to achieve feasible classifiers. In this dissertation, we endeavor to formulate and solve important research questions in Machine-Learning-based Network Traffic Classification. As a result of numerous experiments, the knowledge provided in this research constitutes an engaging case study in which network traffic data from two different environments are successfully collected, processed and modeled. Firstly, we approached the Feature Extraction and Selection processes, providing our own contributions. A Feature Extractor was designed to create Machine-Learning-ready datasets from real traffic data, and a Feature Selection Filter based on fast correlation was proposed and tested on several classification datasets. Then, the original Network Traffic Classification datasets were reduced using our Selection Filter to provide efficient classification models. Many classification models based on CART Decision Trees were analyzed, exhibiting excellent outcomes in identifying various Internet applications. The experiments presented in this research comprise a comparison amongst ensemble learning schemes, an exploratory study on Class Imbalance and its solutions, and an analysis of IP-header predictors for early traffic classification. This thesis is presented in the form of a compendium of JCR-indexed scientific manuscripts; one conference paper is also included. In the present work we study a wide range of learning approaches employing the most advanced methodology in Machine Learning. As a result, we identify the strengths and weaknesses of these algorithms, providing our own solutions to overcome the observed limitations. In short, this thesis shows that Machine Learning offers advanced techniques that open prominent prospects in Internet Network Traffic Classification.
YR 2020
FD 2020
LK http://uvadoc.uva.es/handle/10324/43197
UL http://uvadoc.uva.es/handle/10324/43197
LA eng
NO Departamento de Teoría de la Señal y Comunicaciones e Ingeniería Telemática
DS UVaDOC
RD 24-nov-2024