RT info:eu-repo/semantics/doctoralThesis
T1 Statistical distances for model validation and clustering. Applications to flow cytometry and fair learning.
A1 Inouzhe Valdés, Hristo
A2 Universidad de Valladolid. Instituto de Investigación en Matemáticas (IMUVA)
K1 Mathematical statistics
K1 Number of clusters
K1 12 Mathematics
AB This thesis has been developed at the University of Valladolid and IMUVA within the framework of the project Sampling, trimming, and probabilistic metric techniques. Statistical applications, whose main researchers are Carlos Matrán Bea and Eustasio del Barrio Tellado. Among the lines of research associated with the project are model validation, Wasserstein distances and robust cluster analysis. It is precisely the work carried out in these fields that gives rise to Chapters 1, 2 and 4 of this report. The work done in the field of fair learning with Professor Jean-Michel Loubes, a frequent collaborator of the Valladolid team, during the international stay at the Paul Sabatier University of Toulouse, is the basis of Chapter 3 of this report. Therefore, this thesis is an exposition of the problems and results obtained in the different fields previously mentioned. Due to the diversity of topics, we have decided to base the chapters on the works published or submitted to date, and therefore each chapter has a structure relatively independent of the others. Thus Chapter 1 is based on the works [del Barrio et al., 2019e, del Barrio et al., 2019d], Chapter 2 is based on the work [del Barrio et al., 2019c], Chapter 3 on the work [del Barrio et al., 2019b], and Chapter 4 shows results of a work in progress. In this introduction our objective is to present the main challenges we have faced, as well as to briefly present our most relevant results. Each chapter, in turn, will have its own introduction where we will delve into the topics discussed below. With this in mind, our intention is that the reader will have a general idea of what he or she will find in each chapter and in this way will have the necessary information to face the more technical discussions that will be found there. Due to the diversity of topics dealt with in this report, we propose a non-linear reading. We suggest that the reader, after reading a section of the Introduction, move to the corresponding chapter. In this way the reader will have the relevant information closer at hand and will be able to better follow the exposition in each chapter. If, on the other hand, the document is read sequentially, we apologize in advance for some repetitions and reiterations, which nevertheless seem to us to contribute positively to the understanding of this work.
YR 2020
FD 2020
LK http://uvadoc.uva.es/handle/10324/43327
UL http://uvadoc.uva.es/handle/10324/43327
LA eng
NO Departamento de Estadística e Investigación Operativa
DS UVaDOC
RD 22-nov-2024