dc.contributor.author | Cámara Moreno, Jesús | |
dc.contributor.author | Cuenca, Javier | |
dc.contributor.author | Galindo, Víctor | |
dc.contributor.author | Vicente, Arturo | |
dc.contributor.author | Boratto, Murilo | |
dc.date.accessioned | 2025-03-04T13:25:56Z | |
dc.date.available | 2025-03-04T13:25:56Z | |
dc.date.issued | 2024 | |
dc.identifier.citation | The Journal of Supercomputing, 2024, vol. 81, n. 1 | es |
dc.identifier.issn | 0920-8542 | es |
dc.identifier.uri | https://uvadoc.uva.es/handle/10324/75227 | |
dc.description | Scientific Production | es |
dc.description.abstract | In this work, an automatic optimisation approach for parallel routines on multi-GPU systems is presented. Several inter-GPU communication libraries (such as CUDA-Aware MPI or NCCL) are used with a set of routines to perform the numerical operations among the GPUs located on the compute nodes. The main objective is the selection of the most appropriate communication library, the number of GPUs to be used and the workload to be distributed among them in order to reduce the cost of data movements, which represent a large percentage of the total execution time. To this end, a hierarchical modelling of the execution time of each routine to be optimised is proposed, combining experimental and theoretical approaches. The results show that near-optimal decisions are taken in all the scenarios analysed. | es |
dc.format.mimetype | application/pdf | es |
dc.language.iso | eng | es |
dc.publisher | Springer | es |
dc.rights.accessRights | info:eu-repo/semantics/openAccess | es |
dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | * |
dc.subject.classification | Autotuning | es |
dc.subject.classification | Communication libraries | es |
dc.subject.classification | Multi-GPU | es |
dc.subject.classification | Heterogeneous computing | es |
dc.title | An autotuning approach to select the inter-GPU communication library on heterogeneous systems | es |
dc.type | info:eu-repo/semantics/article | es |
dc.rights.holder | © 2024 The Author(s) | es |
dc.identifier.doi | 10.1007/s11227-024-06794-3 | es |
dc.relation.publisherversion | https://link.springer.com/article/10.1007/s11227-024-06794-3 | es |
dc.identifier.publicationissue | 1 | es |
dc.identifier.publicationtitle | The Journal of Supercomputing | es |
dc.identifier.publicationvolume | 81 | es |
dc.peerreviewed | Yes | es |
dc.description.project | Open-access publication funded by the Consorcio de Bibliotecas Universitarias de Castilla y León (BUCLE), under the Operational Programme 2014ES16RFOP009 FEDER 2014-2020 DE CASTILLA Y LEÓN, Action: 20007-CL - Apoyo Consorcio BUCLE | es |
dc.description.project | This work is supported by Grant PID2022-136315OB-I00 and Grant PID2022-142292NB-I00, both funded by MCIN/AEI/10.13039/501100011033/ and by “ERDF A way of making Europe”, EU | es |
dc.identifier.essn | 1573-0484 | es |
dc.rights | Attribution 4.0 International | * |
dc.type.hasVersion | info:eu-repo/semantics/publishedVersion | es |
dc.subject.unesco | 1203.17 Computer Science | es |