dc.contributor.author | Moreton Fernández, Ana | |
dc.contributor.author | González Escribano, Arturo | |
dc.contributor.author | Llanos Ferraris, Diego Rafael | |
dc.date.accessioned | 2018-03-17T10:09:16Z | |
dc.date.available | 2018-12-01T00:40:30Z | |
dc.date.issued | 2017 | |
dc.identifier.citation | Parallel Computing, vol. 69, November 2017, pp. 45-62, ISSN 0167-8191, Elsevier | es |
dc.identifier.uri | http://uvadoc.uva.es/handle/10324/29112 | |
dc.description | Producción Científica | es |
dc.description.abstract | Current High Performance Computing (HPC) systems are typically built as interconnected clusters of shared-memory multicore computers. Several techniques to automatically generate parallel programs from high-level parallel languages or sequential codes have been proposed. To properly exploit the scalability of HPC clusters, these techniques should take into account the combination of data communication across distributed memory, and the exploitation of shared-memory models.
In this paper, we present a new communication calculation technique to be applied across different SPMD (Single Program Multiple Data) code blocks, containing several uniform data access expressions. We have implemented this technique in Trasgo, a programming model and compilation framework that transforms parallel programs from a high-level parallel specification that deals with parallelism in a unified, abstract, and portable way. The proposed technique computes at runtime exact coarse-grained communications for distributed message-passing processes. Applying this technique at runtime has the advantage of being independent of compile-time decisions, such as the tile size chosen for each process. Our approach allows the automatic generation of pre-compiled multi-level parallel routines, libraries, or programs that can adapt their communication, synchronization, and optimization structures to the target system, even when computing nodes have different capabilities. Our experimental results show that, despite our runtime calculation, our approach can automatically produce efficient programs compared with MPI reference codes, and with codes generated with auto-parallelizing compilers. | es |
dc.format.mimetype | application/pdf | es |
dc.language.iso | eng | es |
dc.publisher | Elsevier | es |
dc.rights.accessRights | info:eu-repo/semantics/openAccess | es |
dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | |
dc.title | A Technique to Automatically Determine Ad-hoc Communication Patterns at Runtime | es |
dc.type | info:eu-repo/semantics/article | es |
dc.rights.holder | Elsevier | es |
dc.identifier.doi | 10.1016/j.parco.2017.08.009 | es |
dc.relation.publisherversion | https://www.sciencedirect.com/science/article/pii/S0167819117301254?via%3Dihub | es |
dc.peerreviewed | SI | es |
dc.description.embargo | 2018-12-01 | es |
dc.description.project | MICINN (Spain) and ERDF program of the European Union: HomProg-HetSys project (TIN2014-58876-P), CAPAP-H6 (TIN2016-81840-REDT), COST Program Action IC1305: Network for Sustainable Ultrascale Computing (NESUS), and by the computing facilities of Extremadura Research Centre for Advanced Technologies (CETA-CIEMAT), funded by the European Regional Development Fund (ERDF). CETA-CIEMAT belongs to CIEMAT and the Government of Spain. | es |
dc.rights | Attribution 4.0 International | |