Toward a BLAS library truly portable across different accelerator types

Rodríguez Gutiez, Eduardo; Moreton Fernández, Ana; González Escribano, Arturo; Llanos Ferraris, Diego Rafael

doi:10.1007/s11227-019-02925-3

dc.contributor.author	Rodríguez Gutiez, Eduardo
dc.contributor.author	Moreton Fernández, Ana
dc.contributor.author	González Escribano, Arturo
dc.contributor.author	Llanos Ferraris, Diego Rafael
dc.date.accessioned	2019-11-05T18:08:18Z
dc.date.available	2019-11-05T18:08:18Z
dc.date.issued	2019
dc.identifier.citation	Towards a BLAS library truly portable between different Accelerator types. Eduardo Rodríguez Gutiez, Arturo González Escribano, Diego R. Llanos. The Journal of Supercomputing (Q2). First online: 10 June 2019. DOI: 10.1007/s11227-019-02925-3	es
dc.identifier.issn	0920-8542	es
dc.identifier.uri	http://uvadoc.uva.es/handle/10324/39040
dc.description.abstract	Scientific applications are among the most computationally demanding pieces of software. Their core is usually a set of linear algebra operations, which may account for a significant part of the application's overall run time. BLAS libraries address this problem by exposing a set of highly optimized, reusable routines. Several implementations are tuned for specific types of computing platforms, including coprocessors. Examples include the one bundled with the Intel MKL library, which targets Intel CPUs or Xeon Phi coprocessors, and the cuBLAS library, which is designed specifically for NVIDIA GPUs. Nowadays, computing nodes in many supercomputing clusters include one or more different coprocessor types. Fully exploiting these platforms may require programs that adapt at run time to the chosen device type, which means hardwiring into the program the code needed to use a different library for each selectable device type. This also forces the programmer to deal with the particularities of each interface and with different mechanisms for managing the memory transfers of the data structures used as parameters. This paper presents a unified, performance-oriented, and portable interface for BLAS. The interface has been integrated into a heterogeneous programming model (Controllers) that transparently supports groups of CPU cores, Xeon Phi accelerators, or NVIDIA GPUs. The contributions of this paper are: an abstraction layer that hides programming differences between diverse BLAS libraries; new types of kernel classes to support the context manipulation of different external BLAS libraries; a new kernel selection policy that considers both programmer kernels and different external libraries; and a completely new Controller library interface for the whole collection of BLAS routines.
This proposal enables the creation of BLAS-based portable codes that can execute on top of different types of accelerators by changing a single initialization parameter. Our software internally exploits different preexisting and widely known BLAS library implementations, such as cuBLAS, MAGMA, or the one found in Intel MKL. It transparently uses the most appropriate library for the selected device. Our experimental results show that our abstraction does not introduce significant performance penalties, while achieving the desired portability.	es
dc.format.mimetype	application/pdf	es
dc.language.iso	spa	es
dc.publisher	The Journal of Supercomputing	es
dc.rights.accessRights	info:eu-repo/semantics/openAccess	es
dc.title	Toward a BLAS library truly portable across different accelerator types	es
dc.type	info:eu-repo/semantics/article	es
dc.identifier.doi	10.1007/s11227-019-02925-3	es
dc.identifier.publicationtitle	The Journal of Supercomputing	es
dc.peerreviewed	Yes	es
dc.description.project	This work is part of the PCAS research project (Grant TIN2017-88614-R) and of the Junta de Castilla y León PROPHET project (VA082P17).	es
dc.identifier.essn	1573-0484	es
dc.type.hasVersion	info:eu-repo/semantics/draft	es

Files in this item

Name: 007-towards-BLAS.pdf
Size: 1.422Mb
Format: PDF

This item appears in the following Collection(s)

DEP41 - Artículos de revista [111]
