Open SYCL on heterogeneous GPU systems: A case of study

Carratalá-Sáez, Rocío; Torres de la Sierra, Yuri; Llanos Ferraris, Diego Rafael; González Escribano, Arturo

doi:10.48550/arXiv.2310.06947

Please use this identifier to cite or link to this item: https://uvadoc.uva.es/handle/10324/83868

Title

Open SYCL on heterogeneous GPU systems: A case of study

Author

Carratalá-Sáez, Rocío

Torres de la Sierra, Yuri

Llanos Ferraris, Diego Rafael

González Escribano, Arturo

Publisher

ArXiv

Document Year

2023

Description

Scientific Output

Source Document

Open SYCL on heterogeneous GPU systems: A case of study, Rocío Carratalá-Sáez, Francisco J. Andújar, Yuri Torres, Arturo Gonzalez-Escribano, and Diego R. Llanos, arXiv preprint 2310.06947, 2023.

Abstract

Computational platforms for high-performance scientific applications are becoming more heterogeneous, including hardware accelerators such as multiple GPUs. Applications in a wide variety of scientific fields require efficient and careful management of the computational resources of this type of hardware to obtain the best possible performance. However, heterogeneous clusters or machines can currently contain GPUs from different vendors, architectures, and families. Programming with the vendor-provided languages or frameworks, and optimizing for specific devices, may become cumbersome and compromise portability to other systems. To overcome this problem, several proposals for high-level heterogeneous programming have appeared, trying to reduce the development effort and increase functional and performance portability, specifically when using GPU hardware accelerators. This paper evaluates the SYCL programming model, using the Open SYCL compiler, from two different perspectives: the performance it offers when dealing with single or multiple GPU devices from the same or different vendors, and the development effort required to implement the code. As a case study, we use the Finite Time Lyapunov Exponent calculation over two real-world scenarios and compare the performance and the development effort of its Open SYCL-based version against the equivalent versions that use CUDA or HIP. Based on the experimental results, we observe that the use of SYCL does not lead to a remarkable overhead in terms of GPU kernel execution time. In general terms, the Open SYCL development effort for the host code is lower than that observed with CUDA or HIP. Moreover, the SYCL version can take advantage of both CUDA and AMD GPU devices simultaneously much more easily than directly using the vendor-specific programming solutions.

Subjects (normalized)

Computer Science

UNESCO Subjects

1203 Computer Sciences

3304 Computer Technology

Keywords

Open SYCL, CUDA, HIP, Finite Time Lyapunov Exponent, Performance evaluation, Development effort

Department

Departamento de Informática, Universidad de Valladolid

DOI

10.48550/arXiv.2310.06947

Sponsor

This work was supported in part by the Spanish Ministerio de Ciencia e Innovación and by the European Regional Development Fund (ERDF) program of the European Union, under Grant PID2022-142292NB-I00 (NATASHA Project); and in part by the Junta de Castilla y León - FEDER Grants, under Grant VA226P20 (PROPHET-2 Project), Junta de Castilla y León, Spain. This work was also supported in part by grant TED2021-130367B-I00, funded by European Union NextGenerationEU/PRTR and by MCIN/AEI/10.13039/501100011033. This work has also been partially supported by the NVIDIA Academic Hardware Grant Program.

Publisher's Version

https://arxiv.org/abs/2310.06947