RT info:eu-repo/semantics/article
T1 uBench: exposing the impact of CUDA block geometry in terms of performance
A1 Torres, Yuri
A1 González Escribano, Arturo
A1 Llanos Ferraris, Diego Rafael
K1 Computer Science
K1 GPU
K1 Benchmarking
K1 CUDA
K1 Fermi
K1 Kepler
K1 Performance measurement
K1 1203 Computer Science
K1 3304 Computer Technology
AB The choice of thread-block size and shape is one of the most important user decisions when a parallel problem is written for any CUDA architecture, because thread-block geometry has a significant impact on the global performance of the program. Unfortunately, the programmer does not have enough information about the subtle interactions between this choice of parameters and the underlying hardware. This paper presents uBench, a complete suite of micro-benchmarks for exploring the impact on performance of (1) the thread-block geometry choice criteria, and (2) the GPU hardware resources and configurations. Each micro-benchmark has been designed to be as simple as possible, so that it focuses on a single effect derived from the hardware and the thread-block parameter choice. As an example of the capabilities of this benchmark suite, this paper shows an experimental evaluation and comparison of the Fermi and Kepler architectures. Our study reveals that, in spite of the new hardware details introduced by Kepler, the principles underlying the block geometry selection criteria are similar for both architectures.
PB Springer
SN 0920-8542
YR 2013
FD 2013
LK https://uvadoc.uva.es/handle/10324/70435
UL https://uvadoc.uva.es/handle/10324/70435
LA eng
NO The Journal of Supercomputing, ISSN 0920-8542, vol. 65, no. 3, pp. 1150-1163, September 2013
NO Producción Científica
DS UVaDOC
RD 28-nov-2024