dc.contributor.author | Ortega Arranz, Héctor | |
dc.contributor.author | Torres de la Sierra, Yuri | |
dc.contributor.author | González Escribano, Arturo | |
dc.contributor.author | Llanos Ferraris, Diego Rafael | |
dc.date.accessioned | 2024-10-30T16:53:05Z | |
dc.date.available | 2024-10-30T16:53:05Z | |
dc.date.issued | 2013 | |
dc.identifier.citation | Conference: Computational and Mathematical Methods in Science and Engineering, CMMSE 2013, Almería, Spain, ISBN 978-84-616-2723-3 | es |
dc.identifier.isbn | 978-84-616-2723-3 | es |
dc.identifier.uri | https://uvadoc.uva.es/handle/10324/71120 | |
dc.description | Scientific Production | es |
dc.description.abstract | The All-Pair Shortest-Path (APSP) problem is a well-known problem in graph theory whose objective is to find the shortest paths between every pair of nodes. Computing the distances from one source node to all the others, and repeating this process for every node of the graph, is an adequate solution for sparse graphs. In recent years, GPU devices have been increasingly used to accelerate this kind of problem. While a correct NVIDIA CUDA implementation of this algorithm is easy to achieve, exploiting the GPU capabilities to obtain good performance is a task for experienced CUDA programmers. A typical code-tuning strategy is the selection of an appropriate threadBlock size. In addition, the concurrent deployment of several kernels that compute distances from different sources also reduces execution times. In this paper we show that an adequate combination of both strategies represents an 11.5% performance improvement between different recommended CUDA configurations for the most costly kernel of the APSP problem. | es |
dc.format.extent | 12 p. | es |
dc.format.mimetype | application/pdf | es |
dc.language.iso | eng | es |
dc.publisher | Universidad de Salamanca | es |
dc.rights.accessRights | info:eu-repo/semantics/openAccess | es |
dc.subject | Computer Science | es |
dc.subject.classification | APSP | es |
dc.subject.classification | Concurrent-kernel | es |
dc.subject.classification | Dijkstra | es |
dc.subject.classification | GPU | es |
dc.subject.classification | SSSP | es |
dc.subject.classification | ThreadBlock size | es |
dc.title | A Tuned, Concurrent-Kernel Approach to Speed Up the APSP problem | es |
dc.type | info:eu-repo/semantics/conferenceObject | es |
dc.identifier.doi | 10.5281/zenodo.14014033 | es |
dc.relation.publisherversion | https://www.researchgate.net/publication/237149119_A_Tuned_Concurrent-Kernel_Approach_to_Speed_Up_the_APSP_problem | es |
dc.title.event | Computational and Mathematical Methods in Science and Engineering, CMMSE 2013 | es |
dc.description.project | This research is partly supported by the Spanish Government (TIN2007-62302, TIN2011-25639, CENIT OCEANLIDER, CAPAP-H networks TIN2010-12011-E and TIN2011-15734-E), Junta de Castilla y León, Spain (VA094A08, VA172A12-2), the HPC-EUROPA2 project (project number: 228398) with the support of the European Commission - Capacities Area - Research Infrastructures Initiative, and the ComplexHPC COST Action | es |
dc.type.hasVersion | info:eu-repo/semantics/publishedVersion | es |
dc.subject.unesco | 1203 Computer Sciences | es |
dc.subject.unesco | 3304 Computer Technology | es |