
dc.contributor.author: Ortega-Arranz, Hector
dc.contributor.author: Torres, Yuri
dc.contributor.author: Gonzalez-Escribano, Arturo
dc.contributor.author: Llanos, Diego R.
dc.date.accessioned: 2024-10-04T08:02:26Z
dc.date.available: 2024-10-04T08:02:26Z
dc.date.issued: 2014
dc.identifier.citation: The Journal of Supercomputing, Vol. 70, Issue 2, November 2014, pp. 786-798, ISSN 0920-8542
dc.identifier.issn: 0920-8542
dc.identifier.uri: https://uvadoc.uva.es/handle/10324/70416
dc.description: Scientific Production
dc.description.abstract: During the last years, GPU manycore devices have demonstrated their usefulness to accelerate computationally intensive problems. Although arriving at a parallelization of a highly parallel algorithm is an affordable task, the optimization of GPU codes is a challenging activity. The main reason for this is the number of parameters, programming choices, and tuning techniques available, many of them related with complex and sometimes hidden architecture details. A useful strategy to systematically attack these optimization problems is to characterize the different kernels of the application, and use this knowledge to select appropriate configuration parameters. The All-Pair Shortest-Path (APSP) problem is a well-known problem in graph theory whose objective is to find the shortest paths between any pairs of nodes in a graph. This problem can be solved by highly parallel and computational intensive tasks, being a good candidate to be exploited by manycore devices. In this paper, we use kernel characterization criteria to optimize an APSP algorithm implementation for NVIDIA GPUs. Our experimental results show that the combined use of proper configuration policies, and the concurrent kernels capability of new CUDA architectures, leads to a performance improvement of up to 62 % with respect to one of the possible configurations recommended by CUDA, considered as baseline.
dc.format.mimetype: application/pdf
dc.language.iso: eng
dc.publisher: Springer
dc.rights.accessRights: info:eu-repo/semantics/openAccess
dc.subject: Computer Science
dc.subject.classification: APSP
dc.subject.classification: Cache configuration
dc.subject.classification: Concurrent kernel
dc.subject.classification: GPU
dc.subject.classification: Kernel characterization
dc.subject.classification: Threadblock size
dc.title: Optimizing an APSP implementation for NVIDIA GPUs using kernel characterization criteria
dc.type: info:eu-repo/semantics/article
dc.identifier.doi: 10.1007/s11227-014-1212-z
dc.relation.publisherversion: https://link.springer.com/article/10.1007/s11227-014-1212-z
dc.identifier.publicationfirstpage: 786
dc.identifier.publicationissue: 2
dc.identifier.publicationlastpage: 798
dc.identifier.publicationtitle: The Journal of Supercomputing
dc.identifier.publicationvolume: 70
dc.peerreviewed: Yes
dc.description.project: This research has been partially supported by Ministerio de Economía y Competitividad (Spain) and ERDF program of the European Union: CAPAP-H4 network (TIN2011-15734-E), MOGECOPP project (TIN2011-25639); and Junta de Castilla y León (Spain) ATLAS project (VA172A12-2).
dc.identifier.essn: 1573-0484
dc.type.hasVersion: info:eu-repo/semantics/publishedVersion
dc.subject.unesco: 1203 Computer Science
dc.subject.unesco: 3304 Computer Technology

