On the choice of the best chunk size for the speculative execution of loops

Estebanez, Alvaro; Llanos Ferraris, Diego Rafael; Orden, David; Palop del Río, Belén

doi:10.1371/journal.pone.0267602

dc.contributor.author	Estebanez, Alvaro
dc.contributor.author	Llanos Ferraris, Diego Rafael
dc.contributor.author	Orden, David
dc.contributor.author	Palop del Río, Belén
dc.date.accessioned	2024-09-23T07:53:27Z
dc.date.available	2024-09-23T07:53:27Z
dc.date.issued	2022
dc.identifier.citation	PLOS ONE, May 2022, ISSN 1932-6203.	es
dc.identifier.issn	1932-6203	es
dc.identifier.uri	https://uvadoc.uva.es/handle/10324/70095
dc.description	Scientific Production	es
dc.description.abstract	Loops are a rich source of parallelism. Unfortunately, many loops cannot be safely parallelized at compile time because the compiler cannot guarantee the absence of dependence violations. Thread-Level Speculation (TLS) techniques, either hardware- or software-based, allow the parallel execution of non-analyzable loops by issuing blocks of consecutive iterations (called chunks) while a hardware or software monitor ensures that no dependence violations arise. If such a violation occurs, the chunk that was fed with incorrect values is discarded and restarted so that it consumes the correct information. In the speculative execution of non-analyzable loops, choosing the chunk size correctly is very important, because this choice dramatically affects the performance of the parallel execution. Bigger chunks imply lower scheduling overhead, but smaller chunks allow fewer calculations to be discarded in the event of a dependence violation. Finding a good chunk size is not a simple task, because loops may present dependencies that cannot be detected at compile time. In this paper, we present a comprehensive evaluation of different scheduling methods to estimate the optimal chunk size in the speculative execution of non-analyzable loops. This evaluation ranges from the simple, classical methods originally devised to achieve load balancing in loops with no dependencies, to methods that make some assumptions on the distribution pattern of dependencies, such as Meseta and Just-in-Time scheduling. We also propose and evaluate a general, more complex method called Moody Scheduling, which does not require a priori assumptions to achieve the highest performance.	es
dc.format.mimetype	application/pdf	es
dc.language.iso	eng	es
dc.publisher	PLOS ONE	es
dc.rights.accessRights	info:eu-repo/semantics/openAccess	es
dc.subject	Computer Science	es
dc.title	On the choice of the best chunk size for the speculative execution of loops	es
dc.type	info:eu-repo/semantics/article	es
dc.identifier.doi	10.1371/journal.pone.0267602	es
dc.relation.publisherversion	https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0267602	es
dc.identifier.publicationfirstpage	e0267602	es
dc.identifier.publicationissue	5	es
dc.identifier.publicationtitle	PLOS ONE	es
dc.identifier.publicationvolume	17	es
dc.peerreviewed	YES	es
dc.identifier.essn	1932-6203	es
dc.type.hasVersion	info:eu-repo/semantics/publishedVersion	es
dc.subject.unesco	1203 Computer Sciences	es
dc.subject.unesco	3304 Computer Technology	es
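
The chunk-size trade-off described in the abstract can be illustrated with a toy cost model. This is only a sketch under stated assumptions, not any of the scheduling methods evaluated in the paper: the names `expected_cost` and `best_chunk`, the fixed per-chunk `overhead`, the independent per-iteration violation probability `p_violation`, and the rule that a squashed chunk is re-executed exactly once are all hypothetical simplifications.

```python
import math


def expected_cost(n_iters, chunk, overhead, p_violation):
    """Toy expected cost (in iteration-time units) for a given chunk size.

    Assumptions (hypothetical, not from the paper): every chunk pays a fixed
    scheduling overhead, each iteration independently triggers a dependence
    violation with probability p_violation, and a violated chunk is discarded
    and re-executed once.
    """
    n_chunks = math.ceil(n_iters / chunk)
    # Probability that at least one iteration in the chunk causes a violation.
    p_squash = 1.0 - (1.0 - p_violation) ** chunk
    # Overhead + useful work + expected discarded work, per chunk.
    return n_chunks * (overhead + chunk + p_squash * chunk)


def best_chunk(n_iters, overhead, p_violation):
    """Exhaustively pick the chunk size that minimizes the toy cost."""
    return min(
        range(1, n_iters + 1),
        key=lambda c: expected_cost(n_iters, c, overhead, p_violation),
    )


if __name__ == "__main__":
    for p in (0.0, 0.001, 0.01, 0.1):
        print(p, best_chunk(1000, overhead=5.0, p_violation=p))
```

In this model, a loop with no violations favors a single large chunk, while a nonzero violation probability pushes the preferred chunk size down, which mirrors the tension between scheduling overhead and discarded work that the abstract describes.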

Files in this item

Name: journal.pone.0267602.pdf
Size: 1.212Mb
Format: PDF

This item appears in the following collection(s)

DEP41 - Journal articles [137]
