Please use this identifier to cite or link to this item: https://uvadoc.uva.es/handle/10324/69761
Title
Mappings and patterns to improve the triangular matrix product on distributed systems
Author
Document Year
2023
Publisher
IEEE
Source Document
Mappings and patterns to improve the triangular matrix product on distributed systems. Conference: 2023 IEEE International Conference on Cluster Computing Workshops (CLUSTER Workshops). At: Santa Fe, New Mexico, USA, October 2023
Abstract
Matrix multiplication is one of the most costly linear algebra operations and is very often present in scientific computing applications. Current generic linear algebra libraries, such as ScaLAPACK and its recent evolution SLATE, include functionality for generic and triangular matrix multiplication. They generally rely on block-cyclic partitioning, which has two main advantages. First, it provides good interoperability with the other functionality of the libraries. Second, it provides a good balance of computation and inter-process communication. The focus of these libraries is performance and scalability, even for very large numbers of processes. Nevertheless, many enterprises and computing centers work with commodity clusters or small partitions with a small number of nodes. In this paper, we propose and evaluate a combination of data distributions and communication patterns intended to optimize the triangular matrix product in distributed memory systems when targeting commodity clusters (up to approximately 36 nodes). The four main ideas are: use panels (horizontal or vertical band partitions) instead of tiling; avoid zero elements in communication buffers; balance the number of elements in communicated buffers; and evaluate the performance when combined with both pipeline and broadcast communication strategies. We compare the performance of our implementation against the state-of-the-art implementations provided by ScaLAPACK and SLATE. The results show that we outperform both of them. Our proposal is up to 41% faster than ScaLAPACK, and up to 6.7% faster than SLATE.
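As a rough illustration of the ideas of using panels and balancing the number of elements in communicated buffers, the sketch below splits the rows of a lower triangular matrix into horizontal panels that hold roughly equal numbers of stored (non-zero-region) elements. It is an assumption-laden example, not the authors' implementation: the function names `balanced_row_panels` and `panel_sizes`, the lower triangular layout, and the greedy cut rule are all hypothetical choices made for illustration.

```python
def balanced_row_panels(n, p):
    """Split rows 0..n-1 of a lower triangular n x n matrix into p
    horizontal panels with roughly equal element counts.

    Assumption: row i stores i + 1 elements (columns 0..i), so zeros
    above the diagonal are never counted or communicated.
    Returns a list of p half-open row ranges (start, end).
    """
    total = n * (n + 1) // 2          # elements in the triangle
    bounds = [0]
    filled = 0                        # elements in rows seen so far
    for row in range(n):
        filled += row + 1
        # Close a panel each time the running count reaches the next
        # k/p share of the total (integer comparison, no rounding).
        while len(bounds) < p and filled * p >= len(bounds) * total:
            bounds.append(row + 1)
    while len(bounds) < p:            # more panels than rows: empty panels
        bounds.append(n)
    bounds.append(n)
    return list(zip(bounds[:-1], bounds[1:]))


def panel_sizes(panels):
    """Number of stored triangular elements in each (start, end) panel."""
    return [(hi * (hi + 1) - lo * (lo + 1)) // 2 for lo, hi in panels]


if __name__ == "__main__":
    panels = balanced_row_panels(8, 2)
    print(panels, panel_sizes(panels))
```

For n = 8 and p = 2 this gives panels of 6 and 2 rows holding 21 and 15 elements, whereas an even split of 4 and 4 rows would hold 10 and 26 elements.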
Subjects (normalized)
Computer Science
UNESCO Subjects
1203 Computer Science
3304
Keywords
Triangular matrices, matrix product, distributed systems, SLATE, ScaLAPACK
Peer Reviewed
Yes
Publisher's Version
Language
eng
Version Type
info:eu-repo/semantics/publishedVersion
Rights
openAccess