RT info:eu-repo/semantics/article
T1 Mappings and patterns to improve the triangular matrix product on distributed systems
A1 Santamaria-Valenzuela, Inmaculada
A1 Carratalá-Sáez, Rocío
A1 Torres, Yuri
A1 Llanos, Diego R.
A1 Gonzalez-Escribano, Arturo
K1 Computer science
K1 Triangular matrices, matrix product, distributed systems, SLATE, ScaLAPACK
K1 1203 Computer Science
K1 3304
AB Matrix multiplication is one of the most costly linear algebra operations and is very often present in scientific computing applications. Current generic linear algebra libraries, such as ScaLAPACK and its recent evolution SLATE, include functionalities for generic and triangular matrix multiplication. They generally rely on block-cyclic partitioning, which has two main advantages. First, it provides good interoperability with other functionalities of the libraries. Second, it provides a good balance of computation and inter-process communications. The focus of these libraries is performance and scalability, even when targeting huge numbers of processes. Nevertheless, many enterprises and computing centers work with commodity clusters or small partitions with a reduced number of nodes. In this paper, we propose and evaluate a combination of data distributions and communication patterns intended to optimize the triangular matrix product in distributed memory systems when targeting commodity clusters (up to approximately 36 nodes). The four main ideas are: use panels (horizontal or vertical band partitions) instead of tiling; avoid zero elements in communication buffers; balance the number of elements in communicated buffers; and evaluate the performance when combined with both pipeline and broadcast communication strategies. We compare the performance of our implementation against the state-of-the-art implementations provided by ScaLAPACK and SLATE. The results show that we outperform both of them. Our proposal is up to 41% faster than ScaLAPACK, and up to 6.7% faster than SLATE.
PB IEEE
YR 2023
FD 2023
LK https://uvadoc.uva.es/handle/10324/69761
UL https://uvadoc.uva.es/handle/10324/69761
LA eng
NO Mappings and patterns to improve the triangular matrix product on distributed systems, Conference: 2023 IEEE International Conference on Cluster Computing Workshops (CLUSTER Workshops), At: Santa Fe, New Mexico, USA, October 2023
DS UVaDOC
RD 21-Jan-2025