Skip to main content

A communication-avoiding 3D sparse triangular solver...

by Piyush K Sao, Ramakrishnan Kannan, Xiaoye Li, Richard Vuduc
Publication Type
Conference Paper
Book Title
Proceeding ICS '19 Proceedings of the ACM International Conference on Supercomputing
Publication Date
Page Numbers
127 to 137
Publisher Location
United States of America
Conference Name
International Conference on Supercomputing (ICS 2019)
Conference Location
Phoenix, Arizona, United States of America
Conference Sponsor
Conference Date

We present a novel distributed memory algorithm to improve the strong scalability of the solution of a sparse triangular system. This operation appears in the solve phase of direct methods for solving general sparse linear systems, Ax = b. Our 3D sparse triangular solver employs several techniques, including a 3D MPI process grid, elimination tree parallelism, and data replication, all of which reduce the per-process communication when combined. We present analytical models to understand the communication cost of our algorithm and show that our 3D sparse triangular solver can reduce the per-process communication volume asymptotically by a factor of O(n1/4) and O(n1/6) for problems arising from the finite element discretizations of 2D "planar" and 3D "non-planar" PDEs, respectively. We implement our algorithm for use in SuperLU_DIST3D, using a hybrid MPI+OpenMP programming model. Our 3D triangular solve algorithm, when run on 12k cores of Cray XC30, outperforms the current state-of-the-art 2D algorithm by 7.2x for planar and 2.7x for the non-planar sparse matrices, respectively.