Skip to main content
SHARE
Publication

Revisiting Parallel Cyclic Reduction and Parallel Prefix-Based Algorithms for Block Tridiagonal System of Equations...

by Sudip K Seal, Kalyan S Perumalla, Steven P Hirshman
Publication Type
Journal
Journal Name
Journal of Parallel and Distributed Computing
Publication Date
Page Numbers
273 to 280
Volume
73
Issue
2

Simulations that require solutions of block tridiagonal systems of equations rely on fast parallel solvers for runtime efficiency. Leading parallel solvers that are highly effective for general systems of equations, dense or sparse, are limited in scalability when applied to block tridiagonal systems. This paper presents scalability results as well as detailed analyses of two parallel solvers that exploit the special structure of block tridiagonal matrices to deliver superior performance, often by orders of magnitude. A rigorous analysis of their relative parallel runtimes is shown to reveal the existence of a critical block size that separates the parameter space spanned by the number of block rows, the block size and the processor count, into distinct regions that favor one or the other of the two solvers. Dependence of this critical block size on the above parameters as well as on machine-specific constants is established. These formal insights are supported by empirical results on up to 2,048 cores of a Cray XT4 system. To the best of our knowledge, this is the highest reported scalability for parallel block tridiagonal solvers to date.