Distributed-Memory Parallel JointNMF

Publication Type
Conference Paper
Book Title
ICS '23: Proceedings of the 37th International Conference on Supercomputing
Publication Date
2023
Page Numbers
301 to 312
Publisher Location
New York, New York, United States of America
Conference Name
International Conference on Supercomputing
Conference Location
Orlando, Florida, United States of America
Conference Sponsor
ACM
Conference Date
-

Abstract
Joint Nonnegative Matrix Factorization (JointNMF) is a hybrid method for mining information from datasets that contain both feature and connection information. We propose distributed-memory parallelizations of three algorithms for solving the JointNMF problem, based on Alternating Nonnegative Least Squares (ANLS), Projected Gradient Descent, and Projected Gauss-Newton. We extend well-known communication-avoiding algorithms from the single-processor-grid case to our coupled case on two processor grids. We demonstrate the scalability of the algorithms on up to 960 cores (40 nodes) with 60% parallel efficiency. The more sophisticated ANLS and Gauss-Newton variants outperform the first-order gradient descent method in reducing the objective on large-scale problems. We perform a topic modeling task on a large corpus of academic papers consisting of over 37 million paper abstracts and nearly a billion citation relationships, demonstrating the utility and scalability of the methods.
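
The abstract does not spell out the optimization problem. For readers unfamiliar with JointNMF, a commonly used formulation minimizes ||X - WH||_F^2 + alpha * ||S - H^T H||_F^2 over nonnegative factors W and H, where X holds the feature information, S the connection information, and alpha balances the two terms. The sketch below (Python/NumPy) illustrates what the Projected Gradient Descent variant of such an update loop can look like; it is a serial toy under assumed shapes, step size, and iteration count, not the paper's distributed-memory implementation.

    import numpy as np

    def jointnmf_pgd(X, S, k, alpha=1.0, step=1e-3, iters=200, seed=0):
        # Minimal serial sketch of JointNMF via projected gradient descent.
        # Assumed objective: ||X - W H||_F^2 + alpha * ||S - H^T H||_F^2,
        # with X (m x n) features, S (n x n) symmetric connections,
        # and factors W (m x k), H (k x n) kept nonnegative by projection.
        rng = np.random.default_rng(seed)
        m, n = X.shape
        W = rng.random((m, k))
        H = rng.random((k, n))
        for _ in range(iters):
            R = W @ H - X                                  # feature-term residual
            grad_W = 2.0 * R @ H.T
            grad_H = 2.0 * W.T @ R + 4.0 * alpha * H @ (H.T @ H - S)
            W = np.maximum(W - step * grad_W, 0.0)         # gradient step, then project onto W >= 0
            H = np.maximum(H - step * grad_H, 0.0)         # gradient step, then project onto H >= 0
        return W, H

A fixed step size keeps the sketch short; a practical solver would use a line search or adaptive step, and the ANLS and Gauss-Newton variants described in the paper replace the plain gradient step with block nonnegative least-squares solves and approximate second-order updates, respectively.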