Distributed-Memory Parallel JointNMF

Publication Type

Conference Paper

Book Title

ICS '23: Proceedings of the 37th International Conference on Supercomputing

Publication Date

June 2023

Page Numbers

301 to 312

Publisher Location

New York, New York, United States of America

Conference Name

International Conference on Supercomputing

Conference Location

Orlando, Florida, United States of America

Conference Sponsor

ACM

Conference Date

Jun 21, 2023 - Jun 23, 2023

Abstract

Joint Nonnegative Matrix Factorization (JointNMF) is a hybrid method for mining information from datasets that contain both feature and connection information. We propose distributed-memory parallelizations of three algorithms for solving the JointNMF problem, based on Alternating Nonnegative Least Squares (ANLS), Projected Gradient Descent, and Projected Gauss-Newton. We extend well-known communication-avoiding algorithms from the single-processor-grid case to our coupled case on two processor grids. We demonstrate the scalability of the algorithms on up to 960 cores (40 nodes) with 60% parallel efficiency. The more sophisticated ANLS and Gauss-Newton variants outperform the first-order gradient descent method in reducing the objective on large-scale problems. We perform a topic modelling task on a large corpus of academic papers that consists of over 37 million paper abstracts and nearly a billion citation relationships, demonstrating the utility and scalability of the methods.
