Skip to main content
SHARE
Publication

A Scalable Parallel Hypergraph Generator (HyGen)...

by S M Shamimul Hasan, Neena Imam, Ramakrishnan Kannan
Publication Type
Conference Paper
Book Title
Proceedings of the 16th International Workshop on Mining and Learning with Graphs (MLG)
Publication Date
Conference Name
ACM SIGKDD 2020 - The 16th International Workshop on Mining and Learning with Graphs (MLG)
Conference Location
San Diego, California, United States of America
Conference Sponsor
SIGKDD
Conference Date
-

Graphs are extensively used to model real-world complex systems. An edge in a graph can model pairwise relationships. However, multiway relationships (connections between three or more vertices) are common in many complex systems such as cellular process, image segmentation, and circuit design. A graph edge cannot model multiway relationships. A hypergraph, which can connect more than two vertices, is thus a better option to model multiway relationships. A large-scale hypergraph analysis has the potential to find useful insights from a complex system and assist in knowledge discovery. Currently a limited number of hypergraphs exists that are representative of real-world datasets. Moreover, real-world hypergraph datasets are small in size and inadequate to incorporate future needs. A graph generator that can produce large-scale synthetic hypergraphs can solve the above mentioned problems. In this paper, we present a scalable parallel hypergraph generator (HyGen) based on the Message Passing Interface (MPI) standard. To generate hypergraphs, HyGen takes the following parameter values as inputs: i) number of vertices, ii) number of hyperedges, iii) number of clusters, iv) vertex distribution, v) hyperedge distribution, vi) local cluster cardinality, and vii) global cluster cardinality. We have demonstrated that HyGen can generate hypergraphs of various sizes in a scalable fashion. HyGen takes approximately four minutes to generate a hypergraph with 4.8 million vertices, 1.6 million hyperedges, and 800 clusters using 1,024 processes on a leadership class computing platform. Our strong and weak scaling experiments on supercomputers demonstrate that HyGen can quickly create large-scale hypergraphs in a parallel manner, thus providing a useful capability for hypergraph analysis.