Publication

TGE: Machine Learning Based Task Graph Embedding for Large-Scale Topology Mapping...

Publication Type
Conference Paper
Journal Name
IEEE CLUSTER
Publication Date
Page Numbers
587 to 591
Conference Name
2017 IEEE International Conference on Cluster Computing (CLUSTER)
Conference Location
Honolulu, Hawaii, United States of America
Conference Sponsor
IEEE
Conference Date
-

Task mapping is an important problem in parallel and distributed computing. Its goal is to find an optimal layout of the processes of an application (or task) on a given network topology. We target this problem in the context of staging applications. A staging application consists of two or more parallel applications (also referred to as staging tasks) that run concurrently and exchange data over the course of the computation. Task mapping is more challenging for staging applications because data is exchanged not only between the staging tasks but also among the processes within each staging task. We propose a novel method, called Task Graph Embedding (TGE), that harnesses the observable graph structures of parallel applications and network topologies. TGE employs a machine-learning-based algorithm to find the best representation of a graph, called an embedding, in a space in which the task-to-processor mapping problem can be solved. We evaluate TGE experimentally and demonstrate its effectiveness using communication patterns extracted from runs of XGC, a large-scale fusion simulation code, on Titan.
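
To make the task mapping problem stated in the abstract concrete, the following minimal Python sketch scores a process-to-node layout and finds the best one by exhaustive search. It is not the TGE algorithm, which the abstract does not specify in detail. The hop-bytes cost (bytes exchanged times network hops), the 2x2 mesh, the communication volumes, and the names `comm`, `nodes`, `hop_bytes`, and `best_mapping` are all illustrative assumptions.

```python
from itertools import permutations

# Hypothetical communication graph: (process_a, process_b) -> bytes exchanged.
comm = {(0, 1): 100, (1, 2): 100, (2, 3): 100, (0, 3): 1}

# Hypothetical topology: a 2x2 mesh whose nodes are (x, y) coordinates.
nodes = [(0, 0), (0, 1), (1, 0), (1, 1)]


def hops(a, b):
    """Manhattan distance between two mesh nodes (assumed routing cost)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def hop_bytes(mapping):
    """Cost of a layout, where mapping[p] is the node index hosting process p."""
    return sum(
        volume * hops(nodes[mapping[p]], nodes[mapping[q]])
        for (p, q), volume in comm.items()
    )


def best_mapping():
    """Exhaustive search over all layouts; feasible only for tiny examples."""
    return min(permutations(range(len(nodes))), key=hop_bytes)


if __name__ == "__main__":
    naive = (0, 1, 2, 3)  # process i placed on node i
    best = best_mapping()
    print("naive layout cost:", hop_bytes(naive))
    print("best layout:", best, "cost:", hop_bytes(best))
```

Exhaustive search grows factorially with the number of processes, which is why large-scale mapping calls for heuristic or learned approaches such as the embedding-based method the abstract proposes.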