A Distributed OpenCL Framework using Redundant Computation and Data Replication

by Junghyun Kim, Jo Gangwon, Jung Jaehoon, Jungwon Kim, J. H. Lee

Publication Type

Conference Paper

Publication Date

June, 2016

Page Numbers

553 to 569

Conference Name

ACM SIGPLAN Conference on Programming Language Design and Implementation

Conference Location

Santa Barbara, California, United States of America

Conference Sponsor

ACM

Conference Date

Jun 13, 2016 - Jun 17, 2016

Abstract

Applications written solely in OpenCL or CUDA cannot execute on a cluster as a whole. Most previous approaches that extend these programming models to clusters are based on a common idea: designating a centralized host node and coordinating the other nodes with the host for computation. However, the centralized host node is a serious performance bottleneck when the number of nodes is large. In this paper, we propose a scalable and distributed OpenCL framework called SnuCL-D for large-scale clusters. SnuCL-D's remote device virtualization provides an OpenCL application with an illusion that all compute devices in a cluster are confined in a single node. To reduce the amount of control-message and data communication between nodes, SnuCL-D replicates the OpenCL host program execution and data in each node. We also propose a new OpenCL host API function and a queueing optimization technique that significantly reduce the overhead incurred by the previous centralized approaches. To show the effectiveness of SnuCL-D, we evaluate SnuCL-D with a microbenchmark and eleven benchmark applications on a large-scale CPU cluster and a medium-scale GPU cluster.
