Abstract
Accelerator-based heterogeneous computing is the de facto standard in current and upcoming exascale machines. These heterogeneous resources empower computational scientists to select a machine or platform well-suited to their domain or applications. However, this diversity of machines also poses challenges related to programming model selection: inconsistent availability of programming models across different exascale systems, lack of performance portability for those programming models that do span several systems, and inconsistent performance between different models on a single platform. We explore these challenges on exascale-similar hardware, including AMD MI100 and NVIDIA A100 GPUs. By extending the sourceto-source compiler OpenARC, we demonstrate the power of automated translation of applications written in a single frontend programming model (OpenACC) into a variety of backend models (OpenMP, OpenCL, CUDA, HIP) that span the upcoming exascale environments. This translation enables us to compare performance within and across devices and to analyze programming model behavior with profiling tools.