Abstract
Accelerator devices are becoming the norm in High Performance Computing (HPC). With more systems opting for heterogeneous architectures, portable programming models like OpenMP and OpenACC are becoming increasingly important. The SPEC ACCEL 1.2 benchmark suite consists of comparable benchmarks in OpenCL, OpenMP 4.5, and OpenACC 2.5 that can be used to evaluate the performance of, and support for, programming models and frameworks on heterogeneous platforms. In this paper we look beneath the normative metric of execution time and examine the individual kernels to study the usage, strengths, and weaknesses of the two prevalent portable heterogeneous programming models, OpenMP and OpenACC. Our analysis shows that benchmarks like MRI-Q, SP and BT perform better with OpenACC, while benchmarks like MiniGhost, LBM and LBDC consistently perform better with OpenMP across supercomputers like Titan and Summit. We take a deep dive into the kernels of four selected benchmarks to answer questions such as: Where does the benchmark spend most of its cycles? What parallelization strategy is used? Why is one programming model more performant than the other? By identifying the similarities and differences, we contrast the implementation strategies used in the SPEC ACCEL 1.2 benchmarks and provide more insight into the OpenMP and OpenACC programming models.