Abstract
This paper describes how the OpenACC data model is implemented in current OpenACC compilers, ranging from research compilers (OpenUH and OpenARC) to a commercial compiler (the PGI OpenACC compiler). First, we summarize various memory architectures in today’s accelerator systems. We then describe details and issues in implementing the OpenACC data model in three different OpenACC compilers. This includes managing page tables, asynchronous data transfers, asynchronous memory allocate and free, host data construct, aliasing on a data directive, reusing device memory, partially present data, and adjacent data. We also discusses ongoing work to manage large, complex dynamic data structures. We measured the present table lookups, device memory allocation, pinned memory allocation, and managed memory in the three OpenACC compilers using eight OpenACC applications (seven from the SPEC ACCEL benchmark suite and a shock-hydrodynamics mini-application called LULESH).