
Developing an ELM Ecosystem Dynamics Model on GPU with OpenACC...

by Peter D Schwartz, Dali Wang, Fengming Yuan, Peter E Thornton
Publication Type
Conference Paper
Journal Name
International Conference on Computational Science
Book Title
Proceedings of the 22nd International Conference on Computational Science, Part II
Page Numbers
291 to 303
Publisher Location
Cham, Switzerland
Conference Name
ICCS: International Conference on Computational Science
Conference Location
London, United Kingdom

Porting a complex scientific code, such as the E3SM land model (ELM), onto a new computing architecture is challenging. The paper presents design strategies and technical approaches for developing an ELM ecosystem dynamics model on NVIDIA GPUs with compiler directives (OpenACC). The code has been refactored with advanced OpenACC features (such as deepcopy and routine directives) to reduce memory consumption, and its level of parallelism has been increased through parallel loop reconstruction and new data structures. On a single NVIDIA V100, the optimized parallel implementation achieved a speedup of more than 140 times (50 ms vs. 7600 ms) over a naive implementation that uses the OpenACC routine directive and parallelizes the code across existing loops. On a fully loaded computing node with 44 CPUs and 6 GPUs, the code achieved a speedup of over 3.0 times compared to the original code on the CPU. Furthermore, the memory footprint of the optimized parallel implementation is 300 MB, around 15% of the 2.15 GB consumed by the naive implementation. This study is the first effort to develop the ELM component efficiently on GPUs to support ultra-high-resolution land simulations at continental scales.
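The contrast the abstract draws between a naive port and loop reconstruction can be illustrated with a minimal sketch. It is not the authors' code: ELM is written in Fortran, and every name here (`NCOL`, `NLEV`, `update_level`, `step_naive`, `step_collapsed`) is a hypothetical stand-in chosen for illustration. The first function parallelizes only an existing outer loop and calls a helper marked with the OpenACC `routine` directive; the second collapses the loop nest so that every (column, level) pair can be scheduled as an independent iteration. Without an OpenACC compiler the pragmas are ignored and the code runs serially.

```c
#include <stdio.h>

#define NCOL 4  /* hypothetical number of land columns */
#define NLEV 3  /* hypothetical number of vertical levels */

/* Hypothetical per-level update. The routine directive asks the compiler
   to build a device version so it can be called from GPU code. */
#pragma acc routine seq
static double update_level(double value, double rate) {
    return value * (1.0 - rate);
}

/* Naive style: only the existing outer loop is parallelized, so each
   GPU thread walks all levels of one column sequentially. */
void step_naive(double pool[NCOL][NLEV], double rate) {
    #pragma acc parallel loop copy(pool[0:NCOL][0:NLEV])
    for (int c = 0; c < NCOL; ++c) {
        for (int l = 0; l < NLEV; ++l) {
            pool[c][l] = update_level(pool[c][l], rate);
        }
    }
}

/* Restructured style: the tightly nested loops are collapsed into one
   iteration space of NCOL * NLEV independent iterations. */
void step_collapsed(double pool[NCOL][NLEV], double rate) {
    #pragma acc parallel loop collapse(2) copy(pool[0:NCOL][0:NLEV])
    for (int c = 0; c < NCOL; ++c) {
        for (int l = 0; l < NLEV; ++l) {
            pool[c][l] = update_level(pool[c][l], rate);
        }
    }
}
```

Both functions compute the same result; they differ only in how much parallelism is exposed to the device. The `copy` clauses move the array to and from the GPU on every call, which a real port would likely replace with longer-lived data regions.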