
Developing Ultrahigh-Resolution E3SM Land Model for GPU Systems...

by Peter D Schwartz, Dali Wang, Fengming Yuan, Peter E Thornton
Publication Type
Conference Paper
Book Title
Computational Science and Its Applications – ICCSA 2023
Page Numbers
277 to 290
Publisher Location
Cham, Switzerland
Conference Name
International Conference on Computational Science and Its Applications (ICCSA)
Conference Location
Athens, Greece

Designing and refactoring complex scientific code, such as the E3SM land model (ELM), for new computing architectures is challenging. This paper presents design strategies and technical approaches to develop a data-oriented, GPU-ready ELM model using compiler directives (OpenACC/OpenMP). We first analyze the datatypes and processes in the original ELM code. Then we present design considerations for ultrahigh-resolution ELM (uELM) development for massive GPU systems. These considerations cover the global data-oriented simulation workflow, domain partitioning, code porting and data copy, memory reduction, parallel loop restructuring and flattening, and race condition detection. We implemented the first version of uELM using OpenACC, targeting the NVIDIA GPUs in the Summit supercomputer at Oak Ridge National Laboratory. During the implementation, we developed a software tool (named SPEL) to facilitate code generation, verification, and performance tuning using these techniques. The first uELM implementation for NVIDIA GPUs on Summit delivered promising results: (1) over 98% of the ELM code was automatically generated and tuned by scripts; (2) most ELM modules had better computational performance than the original ELM code for CPUs; and (3) the GPU-ready uELM is more scalable than the CPU code on fully loaded Summit nodes. Example profiling results from several modules are also presented to illustrate the performance improvements and race condition detection. The lessons learned and the toolkit developed in this study are also suitable for further uELM deployment using OpenMP on the first US exascale computer, Frontier, which is equipped with AMD CPUs and GPUs.
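Two of the techniques named in the abstract, loop flattening and race-condition avoidance, can be illustrated with a minimal sketch. It is not taken from uELM or SPEL: ELM itself is written in Fortran, C is used here only for compactness, and every name (`scale_nested`, `scale_flat`, `total`, the column and level counts) is hypothetical. The OpenACC directives are ignored by a compiler that is not invoked with OpenACC support, in which case the code simply runs serially on the CPU.

```c
#include <assert.h>

/* Hypothetical example; names and array layout are assumptions, not ELM code. */

/* Nested form: outer loop over columns, inner loop over soil levels. */
void scale_nested(double *state, int ncols, int nlevs, double factor) {
    assert(ncols >= 0 && nlevs >= 0);
    for (int c = 0; c < ncols; ++c) {
        for (int j = 0; j < nlevs; ++j) {
            state[c * nlevs + j] *= factor;
        }
    }
}

/* Flattened form: a single loop over ncols * nlevs elements, so every
   iteration is exposed to the accelerator as independent work. */
void scale_flat(double *state, int ncols, int nlevs, double factor) {
    assert(ncols >= 0 && nlevs >= 0);
    int n = ncols * nlevs;
    #pragma acc parallel loop copy(state[0:n])
    for (int i = 0; i < n; ++i) {
        state[i] *= factor;
    }
}

/* Summing into one shared scalar from a parallel loop would be a race;
   the reduction clause gives each worker a private partial sum that is
   combined at the end of the loop. */
double total(const double *state, int n) {
    double sum = 0.0;
    #pragma acc parallel loop reduction(+:sum) copyin(state[0:n])
    for (int i = 0; i < n; ++i) {
        sum += state[i];
    }
    return sum;
}
```

Both scaling functions produce the same array contents; the flattened version only changes how the iteration space is presented to the compiler. Whether flattening actually improves performance depends on the loop body and the hardware, so it would need to be confirmed by profiling.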