Publication

Scaling Uintah on the Aurora Exascale System up to 122,880 Intel Ponte Vecchio Xe Stacks

by Marta Garcia, John K Holmen, Martin Berzins
Publication Type
Conference Paper
Book Title
PEARC '25: Practice and Experience in Advanced Research Computing 2025: The Power of Collaboration
Page Numbers
1 to 8
Publisher Location
New York, New York, United States of America
Conference Name
PEARC '25: Practice and Experience in Advanced Research Computing
Conference Location
Columbus, Ohio, United States of America
Conference Sponsor
ACM, SIGAPP, SIGHPC

The challenge of being able to scale application codes based on the Asynchronous Many-Task (AMT) Uintah framework on the Department of Energy (DOE) Aurora exascale system is addressed in this work by considering a challenging Reverse Monte Carlo Ray Tracing radiation benchmark calculation. This benchmark involves potentially global all-to-all communication and uses adaptive mesh refinement and ray tracing to achieve scalability. This benchmark has been used as part of previous scalability studies on a number of pre-exascale systems and on the DOE Frontier exascale system. This paper describes steps taken to enable this benchmark to run successfully on up to 10,240 nodes and 122,880 Intel® Ponte Vecchio Xe stacks on the DOE Aurora exascale system. This scalability was achieved through a limited number of experiments on Aurora, given machine loads and its uniqueness. These experiments constitute valuable lessons learned to achieve scalability at this level. The resulting scalability runs, while few in number, demonstrate relatively good strong-scaling characteristics. A detailed analysis of these results provides important indications about the path to scalability on Aurora for future work. Overall, these results continue the remarkable ability of this AMT approach to produce scalable solutions for challenging problems at extreme scale on heterogeneous architectures.