Publication

Accelerating Flash-X Simulations with Asynchronous I/O...

by Rajeev Jain, Houjun Tang, Akash Dhruv, James A Harris, Suren Byna
Publication Type
Conference Paper
Book Title
2022 IEEE/ACM International Parallel Data Systems Workshop (PDSW)
Publication Date
Page Numbers
13 to 19
Issue
1
Publisher Location
New Jersey, United States of America
Conference Name
7th International Parallel Data Systems Workshop (PDSW 2022)
Conference Location
Dallas, Texas, United States of America
Conference Sponsor
IEEE COMPUTER SOCIETY
Conference Date
-

Most high-fidelity physics simulation codes, such as Flash-X, need to save intermediate results (checkpoint files) so that a run can be restarted or the evolution of the simulation can be examined. These codes save such files synchronously: computation stalls while the data is written to storage. Depending on the problem size and computational requirements, this write time can be a substantial portion of the total simulation time. Asynchronous I/O methods have been introduced to hide the I/O latency of checkpointing. These methods use background threads to perform I/O while the main threads continue with the simulation. The background threads, however, can compete with the application for resources on the node and can interfere with communication. In this paper, we evaluate the overheads and the overall benefit of asynchronous I/O in HDF5 for simulations. Results from real-world high-fidelity simulations on the Summit supercomputer show that I/O operations overlap with application communication, computation, or both, effectively hiding some or all of the I/O latency. Our evaluation shows that although asynchronous I/O adds overhead to the application, the reduction in I/O time is larger, resulting in an overall performance speedup of up to 1.5X.
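
The background-thread idea described in the abstract can be illustrated with a minimal Python sketch. This is a toy under stated assumptions, not the HDF5 asynchronous I/O interface evaluated in the paper and not Flash-X code: the function `run_simulation`, the `checkpoint_NNNN.txt` file naming, and the eight-element state list are all hypothetical stand-ins. The main thread advances the state and hands a copy of it to a writer thread, which performs the file write while the time-stepping loop continues.

```python
import os
import queue
import tempfile
import threading


def run_simulation(num_steps, checkpoint_every, out_dir):
    """Toy time-stepping loop that hands checkpoints to a background writer."""
    pending = queue.Queue()  # snapshots waiting to be written
    written = []             # paths of completed checkpoint files

    def writer():
        # Background thread: take snapshots off the queue and write them.
        while True:
            item = pending.get()
            if item is None:  # sentinel: no more checkpoints will arrive
                break
            step, snapshot = item
            path = os.path.join(out_dir, f"checkpoint_{step:04d}.txt")
            with open(path, "w") as f:
                f.write("\n".join(str(v) for v in snapshot))
            written.append(path)

    thread = threading.Thread(target=writer)
    thread.start()

    state = [0.0] * 8  # hypothetical stand-in for simulation data
    for step in range(1, num_steps + 1):
        state = [v + 1.0 for v in state]  # stand-in for the physics update
        if step % checkpoint_every == 0:
            # Queue a copy so later updates cannot alter the snapshot.
            pending.put((step, list(state)))

    pending.put(None)
    thread.join()  # wait for outstanding writes before returning
    return state, written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as out_dir:
        final_state, files = run_simulation(10, 5, out_dir)
        print(len(files), "checkpoints written")
```

The copy made before queuing is one source of the overhead the abstract mentions: the application pays for an extra buffer and for sharing the node with the writer thread, in exchange for not stalling on the write itself.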