Abstract
The operation of large US Department of Energy (DOE) research facilities, such as the DIII-D National Fusion Facility, results in the collection of complex multi-dimensional scientific datasets, both experimental and model-generated. In the future, it is envisioned that integrated data analysis coupled with large-scale high performance computing (HPC) simulations will be used to improve experimental planning and operation. In practice, massive datasets from these simulations provide the physics basis for generating both reduced semi-analytic and machine-learning-based models. Storage of both HPC simulation datasets (generated at US DOE leadership computing facilities) and experimental datasets presents significant challenges. In this paper, we present a vision for a DOE-wide data management workflow that integrates US DOE fusion facilities with leadership computing facilities. Data persistence and long-term availability beyond the lifetime of allocated projects are essential, particularly for verification and recalibration of artificial intelligence and machine learning (AI/ML) models. Because these datasets are often generated and shared among hundreds of users across multiple leadership computing facilities, they would benefit from cross-platform accessibility, persistent identifiers (e.g. digital object identifiers, DOIs), and provenance tracking. The need to handle different data access patterns suggests that a combination of low-cost, high-latency systems (e.g. for storing ML training sets) and high-cost, low-latency systems (e.g. for real-time, integrated machine control feedback) may be needed.