
A High-level Design for Bidirectional Data Streaming to High-Performance Computing Systems from External Science Facilities

Publication Type
ORNL Report
Publication Date

Cutting-edge science is increasingly data-driven due to the emergence of scientific machine learning models that can guide scientists toward fruitful areas of exploration. Experimental science facilities such as light and neutron sources, particle colliders, and radio astronomy telescopes are also producing raw measurement data at rates that exceed the data storage and computing capacity available at those facilities. As a result, scientific workflows are being developed that concurrently couple experiments at science facilities with high-performance computing (HPC) facilities, so that experimental data can be analyzed while the experiment is ongoing and analysis results can potentially be fed back to the experiment in a time-sensitive manner for experimental control and/or steering. Our goal is to design, prototype, and deploy a new capability for the Oak Ridge Leadership Computing Facility (OLCF) that enables such workflows by supporting bidirectional, memory-based streaming of data from external experiments into and out of OLCF HPC systems. This high-level design document describes the related work and motivating use cases that inform our understanding of the technical requirements for this capability, proposes an architectural solution that meets these requirements, and outlines our plans for demonstrating the capability.