Some Emerging Themes in Data Intensive Sciences in DOE Office of Science

12:00 PM - 01:00 PM
Galen Shipman, Data Systems Architect, CSM Division, Oak Ridge National Laboratory, Oak Ridge
Computational Sciences and Engineering Division
Weinberg Auditorium (Building 4500N, Room I-126)

Email: Raju Vatsavai

Galen Shipman is the Data Systems Architect for the Computing and Computational Sciences Directorate and Director of the Compute and Data Environment for Science at Oak Ridge National Laboratory (ORNL). He is responsible for defining and maintaining an overarching facilities-oriented strategy and infrastructure for data storage, data management, and data analysis, spanning from research and development to integration, deployment, and operations for high-performance and data-intensive computing initiatives at ORNL. His current work includes addressing many of the data challenges of major facilities such as the Spallation Neutron Source (Basic Energy Sciences) and major data centers focusing on Climate Science (Biological and Environmental Research). Prior to assuming his new position, Mr. Shipman was the Technology Integration Group Leader for the National Center for Computational Sciences (NCCS). In this role he led efforts to address gaps within the computational environment in order to improve performance, scalability, and productivity and keep the NCCS systems ahead of the technology curve.

Additional Information

Many scientific domains are increasingly dependent on the ability to efficiently capture, integrate, analyze, and steward large volumes of diverse data. Taking materials science as an example, understanding and ultimately designing advanced new materials with complex properties will require the ability to integrate and analyze data from multiple instruments designed to probe complementary ranges of space, time, and energy. These and many other scientific pursuits require data science capabilities that are often distinct from, but complementary to, the computational science resources provided by modern HPC facilities. Elevating these data science capabilities to the scale and level of operational maturity of today’s HPC facilities holds the promise of creating a new environment for scientific discovery. In this talk I will present a set of exemplar data-intensive science use cases and describe some of the compute and data services upon which they rely. I will then present some of our initial work to operationalize these services here at ORNL.

