Approaching the Final Frontier: Lessons Learned from the Deployment of HPE/Cray EX Spock and Crusher supercomputers. Conference Paper, May 2022
A Step Towards the Final Frontier: Lessons Learned from Acceptance Testing of the First HPE/Cray EX 3000 System at ORNL. Conference Paper, May 2021
GPU Lifetimes on Titan Supercomputer: Survival Analysis and Reliability. Conference Paper, November 2020
US Department of Energy, Office of Science High Performance Computing Facility Operational Assessment 2019 Oak Ridge Leadership Computing Facility. ORNL Report, June 2020
Analyzing a Five-Year Failure Record of a Leadership-Class Supercomputer. Conference Paper, October 2019
The Design, Deployment, and Evaluation of the CORAL Pre-Exascale Systems. Conference Paper, November 2018