Application health monitoring for extreme‐scale resiliency using cooperative fault management... Journal July, 2019
An evaluation of the state of time synchronization on leadership class supercomputers Journal October, 2017
Resilience Design Patterns: A Structured Approach to Resilience at Extreme Scale Journal September, 2017
A New Deadlock Resolution Protocol and Message Matching Algorithm for the Extreme-scale Simulator Journal August, 2016
Scaling To A Million Cores And Beyond: Using Light-Weight Simulation to Understand The Challenges Ahead On The Road To Exasca... Journal January, 2014