Evaluating Performance Portability of Accelerator Programming Models using SPEC ACCEL 1.2 Benchmarks. Conference Paper, July 2018
Machine Learning Models for GPU Error Prediction in a Large Scale HPC System. Conference Paper, June 2018
Understanding and Analyzing Interconnect Errors and Network Congestion on a Large Scale HPC System. Conference Paper, June 2018
SHMEMGraph: Efficient and Balanced Graph Processing Using One-Sided Communication. Conference Paper, May 2018
Pattern-based Modeling of Multiresilience Solutions for High-Performance Computing. Conference Paper, April 2018
Shrink or Substitute: Handling Process Failures in HPC Systems Using In-Situ Recovery. Conference Paper, March 2018