Christian Engelmann
Senior Scientist and Group Leader, Intelligent Systems and Facilities Research

Contact
engelmannc@ornl.gov | 865.574.3132

All Publications
INTERSECT Architecture Specification: Use Case Design Patterns (Version 0.9)
Resilience Design Patterns: A Structured Approach to Resilience at Extreme Scale (Version 2.0)
INTERSECT Architecture Specification: Use Case Design Patterns (Version 0.5)
INTERSECT Architecture Specification: Microservice Architecture (Version 0.5)...
RDPM: An Extensible Tool for Resilience Design Patterns Modelling...
Resiliency in numerical algorithm design for extreme scale simulations...
Study of Interconnect Errors, Network Congestion, and Applications Characteristics for Throttle Prediction on a Large Scale H...
PLEXUS: A Pattern-Oriented Runtime System Architecture for Resilient Extreme-Scale High-Performance Computing Systems...
GPU Lifetimes on Titan Supercomputer: Survival Analysis and Reliability...
Models for Resilience Design Patterns...
3D Coded SUMMA: Communication-Efficient and Robust Parallel Matrix Multiplication...
Self-stabilizing Connected Components...
Concepts for OpenMP Target Offload Resilience...
Performance Efficient Multiresilience Using Checkpoint Recovery In Iterative Algorithms
A Comprehensive Informative Metric for Analyzing HPC System Status Using the LogSCAN Platform...
Analyzing the Impact of System Reliability Events on Applications In the Titan Supercomputer
A Big Data Analytics Framework for HPC Log Data: Three Case Studies Using the Titan Supercomputer Log...
Machine Learning Models for GPU Error Prediction in a Large Scale HPC System...
Understanding and Analyzing Interconnect Errors and Network Congestion on a Large Scale HPC System...
Pattern-based Modeling of Multiresilience Solutions for High-Performance Computing...
Shrink or Substitute: Handling Process Failures in HPC Systems Using In-Situ Recovery...
Pattern-based Modeling of High-Performance Computing Resilience...
Failures in Large Scale Systems: Long-term Measurement, Analysis, and Implications...
Characterizing Temperature, Power, and Soft-Error Behaviors in Data Center Systems: Insights, Challenges, and Opportunities...
Big Data Meets HPC Log Analytics: Scalable Approach to Understanding Systems at Extreme Scale...

Key Links
Curriculum Vitae
Google Scholar
ORCID
LinkedIn
Researcher Website
INTERSECT Initiative

Organizations
Computing and Computational Sciences Directorate
Computer Science and Mathematics Division
Advanced Computing Systems Research Section
Intelligent Systems and Facilities Group