GPU Lifetimes on Titan Supercomputer: Survival Analysis and Reliability. Conference Paper, November 2020.
US Department of Energy, Office of Science High Performance Computing Facility Operational Assessment 2019 Oak Ridge Leadership Computing Facility. ORNL Report, June 2020.
Analyzing a Five-Year Failure Record of a Leadership-Class Supercomputer. Conference Paper, October 2019.
US Department of Energy, Office of Science High Performance Computing Facility Operational Assessment 2018 Oak Ridge Leadership Computing Facility. ORNL Report, May 2019.
Balancing Performance and Portability with Containers in HPC: An OpenSHMEM Example. Conference Paper, January 2019.