Abstract
We present our experience using containers to scale up a massive ensemble of coupled, I/O-bound workloads on the NERSC Cori supercomputer. We describe the design of a hierarchical simulation structure using the Integrated Plasma Simulator (IPS) that enables flexible execution of coupled simulations at the system, node, and core levels using the same coupling abstraction and API. The hierarchical design allows node-level workloads to run efficiently inside containers without affecting the structure of the simulation at the system level. We demonstrate the viability of the approach with experimental results from coupled fusion plasma simulations that illustrate the performance impact of using containers to deploy the node-level workloads, in conjunction with user-mountable XFS file systems that reduce the load on the Lustre parallel file system. We also present results from production runs showing that the ensemble simulations scale to hundreds of Cori Haswell nodes with little or no overhead.