Argonne National Laboratory

Integrated In-System Storage Architecture for High Performance Computing

Title: Integrated In-System Storage Architecture for High Performance Computing
Publication Type: Conference Paper
Year of Publication: 2012
Authors: Kimpe, D; Mohror, K; Moody, A; Van Essen, B; Gokhale, M; Ross, RB; de Supinski, BR
Conference Name: ROSS '12
Other Numbers: ANL/MCS-P2092-0512

In-system solid state storage is expected to be an important component of the I/O subsystem on the first exascale platforms, as it has the potential to reduce DRAM requirements, increase system reliability, and even out I/O load peaks. This paper describes the design of a prototype integrated in-system storage architecture that we are developing to serve the diverse needs of high performance computing. We are developing a container abstraction for lightweight management of in-system storage devices, as well as methods to access containers remotely and transfer them within the storage hierarchy. We are also working on a storage hierarchy abstraction API that gives portable HPC I/O software the critical information it needs about the configuration of the system it is running on. Because currently available large-scale HPC systems lack in-system storage, we are developing a solid state storage simulator backed by DRAM. These efforts are being integrated around an I/O-intensive workload provided by the Scalable Checkpoint/Restart (SCR) library. We hope that, once complete, our efforts will reduce the overheads of checkpointing and data movement across the system and thus improve the scalability and reliability of HPC applications.
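To make the container idea concrete, the following is a minimal illustrative sketch and not the paper's implementation. It assumes a container is a fixed-size named region on a device, models a DRAM-backed simulated device as an in-memory byte buffer per container, and shows a transfer between two levels of a storage hierarchy. All names (`SimulatedDevice`, `Container`, `transfer`, `DeviceFullError`) are hypothetical and do not come from the paper or from SCR.

```python
from dataclasses import dataclass, field


class DeviceFullError(Exception):
    """Raised when a new container would exceed the device capacity."""


@dataclass
class Container:
    """Hypothetical container: a fixed-size, named region of storage."""
    name: str
    size: int
    data: bytearray = field(init=False)

    def __post_init__(self):
        # DRAM stands in for the solid state medium in this sketch.
        self.data = bytearray(self.size)

    def write(self, offset: int, payload: bytes) -> None:
        if offset < 0 or offset + len(payload) > self.size:
            raise ValueError("write outside container bounds")
        self.data[offset:offset + len(payload)] = payload

    def read(self, offset: int, length: int) -> bytes:
        if offset < 0 or length < 0 or offset + length > self.size:
            raise ValueError("read outside container bounds")
        return bytes(self.data[offset:offset + length])


class SimulatedDevice:
    """Hypothetical in-memory stand-in for an in-system storage device."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.containers = {}

    @property
    def used(self) -> int:
        # Space is accounted per container, not per byte written.
        return sum(c.size for c in self.containers.values())

    def create(self, name: str, size: int) -> Container:
        if size <= 0:
            raise ValueError("container size must be positive")
        if name in self.containers:
            raise KeyError(f"container {name!r} already exists")
        if self.used + size > self.capacity:
            raise DeviceFullError(f"no room for {size} bytes")
        container = Container(name, size)
        self.containers[name] = container
        return container

    def remove(self, name: str) -> None:
        del self.containers[name]


def transfer(name: str, src: SimulatedDevice, dst: SimulatedDevice) -> Container:
    """Copy a container to another device, then release it at the source."""
    old = src.containers[name]
    new = dst.create(name, old.size)  # fails before the source is touched
    new.write(0, bytes(old.data))
    src.remove(name)
    return new
```

In this sketch a checkpoint would be written into a container on a small node-local device and later moved with `transfer` to a larger device further down the hierarchy, which frees the node-local space for the next checkpoint.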