|Title||Modular HPC I/O Characterization with Darshan |
|Publication Type||Conference Paper |
|Year of Publication||2016 |
|Authors||Snyder, S, Carns, PH, Harms, K, Ross, R, Lockwood, GK, Wright, N |
|Conference Name||ESPT'16 Proceedings of the 5th Workshop on Extreme-Scale Programming Tools |
|Date Published||11/2016 |
|Publisher||IEEE Press |
|Conference Location||Salt Lake City, Utah |
|Other Numbers||ANL/MCS-P6048-0916 |
|Abstract||Contemporary high-performance computing (HPC) applications encompass a broad range of distinct I/O strategies and are often executed on a number of different compute platforms in their lifetime. These large-scale HPC platforms employ increasingly complex I/O subsystems to provide a suitable level of I/O performance to applications. Tuning I/O workloads for such a system is nontrivial, and the results are generally not portable to other HPC systems. I/O profiling tools can help to address this challenge, but most existing tools instrument only specific components within the I/O subsystem and therefore provide a limited perspective on I/O performance. The increasing diversity of scientific applications and computing platforms calls for greater flexibility and scope in I/O characterization.
In this work, we consider how the I/O profiling tool Darshan can be improved to allow for more flexible, comprehensive instrumentation of current and future HPC I/O workloads. We evaluate the performance and scalability of our design to ensure that it is lightweight enough for full-time deployment on production HPC systems. We also present two case studies illustrating how a more comprehensive instrumentation of application I/O workloads can enable insights into I/O behavior that were not previously possible. Our results indicate that Darshan’s modular instrumentation methods can provide valuable feedback to both users and system administrators, while imposing negligible overheads on user applications.