A Visual Network Analysis Method for Large Scale Parallel I/O Systems
|Publication Type||Conference Paper|
|Year of Publication||2012|
|Authors||Sigovan, C, Muelder, C, Ma, K, Cope, J, Iskra, K, Ross, RB|
|Conference Name||International Parallel and Distributed Processing Symposium (IPDPS 2013)|
Parallel applications rely on I/O to load data, store end results, and protect partial results from being lost to system failure. Parallel I/O performance thus has a direct and significant impact on application performance. Because supercomputer I/O systems are large and complex, their activity traces cannot be analyzed directly. Although several visual or automated analysis tools exist for large-scale HPC log data, analysis research in high-performance computing is geared toward computation performance rather than I/O performance. Additionally, existing methods usually do not capture the network characteristics of HPC I/O systems. We present a visual analysis method for I/O trace data that takes into account the fact that HPC I/O systems can be represented as networks. We illustrate performance metrics in a way that facilitates the identification of abnormal behavior or performance problems. We demonstrate our approach on I/O traces collected from existing systems at different scales.