Publications
F. Isaila, J. G. Blas, J. Carretero, R. Latham, and R. Ross, "Design and Evaluation of Multiple Level Data Staging for Blue Gene Systems," Preprint ANL/MCS-P1789-0910, September 2010. [pdf]
Parallel applications currently suffer from a significant
imbalance between computational power and available I/O
bandwidth. Additionally, the hierarchical organization of current
Petascale systems increases the latency of the I/O subsystem.
In these hierarchies, file access involves pipelining data
through several networks, each adding latency and raising the
probability of congestion. Future Exascale systems are likely to
share this trait.
This paper presents a scalable parallel I/O software system
designed to transparently hide the latency of file system accesses
from applications on these platforms. Our solution takes advantage of the hierarchy of networks involved in file accesses to maximize the degree of overlap between computation, file I/O-related communication, and file system access. We describe and evaluate a two-level hierarchy for Blue Gene systems consisting of client-side and I/O node-side caching. Our file cache management modules coordinate the data staging between application and storage through the Blue Gene networks. The experimental results demonstrate that our architecture achieves significant performance improvements through a high degree of overlap between computation, communication, and file I/O.
