Data Libraries and Services Enabling Exascale Science
A widening gap between computational and I/O performance, deeper memory and storage hierarchies with complex organization, and massive levels of parallelism (e.g., billions of cores) are but a few of the challenges to data management at the exascale. Meeting these challenges will require significant advances in data management so that applications achieve the performance and scalability they need to meet their goals.
In this project, we address these challenges by evolving current best-of-breed data management technologies: ROMIO, Parallel netCDF (PnetCDF), Darshan, and the Mercury Suite. These technologies are core components of the Exascale Computing Project (ECP) software stack, and a large number of ECP applications and software technologies will rely on them. Specifically, we will provide the following capabilities:
ROMIO -- logging and replay library, scalable collective I/O, and pipelined collective I/O
PnetCDF -- interoperability with other libraries such as HDF5 and ADIOS, new dispatcher, and parallel data compression
Darshan -- characterization of non-POSIX I/O, optional fine-grained data collection, coordination with HDF5 efforts on internal monitoring, and coordination with the ADIOS team to validate miniapp behavior
Mercury Suite -- integration with ECP-relevant library and application teams, ensuring Mercury portability across exascale platforms, enhancing testing and performance suites, and identifying and filling functionality gaps
Our exascale problem target is the co-design and development of new mechanisms within core data management software, hardening of these mechanisms, and integration with the ECP software stack to enable applications to meet their performance and scalability requirements.