Integrating Parallel File I/O and Database Support for High-Performance Scientific Data Management

Title: Integrating Parallel File I/O and Database Support for High-Performance Scientific Data Management
Publication Type: Report
Year of Publication: 2000
Authors: No, J, Thakur, R, Choudhary, A
Date Published: 03/2000
Other Numbers: ANL/MCS-P798-0300
Abstract

Many scientific applications have large I/O requirements, in terms of both the size of data and the number of files or data sets. Management, storage, efficient access, and analysis of this data present an extremely challenging task. Traditionally, two different solutions are used for this problem: file I/O or databases. File I/O can provide high performance but is tedious to use with large numbers of files and large and complex data sets. Databases can be convenient, flexible, and powerful but do not perform and scale well for parallel supercomputing applications. We have developed a software system, called Scientific Data Manager (SDM), that combines the good features of both file I/O and databases. SDM provides a thin layer of database-like functionality on top of a high-performance, parallel file-I/O interface (MPI-IO). As a result, users can access data with the convenience of databases and the performance of MPI-IO, without having to bother with the details of either. In this paper, we describe the design and implementation of SDM. With the help of two parallel application templates, ASTRO3D and an Euler solver, we illustrate how some of the design criteria affect performance.

PDF: http://www.mcs.anl.gov/papers/P798.ps.Z