Deploying a High-Performance Filesystem on BGL

If you are looking for results from March 2005, go to pvfs2-200503.html.


This second round of testing used the existing PVFS2 storage volume: 12 storage nodes providing a 1.1 TB PVFS2 volume in aggregate.

Hardware benchmarks

Each storage node has a RAID array for pvfs2 storage. To get some idea of the disk subsystem's performance, here are results from a bonnie++ run on one of the storage nodes (fs2).

Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
fs2             10G 38476  95 49354  14 23415   5 35808  72 63971   5 557.3   0
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  3113  99 +++++ +++ +++++ +++  3138  99 +++++ +++  9620 100


The storage, login, and IO nodes are all running a CVS snapshot of PVFS2 from Feb. 16th. Our BlueLight version is "Driver 100" (DRV100_2005-050311PM). The login and storage nodes run SLES 9.

Items of note



mpi-io-test is a simple MPI-IO contiguous access benchmark. Each process writes a large chunk of data to a non-overlapping, non-interleaved region of a file and then reads it back. It reports the aggregate IO performance of all processes involved in the job. We would expect this benchmark to give an upper bound on IO performance.

We ran mpi-io-test across the entire rack, varying the number of compute nodes as well as the amount of data each process wrote to the PVFS2 file. Read performance topped out at over 1.2 GBytes/sec with 1024 nodes, each process transferring a 32 MB chunk. Write performance with the new ciod is more consistent than in the earlier tests, and our peak write bandwidth of around 325 MBytes/sec is more than twice the peak write bandwidth of the February tests.

[PVFS2 read performance on 1024 nodes] [PVFS2 write performance on 1024 nodes]


In coll_perf, the program writes a 3-dimensional array to a file, with all processes performing collective IO. Data pending.


Not done yet

Last update: Wed Apr 20 23:57:06 CDT 2005
