Publication on I/O workload modeling using Darshan

A publication on modeling HPC I/O workloads using a variety of workload sources, including Darshan logs, has been accepted at PMBS 2015. This work utilizes the CODES exascale storage system simulation toolkit to analyze the merits of using distinct workload sources (I/O traces, synthetic I/O descriptions, and Darshan I/O characterizations) as input both to storage system simulations and to replays of I/O workloads on real HPC systems. Documentation on how to configure the CODES workload generator to use Darshan logs as input is given in the CODES repository.

This post will be updated with a link to the paper once the final version is made available online.

Darshan 3.0.0 pre-release

A pre-release of a new experimental version of Darshan is available for download today! You can download it here.

Darshan 3.0.0 provides much of the same functionality as prior versions, but with the following new enhancements and features:
* hooks for developers to add their own instrumentation module implementations to capture new I/O characterization data
– these instrumentation modules can be used to instrument new I/O interfaces or to gather system-specific parameters, for instance
* modularized log format allows new module-specific utilities to access their I/O characterization data independently
– this new format also allows new counters to be added to existing instrumentation modules without breaking existing utilities
* Darshan logs now contain a mapping of Darshan’s unique record identifiers to full file names, instead of fixed-size file name suffixes
* a new instrumentation module for capturing BG/Q-specific parameters (the BG/Q environment is automatically detected at configure time)
* new darshan-parser and darshan-job-summary output that utilizes the new modularized log format

This version is not yet intended for use in a production environment, so please use it only for testing for now. We have had success in the cross-platform testing we have done so far, but more thorough testing is necessary before we are ready to officially release the new modularized implementation. Issues can be reported directly to the mailing list or via the new issue tracking feature on the Darshan GitLab page (https://xgitlab.cels.anl.gov/darshan/darshan).

As always, we welcome any feedback you may have!

Relevant docs:
– darshan-runtime installation and usage: darshan-runtime
– darshan-util installation and usage: darshan-util
– docs on new modularized architecture and how to add new instrumentation modules: darshan modularization

Darshan repository migration

The Darshan source code, issue tracking, and milestone management have been moved to the Darshan GitLab page at ANL. You can find instructions there for cloning the repository.

If you have an existing clone of the old Darshan repository, then you can connect it to the new repository with the following command:

git remote set-url origin https://xgitlab.cels.anl.gov/darshan/darshan.git
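To confirm the change took effect, the following is a minimal sketch. It is demonstrated in a throwaway repository so that it is self-contained; in an existing clone you would run only the set-url and get-url steps. The old repository URL below is a placeholder, not the real previous location.

```shell
# Create a throwaway repository for demonstration purposes; in practice,
# run the set-url step inside your existing Darshan clone.
tmp=$(mktemp -d)
cd "$tmp"
git init -q demo && cd demo
# Simulate a clone that still points at the old repository
# (placeholder URL, not the real old location):
git remote add origin https://example.org/old/darshan.git
# Point the clone at the new GitLab repository:
git remote set-url origin https://xgitlab.cels.anl.gov/darshan/darshan.git
# Verify that the remote now points at the new URL:
git remote get-url origin
```

Note that `git remote get-url` requires git 2.7 or newer; on older versions, `git remote -v` shows the same information.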

If you are interested in contributing to Darshan, the GitLab page brings several new features to the project, including the ability for external users to sign up for accounts, simplified forking and pull requests, and issue and milestone tracking.

Darshan 2.3.1 release

Darshan version 2.3.1 is now available for download.  This release includes several bug fixes and enhancements, including improvements for Cray, IBM Blue Gene, and Linux cluster environments.   One of the most notable new features is that Darshan now supports the MPICH profiling configuration file method for instrumentation, which simplifies installation on some platforms.  This release also includes a new regression testing framework.
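As a rough sketch of how the profiling configuration file method works: MPICH's `-profile` option (or the `$MPICC_PROFILE` environment variable) selects a small configuration file that injects extra link options, so Darshan can be linked in without generating custom wrapper scripts. The variable names below are MPICH's standard profile-configuration hooks, but the file name, paths, and exact link flags are placeholders; see the darshan-runtime documentation for the files appropriate to your installation.

```shell
# Hypothetical MPICH profile configuration file (e.g. mpicc-darshan.conf),
# selected at compile time with: mpicc -profile=darshan app.c -o app
# Paths and library flags below are placeholders for your Darshan install.
PROFILE_PRELIB="-L/path/to/darshan/lib -ldarshan"
PROFILE_POSTLIB="-lz"
```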

The full changelog since 2.3.0:

darshan-2.3.1
=============
* added documentation and example configuration files for using the -profile or $MPICC_PROFILE hooks to add instrumentation to MPICH-based MPI implementations without generating custom wrapper scripts
* Add wrappers for mkstemp(), mkostemp(), mkstemps(), and mkostemps() (reported by Tom Peterka)
* Change OPEN_TIMESTAMP field to report the timestamp right before open() is invoked rather than the timestamp after open() is completed. NOTE: updated log format version to 2.06 to reflect this change.
* Change start_time and end_time fields in job record to use min and max (respectively) across all ranks
* Fix bug in write volume data reported in file system table in darshan-job-summary.pl (reported by Matthieu Dorier)
* Clean up autoconf test for zlib and make zlib mandatory (reported by Kalyana Chadalavada)
* add --start-group and --end-group notation to Darshan libraries for Cray PE 2.x environment to fix link-time corner cases (Yushu Yao)
* improve y axis labels on time interval graphs in darshan-job-summary.pl (reported by Tom Peterka)
* misc. improvements to darshan-parser --perf output (reported by Shane Snyder)
– indicate which rank was slowest in unique file results
– label I/O vs. meta time more clearly
– include unique file meta time in agg_perf_by_slowest calculation
* added regression test script framework in darshan-test/regression/
– currently supported platforms include:
– Linux with static linking and generated compiler wrappers
– Linux with static linking and profiler configuration files
– Linux with dynamic linking and LD_PRELOAD
– Blue Gene/Q with static linking and profiler configuration files
* update darshan-gen-fortran.pl and darshan-gen-cxx.pl to support new library naming conventions in MPICH 3.1.1 and higher
* update documentation to reflect known issues with some versions of MPICH
* Cray platforms: modify darshan-runtime so that link-time instrumentation options are only used when statically linking via Libs.private. (reported by Kalyana Chadalavada)

Darshan 2.3.1-pre2 experimental release

Darshan 2.3.1-pre2 is now available on the download page for testing and feedback.  The most significant changes since 2.3.1-pre1 are better support for recent MPICH releases, examples and documentation for the -profile hooks in MPICH, and the addition of a regression testing framework.  Our plan is to feature freeze at this point and focus on platform testing for the final 2.3.1 release.

The full changelog since 2.3.0 is as follows:

darshan-2.3.1-pre2
==================
* added documentation and example configuration files for using the -profile or $MPICC_PROFILE hooks to add instrumentation to MPICH-based MPI implementations without generating custom wrapper scripts
* Add wrappers for mkstemp(), mkostemp(), mkstemps(), and mkostemps() (reported by Tom Peterka)
* Change OPEN_TIMESTAMP field to report the timestamp right before open() is invoked rather than the timestamp after open() is completed. NOTE: updated log format version to 2.06 to reflect this change.
* Change start_time and end_time fields in job record to use min and max (respectively) across all ranks
* Fix bug in write volume data reported in file system table in darshan-job-summary.pl (reported by Matthieu Dorier)
* Clean up autoconf test for zlib and make zlib mandatory (reported by Kalyana Chadalavada)
* add --start-group and --end-group notation to Darshan libraries for Cray PE 2.x environment to fix link-time corner cases (Yushu Yao)
* improve y axis labels on time interval graphs in darshan-job-summary.pl (reported by Tom Peterka)
* misc. improvements to darshan-parser --perf output (reported by Shane Snyder)
– indicate which rank was slowest in unique file results
– label I/O vs. meta time more clearly
– include unique file meta time in agg_perf_by_slowest calculation
* added regression test script framework in darshan-test/regression/
– workstation-static and workstation-dynamic test environments supported
* update darshan-gen-fortran.pl and darshan-gen-cxx.pl to support new library naming conventions in MPICH 3.1.1 and higher
* update documentation to reflect known issues with some versions of MPICH

Darshan 2.3.1-pre1 experimental release

Darshan 2.3.1-pre1 is now available for download, and the release changelog is listed below. Please let us know if you have any feedback or suggestions. We’ll be working to turn this into a stable release in the coming weeks.

darshan-2.3.1-pre1
==================

  • Add wrappers for mkstemp(), mkostemp(), mkstemps(), and mkostemps() (reported by Tom Peterka)
  • Change OPEN_TIMESTAMP field to report the timestamp right before open() is invoked rather than the timestamp after open() is completed.
    NOTE: updated log format version to 2.06 to reflect this change.
  • Change start_time and end_time fields in job record to use min and max (respectively) across all ranks
  • Fix bug in write volume data reported in file system table in darshan-job-summary.pl (reported by Matthieu Dorier)
  • Clean up autoconf test for zlib and make zlib mandatory (reported by Kalyana Chadalavada)
  • add --start-group and --end-group notation to Darshan libraries for Cray PE 2.x environment to fix link-time corner cases (Yushu Yao)
  • improve y axis labels on time interval graphs in darshan-job-summary.pl (reported by Tom Peterka)
  • misc. improvements to darshan-parser --perf output (reported by Shane Snyder)
    • indicate which rank was slowest in unique file results
    • label I/O vs. meta time more clearly
    • include unique file meta time in agg_perf_by_slowest calculation

Upcoming Darshan events, Fall 2014

  • October 26, 2014, Raleigh NC: “Darshan – I/O Workload Characterization for MPI Applications” tutorial to be held at IISWC 2014.  The presenters are Yushu Yao of Lawrence Berkeley National Laboratory and Phil Carns of Argonne National Laboratory.  For more information see the tutorial website (http://www.mcs.anl.gov/research/projects/darshan/tutorials/iiswc2014/) or the conference website (http://www.iiswc.org/iiswc2014/index.html).
  • November 20, 2014, New Orleans LA: “Analyzing Parallel I/O” BOF at SC 2014.  This BOF will include a discussion of multiple I/O instrumentation tools, including Darshan.  The session leaders are Julian Kunkel (German Climate Computing Center), Phil Carns (Argonne National Laboratory), and Alvaro Aguilera (Technical University Dresden).  For more information see the SC BOF website (http://sc14.supercomputing.org/schedule/event_detail?evid=bof121).