Darshan version 3.3.0 is now available!

Following up on our recent pre-releases, a new stable release of Darshan 3.3.0 is now available for download. You can get it HERE.

In addition to the new features and bug fixes introduced in 3.3.0 pre-releases, this release marks the first Darshan version with AutoPerf support. AutoPerf implements two additional Darshan instrumentation modules that can provide details on application MPI communication usage and application performance characteristics on Cray XC platforms:

  • APMPI: Instrumentation of over 70 MPI-3 communication routines, providing operation counts, datatype sizes, and timing information for each application MPI rank.
  • APXC: Instrumentation of Cray XC environments to provide network and compute counters of interest, via PAPI.

See darshan-runtime documentation for more details on how to build Darshan with AutoPerf support.

Please report any issues, comments, or questions to us using the Darshan-users mailing list or our GitLab page.

Darshan version 3.3.0-pre2 is now available

We are happy to announce a new pre-release for Darshan 3.3.0 (darshan-3.3.0-pre2). You can download the source HERE.

This release contains a number of new features, bug fixes, and other improvements as detailed below:

  • New PyDarshan Python package for analyzing Darshan log files
    • PyDarshan provides a couple of interfaces to Darshan logs that should allow for easier development of custom Darshan log analysis utilities in Python
    • See the PyDarshan documentation for more details
    • Thanks to Jakob Luettgau (DKRZ) for all of the hard work in contributing this package
  • Bug fixes
    • Modified Lustre module to use a safer method for obtaining Lustre file striping information (based on fgetxattr rather than ioctl)
    • Fixed bug leading to potential deadlock when reducing shared records in MPI programs (known to affect mvapich2)
    • Fixed bug causing errors when using Darshan’s non-MPI mode when Darshan is built with an MPI compiler
    • Disabled DXT’s MPI-IO offset tracking for OpenMPI applications to avoid crashes caused by an OpenMPI bug
    • Fixed various HDF5 module bugs:
      • Fixes for applications using H5S_SELECT_NONE selections resulting in HDF5 error messages
      • Fixes for applications using non-MPIIO VFDs resulting in HDF5 error messages
      • Fixes for potentially incorrect counter values related to common accesses in the H5D module
      • Other fixes allowing usage of the HDF5 modules in serial HDF5 applications
  • Other enhancements
    • Added support for querying Lustre file striping statistics for Lustre files that are symlinked from other file systems
    • Added support for instrumenting openat, preadv, preadv2, pwritev, and pwritev2 functions, improving instrumentation of OpenMPI applications
    • Improved error messages and documentation for darshan-util tools, including handling of incomplete Darshan log files
    • Added new H5D module counter indicating the Darshan record ID of the file an HDF5 dataset belongs to

As always, please report any issues, comments, or questions to us using the Darshan-users mailing list or our GitLab page.

Darshan 3.2.1 bugfix release available

Due to a reported bug in last week’s 3.2.0 release of Darshan, we have decided to quickly release Darshan 3.2.1 for our users. It is available for download here.

This bugfix is somewhat critical, particularly in production environments, as the bug can lead to corrupted Darshan log file data and, potentially, application crashes (though we have not triggered any crashes in our testing). The issue was originally detected by noticing bogus values in the COMMON_ACCESS counters reported by the POSIX, MPIIO, and H5 modules.

In any case, we highly recommend any 3.2.0 users upgrade to this version to avoid any potential for crashes or corrupted Darshan log file data.

Please report any additional questions, issues, or concerns using the Darshan-users mailing list, or by opening an issue on the Darshan GitLab page.

Darshan version 3.2.0 is now officially available

Darshan 3.2.0 is now available for download here.

This release contains a number of new features, bug fixes, and other changes to Darshan. Some of the more notable changes that may be of interest to users:

  • Added detailed instrumentation of HDF5 file (H5F) and dataset (H5D) interfaces.
    • Must be explicitly enabled by passing “--enable-hdf5-mod=/path/to/hdf5/install” when configuring Darshan.
    • Due to ABI incompatibility from HDF5 version 1.8.x -> 1.10.x, special care must be taken to ensure users do not link applications with HDF5 versions that are incompatible with the version the Darshan library was built with (i.e., both HDF5 library versions must be either >=1.10 or <1.10). Using two incompatible HDF5 versions will lead to either link or runtime failures.
    • Support only intended for HDF5 versions 1.8.0+.
  • Added new feature allowing for instrumentation of non-MPI applications.
    • Darshan no longer strictly requires that instrumented applications use MPI, extending coverage to a breadth of new contexts.
    • Note that this feature is only functional in dynamic linking use cases.
    • Thanks to Glenn Lockwood (NERSC) for his help in implementing/testing this feature.
  • Added MPI-IO offset information to Darshan’s DXT tracing mechanism.
  • Updated Darshan compiler wrappers and Cray software modules to transparently and uniformly support dynamic and static linking cases. These methods previously only supported static linking use cases.
  • Re-implemented Darshan’s PMPI/MPI wrappers to help avoid deadlock with other monitoring tools that rely on PMPI.
  • Added new “--log-path” option to darshan-config utility to allow users to more easily query the directory Darshan logs are stored in.
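
As a rough illustration of the new darshan-config option, the following Python sketch shells out to the utility and returns the log directory it prints. The helper name `query_darshan_log_path` is hypothetical, and the sketch assumes darshan-config is on your PATH and prints the directory on standard output; it returns None when the tool is missing or fails.

```python
import shutil
import subprocess
from typing import Optional


def query_darshan_log_path(tool: str = "darshan-config") -> Optional[str]:
    """Return the directory printed by `<tool> --log-path`, or None if unavailable.

    Hypothetical helper: assumes the tool prints the log directory on stdout.
    """
    exe = shutil.which(tool)
    if exe is None:
        return None  # the utility is not installed or not on PATH

    result = subprocess.run(
        [exe, "--log-path"], capture_output=True, text=True, check=False
    )
    if result.returncode != 0:
        return None  # the utility ran but reported an error

    # Strip the trailing newline; treat empty output as "unknown".
    return result.stdout.strip() or None
```

A script could call `query_darshan_log_path()` and fall back to a site-specific default when it gets None back.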

Please review darshan-runtime and darshan-util documentation for more details on the new HDF5 instrumentation module and the experimental non-MPI instrumentation mechanism. Additionally, consult the ChangeLog in the top-level of the source for a full list of changes associated with this release.
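
The HDF5 compatibility rule described above (both library versions must sit on the same side of 1.10) can be expressed as a small check. This is only a sketch: the function names are hypothetical, and it assumes plain dotted numeric version strings such as "1.10.6".

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '1.10.6' into (1, 10, 6). Only numeric dotted versions are handled."""
    return tuple(int(part) for part in version.split("."))


def hdf5_abi_compatible(darshan_hdf5: str, app_hdf5: str) -> bool:
    """True if both versions are >= 1.10 or both are < 1.10.

    darshan_hdf5: HDF5 version the Darshan library was built with.
    app_hdf5:     HDF5 version the application links against.
    """
    threshold = (1, 10)
    darshan_is_new = parse_version(darshan_hdf5)[:2] >= threshold
    app_is_new = parse_version(app_hdf5)[:2] >= threshold
    return darshan_is_new == app_is_new


# Mixing 1.8.x with 1.10.x is the case that leads to link or runtime failures.
mixed_ok = hdf5_abi_compatible("1.8.21", "1.10.6")
```

A check like this could run in a site's build scripts before linking an application against a Darshan library built with HDF5 support.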

Note that we are currently aware of and looking into a couple of issues related to Lustre file systems that have been reported by Darshan users:

  • Crashes in Darshan’s Lustre module in newer Lustre versions (2.11.x in one reported case). Typically results in additional errors stating: “using old ioctl(LL_IOC_LOV_GETSTRIPE)”.
    • If you experience this problem with Darshan, a temporary workaround would be to just disable the Lustre module. This can only be done at configure time by passing “--disable-lustre-mod”.
  • Floating point exceptions or other warnings related to dividing by zero when writing Darshan log to a Lustre file system (at Darshan shutdown time).
    • We are still working out what combinations of MPI and Lustre libraries exhibit this problem, but a simple workaround for the time being is to run the command “export DARSHAN_LOGHINTS=” before running your application.
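
If your application is launched from a script rather than an interactive shell, the DARSHAN_LOGHINTS workaround can be applied per launch. The sketch below sets the variable to an empty string only for the child process; the helper name is hypothetical, and the child command is a stand-in that simply echoes the variable, to be replaced with your own application.

```python
import os
import subprocess
import sys


def run_with_empty_loghints(command):
    """Run `command` with DARSHAN_LOGHINTS set to an empty string."""
    env = dict(os.environ)
    env["DARSHAN_LOGHINTS"] = ""  # mirrors `export DARSHAN_LOGHINTS=`
    return subprocess.run(command, env=env, capture_output=True, text=True)


# Stand-in child process (hypothetical): prints the value it inherited.
child = [
    sys.executable,
    "-c",
    "import os; print(repr(os.environ.get('DARSHAN_LOGHINTS')))",
]
result = run_with_empty_loghints(child)
```

Setting the variable only in the child's environment leaves the rest of the session untouched, which may be preferable on shared login nodes.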

We hope to resolve these bugs quickly and intend to release an updated version of Darshan once they are.

Please report any additional questions, issues, or concerns using the Darshan-users mailing list, or by opening an issue on the Darshan GitLab page.

New experimental version of Darshan available for instrumenting non-MPI applications

An experimental pre-release of Darshan is now available that enables instrumentation of non-MPI workloads. It can be downloaded here. It is NOT recommended to use this version in production until we have had more time for users to test it.

See the darshan-runtime documentation (located in darshan-runtime/docs from the top-level Darshan repo) for more information on how to build Darshan without MPI support and also how to enable non-MPI instrumentation at application runtime.

Note that this instrumentation method only works on dynamically-linked executables — Darshan still does not support instrumentation of statically-linked non-MPI executables.
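
A minimal sketch of what launching a dynamically-linked, non-MPI program under Darshan might look like follows. It assumes, based on our reading of the darshan-runtime documentation, that the Darshan shared library is injected with LD_PRELOAD and that non-MPI mode is switched on with the DARSHAN_ENABLE_NONMPI environment variable; check the documentation for your version. The helper name and the library path are hypothetical.

```python
import os
from typing import Dict, Optional


def nonmpi_darshan_env(
    libdarshan_path: str, base_env: Optional[Dict[str, str]] = None
) -> Dict[str, str]:
    """Build an environment for running a dynamically-linked program under Darshan.

    Assumptions (verify against the darshan-runtime docs for your version):
      * LD_PRELOAD injects the Darshan shared library.
      * DARSHAN_ENABLE_NONMPI enables instrumentation outside of MPI.
    """
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # Put Darshan first and keep any libraries that were already preloaded.
    env["LD_PRELOAD"] = f"{libdarshan_path} {existing}".strip()
    env["DARSHAN_ENABLE_NONMPI"] = "1"
    return env


# Hypothetical install location; substitute your own libdarshan.so path.
env = nonmpi_darshan_env("/opt/darshan/lib/libdarshan.so", {"PATH": "/usr/bin"})
```

The resulting dictionary can be passed as the `env` argument of `subprocess.run` when starting the program to be instrumented.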

We encourage users that are interested in characterizing I/O in non-MPI contexts to try out this new functionality and let us know about any issues or questions you might have! Depending on user experience, we will try to get a release of this software suitable for production deployment soon.

Darshan at SC19 recap

In case you missed any of it, here’s a list of Darshan-related activities from SC that may be of interest to the community:

Darshan version 3.1.8 now available

Darshan 3.1.8 is now available for download here.

This release introduces a new trace triggering mechanism that allows users to specify triggers that dictate which files are traced using Darshan’s tracing module, DXT. Users just need to provide Darshan a configuration file describing the triggers and Darshan will decide at runtime which files to store trace data for. Types of triggers include file- and rank-based triggers (based on regex patterns), as well as file access characteristics triggers (to trace based on frequency of small or unaligned I/O accesses). Please refer to darshan-runtime documentation on the DXT module for more details.
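
To make the trigger idea concrete, here is a toy Python model of the decision described above. It is not Darshan's implementation or its configuration file syntax: the class, its field names, and the rule that any single matching trigger enables tracing are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TraceTriggers:
    """Hypothetical model of file-, rank-, and access-based trace triggers."""

    file_patterns: List[str] = field(default_factory=list)  # regexes on file paths
    rank_patterns: List[str] = field(default_factory=list)  # regexes on rank numbers
    small_io_threshold: Optional[float] = None  # fraction of small accesses

    def should_trace(self, path: str, rank: int, small_io_fraction: float = 0.0) -> bool:
        # Assumption: a file is traced as soon as any one trigger matches.
        if any(re.search(p, path) for p in self.file_patterns):
            return True
        if any(re.fullmatch(p, str(rank)) for p in self.rank_patterns):
            return True
        if (
            self.small_io_threshold is not None
            and small_io_fraction > self.small_io_threshold
        ):
            return True
        return False


triggers = TraceTriggers(
    file_patterns=[r"\.h5$"],   # trace HDF5 files
    rank_patterns=[r"[1-2]"],   # trace ranks 1 and 2
    small_io_threshold=0.5,     # trace files dominated by small accesses
)
```

With these example triggers, `triggers.should_trace("/data/out.h5", 7)` is true because of the file pattern, while a text file accessed by rank 7 with few small accesses would not be traced.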

Note that full tracing is disabled by default in Darshan and this release does not change that — this is just a mechanism to allow DXT users more control over tracing.

Please report any questions, issues, or concerns using the Darshan-users mailing list, or by opening an issue on the Darshan GitLab page.

Darshan 3.1.7 release is now available

Darshan version 3.1.7 is now available for download HERE! This version fixes a few bugs in the prior Darshan release and also contains a couple of new features:

  • Bug fix in handling of DXT module data in the darshan-convert utility
    • Reported by Mahzad Khoshlessan
  • Bug fix in darshan-parser backwards compatibility: Darshan logs generated by Darshan versions prior to 3.1.0 may have included STDIO counters that were not properly up-converted
    • Reported by Teng Wang
  • Bug fix to MiB reported in I/O performance estimate of darshan-job-summary when both POSIX and STDIO data present
    • Reported/fixed by Glenn Lockwood
  • Added Darshan wrapper for ‘__open_2()’ call, needed for properly instrumenting open operations with some versions of gcc/glibc
    • Reported by Cormac Garvey
  • Added an instrumentation module for the MDHIM key/val storage system
  • Added support for properly handling ‘rename()’, ‘dup()’, ‘fileno()’, and ‘fdopen()’ operations in Darshan

Please report any questions, issues, or concerns using the Darshan-users mailing list, or by opening an issue on the Darshan GitLab page.