Darshan-util installation and usage


Table of Contents

1. Introduction
2. Requirements
3. Compilation and installation
4. Analyzing log files
4.1. darshan-job-summary.pl
4.2. darshan-summary-per-file.sh
4.3. darshan-parser
4.4. Guide to darshan-parser output
4.5. Other command line utilities

1. Introduction

This document describes darshan-util, a collection of tools for parsing and summarizing log files produced by Darshan instrumentation. The darshan-util package can be installed and used on any system regardless of where the logs were originally generated. Darshan log files are platform-independent.

More information about Darshan can be found at the Darshan web site.

2. Requirements

Darshan-util has only been tested in Linux environments, but will likely work in other Unix-like environments as well.

Hard requirements

  • C compiler
  • zlib development headers and library (zlib-dev or similar)

Optional requirements

  • libbz2 development headers and library (libbz2-dev or similar)
  • Perl
  • pdflatex
  • gnuplot 4.2 or later
  • epstopdf

3. Compilation and installation

Configure and build example. 

tar -xvzf darshan-<version-number>.tar.gz
cd darshan-<version-number>/darshan-util
./configure
make
make install

You can specify --prefix to install darshan-util in a specific location (such as in your home directory for non-root installations). See ./configure --help for additional optional arguments, including how to specify alternative paths for zlib and libbz2 development libraries. darshan-util also supports VPATH or "out-of-tree" builds if you prefer that method of compilation.

4. Analyzing log files

Each time a darshan-instrumented application is executed, it will generate a single log file summarizing the I/O activity from that application. See the darshan-runtime documentation for more details, but the log file for a given application will likely be found in a centralized directory, with the path and log file name in the following format:

<YEAR>/<MONTH>/<DAY>/<USERNAME>_<BINARY_NAME>_<JOB_ID>_<DATE>_<UNIQUE_ID>_<TIMING>.darshan.gz

This is a binary format file that summarizes I/O activity. As of version 2.0.0 of Darshan, this file is portable and does not have to be analyzed on the same system that executed the job.

4.1. darshan-job-summary.pl

You can generate a graphical summary of the I/O activity for a job by using the darshan-job-summary.pl graphical summary tool as in the following example:

darshan-job-summary.pl carns_my-app_id114525_7-27-58921_19.darshan.gz

This utility requires Perl, pdflatex, epstopdf, and gnuplot in order to generate its summary. By default, the output is written to a multi-page pdf file based on the name of the input file (in this case it would produce a carns_my-app_id114525_7-27-58921_19.pdf output file). You can also manually specify the name of the output file using the --output argument.

An example of the output produced by darshan-job-summary.pl can be found at http://www.mcs.anl.gov/research/projects/darshan/files/2012/06/pcarns_mpi-io-test_id3406_6-7-47644-13333843235489639491_1.pdf .

4.2. darshan-summary-per-file.sh

This utility is similar to darshan-job-summary.pl, except that it produces a separate pdf summary for every file accessed by an application. It can be executed as follows:

darshan-summary-per-file.sh carns_my-app_id114525_7-27-58921_19.darshan.gz output-dir

The second argument is the name of a directory (to be created) that will contain the collection of pdf files. Note that this utility probably is not appropriate if your application opens a large number of files.

If you would like to produce a summary for a single specific file, then you can run the following command to produce a quick list of the files opened by an application and the amount of time spent performing I/O to each of them:

darshan-parser --file-list carns_my-app_id114525_7-27-58921_19.darshan.gz

Once you have identified a specific file of interest, then you can produce a summary for that specific file with the following commands:

darshan-convert --file HASH carns_my-app_id114525_7-27-58921_19.darshan.gz interesting_file.darshan.gz
darshan-job-summary.pl interesting_file.darshan.gz

The "HASH" argument is the hash of a file name as listed in the darshan-parser --file-list output. The interesting_file.darshan.gz file produced by darshan-convert is like a normal Darshan log file, but it will only contain instrumentation for the specified file.

4.3. darshan-parser

In order to obtained a full, human readable dump of all information contained in a log file, you can use the darshan-parser command line utility. It does not require any additional command line tools. The following example essentially converts the contents of the log file into a fully expanded text file:

darshan-parser carns_my-app_id114525_7-27-58921_19.darshan.gz > ~/job-characterization.txt

The format of this output is described in the following section.

4.4. Guide to darshan-parser output

The beginning of the output from darshan-parser displays a summary of overall information about the job. The following table defines the meaning of each line:

output line description

"# darshan log version"

internal version number of the Darshan log file

"# size of file statistics"

uncompressed size of each file record in the binary log file

"# size of job statistics"

uncompressed size of the overall job statistics in the binary log file

"# exe"

name of the executable that generated the log file

"# uid"

user id that the job ran as

"# jobid"

job id from the scheduler

"# start_time"

start time of the job, in seconds since the epoch

"# start_time_asci"

start time of the job, in human readable format

"# end_time"

end time of the job, in seconds since the epoch

"# end_time_asci"

end time of the job, in human readable format

"# nprocs"

number of MPI processes

"# run time"

run time of the job in seconds

Table of mounted file systems

The next portion of the output shows a table of all general purpose file systems that were mounted while the job was running. Each line uses the following format:

<device> <mount point> <fs type>

The device field is the device ID as reported by the stat() system call. Note that this device ID may change if the node is rebooted or the file system is remounted.

Format of I/O characterization fields

The remainder of the output will show characteristics for each file that was opened by the application. Each line uses the following format:

<rank> <file name hash> <counter name> <counter value> <file name suffix> <mount point> <fs type>

The <rank> column indicates the rank of the process that opened the file. A rank value of -1 indicates that all processes opened the same file. In that case, the value of the counter represents an aggregate across all processes. The <file name hash> is a 64 bit hash of the file path/name that was opened. It is used as a way to uniquely differentiate each file. The <counter name> is the name of the statistic that the line is reporting, while the <counter value> is the value of that statistic. A value of -1 indicates that Darshan was unable to collect statistics for that particular counter, and the value should be ignored. The <file name suffix> shows the last 11 characters of the file name. The <mount point> is the mount point of the file system that this file belongs to. The <fs type> is the type of file system.

I/O characterization fields

The following table shows a list of integer statistics that are available for each file, along with a description of each. Unless otherwise noted, counters include all variants of the call in question, such a read(), pread(), and readv() for CP_POSIX_READS.

output line description

CP_POSIX_READS

Count of POSIX read operations

CP_POSIX_WRITES

Count of POSIX write operations

CP_POSIX_OPENS

Count of how many times the file was opened

CP_POSIX_SEEKS

Count of POSIX seek operations

CP_POSIX_STATS

Count of POSIX stat operations

CP_POSIX_MMAPS

Count of POSIX mmap operations

CP_POSIX_FREADS

Count of stream read operations

CP_POSIX_FWRITES

Count of stream write operations

CP_POSIX_FOPENS

Count of stream open operations

CP_POSIX_FSEEKS

Count of stream seek operations

CP_POSIX_FSYNCS

Count of fsync operations

CP_POSIX_FDSYNCS

Count of fdatasync operations

CP_INDEP_OPENS

Count of non-collective MPI opens

CP_COLL_OPENS

Count of collective MPI opens

CP_INDEP_READS

Count of non-collective MPI reads

CP_INDEP_WRITES

Count of non-collective MPI writes

CP_COLL_READS

Count of collective MPI reads

CP_COLL_WRITES

Count of collective MPI writes

CP_SPLIT_READS

Count of MPI split collective reads

CP_SPLIT_WRITES

Count of MPI split collective writes

CP_NB_READS

Count of MPI non-blocking reads

CP_NB_WRITES

Count of MPI non-blocking writes

CP_SYNCS

Count of MPI file syncs

CP_INDEP_NC_OPENS

Count of independent Parallel NetCDF opens

CP_COLL_NC_OPENS

Count of collective Parallel NetCDF opens

CP_HDF5_OPENS

Count of HDF5 opens

CP_COMBINER_*

Count of each type of MPI datatype (both in memory and in file)

CP_HINTS

Count of MPI file hints used

CP_VIEWS

Count of MPI file views used

CP_MODE

Mode that the file was last opened in

CP_BYTES_READ

Total number of bytes that were read from the file

CP_BYTES_WRITTEN

Total number of bytes written to the file

CP_MAX_BYTE_READ

Highest offset in the file that was read

CP_MAX_BYTE_WRITTEN

Highest offset in the file that was written

CP_CONSEC_READS

Number of consecutive reads (that were immediately adjacent to the previous access)

CP_CONSEC_WRITES

Number of consecutive writes (that were immediately adjacent to the previous access)

CP_SEQ_READS

Number of sequential reads (at a higher offset than where the previous access left off)

CP_SEQ_WRITES

Number of sequential writes (at a higher offset than where the previous access left off)

CP_RW_SWITCHES

Number of times that access toggled between read and write in consecutive operations

CP_MEM_NOT_ALIGNED

Number of times that a read or write was not aligned in memory

CP_MEM_ALIGNMENT

Memory alignment value (chosen at compile time)

CP_FILE_NOT_ALIGNED

Number of times that a read or write was not aligned in file

CP_FILE_ALIGNMENT

File alignment value. This value is detected at runtime on most file systems. On Lustre, however, Darshan assumes a default value of 1 MiB for optimal file alignment.

CP_MAX_READ_TIME_SIZE

Size of the slowest POSIX read operation

CP_MAX_WRITE_TIME_SIZE

Size of the slowest POSIX write operation

CP_SIZE_READ_*

Histogram of read access sizes at POSIX level

CP_SIZE_READ_AGG_*

Histogram of total size of read accesses at MPI level, even if access is noncontiguous

CP_EXTENT_READ_*

Histogram of read extents

CP_SIZE_WRITE_*

Histogram of write access sizes at POSIX level

CP_SIZE_WRITE_AGG_*

Histogram of total size of write accesses at MPI level, even if access is noncontiguous

CP_EXTENT_WRITE_*

Histogram of write extents

CP_STRIDE[1-4]_STRIDE

Size of 4 most common stride patterns

CP_STRIDE[1-4]_COUNT

Count of 4 most common stride patterns

CP_ACCESS[1-4]_ACCESS

4 most common access sizes

CP_ACCESS[1-4]_COUNT

Count of 4 most common access sizes

CP_DEVICE

File system identifier; correlates with mount table shown earlier. In Darshan 2.2.5 and earlier, this is the device ID reported by stat(), in Darshan 2.2.6 and later, this is an opaque identifier generated by Darshan.

CP_SIZE_AT_OPEN

Size of file at first open time

CP_FASTEST_RANK

The MPI rank of the rank with smallest time spent in I/O

CP_FASTEST_RANK_BYTES

The number of bytes transferred by the rank with smallest time spent in I/O

CP_SLOWEST_RANK

The MPI rank of the rank with largest time spent in I/O

CP_SLOWEST_RANK_BYTES

The number of bytes transferred by the rank with the largest time spent in I/O

The following is a list of floating point statistics that are available for each file:

output line description

CP_F_OPEN_TIMESTAMP

Timestamp of first time that the file was opened

CP_F_CLOSE_TIMESTAMP

Timestamp of the last time that the file was closed

CP_F_READ_START_TIMESTAMP

Timestamp that the first read operation began

CP_F_READ_END_TIMESTAMP

Timestamp that the last read operation ended

CP_F_WRITE_START_TIMESTAMP

Timestamp that the first write operation begin

CP_F_WRITE_END_TIMESTAMP

Timestamp that the last write operation ended

CP_F_POSIX_READ_TIME

Cumulative time spent reading at the POSIX level

CP_F_POSIX_WRITE_TIME

Cumulative time spent in write, fsync, and fdatasync at the POSIX level

CP_F_POSIX_META_TIME

Cumulative time spent in open, close, stat, and seek at the POSIX level

CP_F_MPI_META_TIME

Cumulative time spent in open and close at the MPI-IO level

CP_F_MPI_READ_TIME

Cumulative time spent reading at the MPI-IO level

CP_F_MPI_WRITE_TIME

Cumulative time spent write and sync at the MPI-IO level

CP_F_MAX_READ_TIME

Duration of the slowest individual POSIX read operation

CP_F_MAX_WRITE_TIME

Duration of the slowest individual POSIX write operation

CP_F_FASTEST_RANK_TIME

The time of the rank which had the smallest amount of time spent in I/O (CP_F_POSIX_READ_TIME + CP_F_POSIX_WRITE_TIME + CP_F_POSIX_META_TIME)

CP_F_SLOWEST_RANK_TIME

The time of the rank which had the largest amount of time spent in I/O

CP_F_VARIANCE_RANK_TIME

The population variance for I/O time of all the ranks

CP_F_VARIANCE_RANK_BYTES

The population variance for bytes transferred of all the ranks

Additional summary output

Performance

Use the --perf option to get performance approximations using four different computations.

  • agg_perf_by_slowest: Uses the slowest time over all ranks (most accurate but requires newer log version)
  • agg_perf_by_cumul: Sums time counters and divides by nprocs (inaccurate if lots of variance between procs)
  • agg_perf_by_open: The difference between timestamp of open and close (inaccurate if file is left open without i/o happening)
  • agg_perf_by_open_lastio: The difference between timestamp of open and the timestamp of last i/o (similar to above but fixes case where file is left open after io is complete)

Example output. 

# performance
# -----------
# total_bytes: 4260544914638
# slowest_rank_time: 100.615304
# slowest_rank_meta_time: 85.642247
# shared_time_by_cumul: 55.306259
# shared_time_by_open: 65.125876
# shared_time_by_open_lastio: 65.027973
# shared_meta_time: 3.002117
# agg_perf_by_cumul: 26059.078568
# agg_perf_by_open: 24515.164431
# agg_perf_by_open_lastio: 24529.654039
# agg_perf_by_slowest: 26049.207180

Files

Use the --file option to get totals based on file usage. The first column is the count of files for that type, the second column is number of bytes for that type and the third column is the maximum offset accessed.

  • total: All files
  • read_only: Files that were only read from
  • write_only: Files that were only written to
  • unique: Files that were opened on only one rank
  • shared: File that were opened by more than one rank

Example output. 

# files
# -----
# total: 1542 236572244952 154157611
# read_only: 3 133998651 122805519
# write_only: 1539 236438246301 154157611
# read_write: 0 0 0
# unique: 2 11193132 11193063
# shared: 1540 236561051820 154157611

Totals

Use the --total option to get all statistics as an aggregate total. Statistics that make sense to be aggregated are aggregated. Other statistics may be a minimum or maximum if that makes sense. Other data maybe zeroed if it doesn’t make sense to aggregate the data.

Example output. 

total_CP_INDEP_OPENS: 0
total_CP_COLL_OPENS: 196608
total_CP_INDEP_READS: 0
total_CP_INDEP_WRITES: 0
total_CP_COLL_READS: 0
total_CP_COLL_WRITES: 0
total_CP_SPLIT_READS: 0
total_CP_SPLIT_WRITES: 1179648
total_CP_NB_READS: 0
total_CP_NB_WRITES: 0
total_CP_SYNCS: 0
total_CP_POSIX_READS: 983045
total_CP_POSIX_WRITES: 33795
total_CP_POSIX_OPENS: 230918
...

File list

Use the --file-list option to produce a list of files opened by the application along with estimates of the amount of time spent accessing each file.

Example output. 

# Per-file summary of I/O activity.
# <hash>: hash of file name
# <suffix>: last 15 characters of file name
# <type>: MPI or POSIX
# <nprocs>: number of processes that opened the file
# <slowest>: (estimated) time in seconds consumed in IO by slowest process
# <avg>: average time in seconds consumed in IO per process

# <hash>    <suffix>    <type>  <nprocs>    <slowest>   <avg>
17028232952633024488    amples/boom.dat MPI 2   0.000363    0.012262

Detailed file list

The --file-list-detailed is the same as --file-list except that it produces many columns of output containing statistics broken down by file. This option is mainly useful for automated analysis.

4.5. Other command line utilities

The darshan-util package includes a number of other utilies that can be summarized briefly as follows:

  • darshan-convert: converts an existing log file to the newest log format. If the output file has a .bz2 extension, then it will be re-compressed in bz2 format rather than gz format. It also has command line options for anonymizing personal data, adding metadata annotation to the log header, and restricting the output to a specific instrumented file.
  • darshan-diff: compares two darshan log files and shows counters that differ.
  • darshan-analyzer: walks an entire directory tree of Darshan log files and produces a summary of the types of access methods used in those log files.
  • darshan-logutils*: this is a library rather than an executable, but it provides a C interface for opening and parsing Darshan log files. This is the recommended method for writing custom utilities, as darshan-logutils provides a relatively stable interface across different versions of Darshan and different log formats.