darshan package

PyDarshan provides direct log access for reading binary Darshan logs, along with a suite of analysis utilities.

darshan.enable_experimental(verbose=False)[source]

Enable experimental features such as aggregation methods for reports.

Parameters:

verbose (bool) – Display log of enabled features. (Default: False)

Subpackages

Submodules

darshan.discover_darshan module

Auxiliary module for discovering a darshan-util installation.

exception darshan.discover_darshan.DarshanVersionError(target_version, provided_version, msg='Feature')[source]

Bases: NotImplementedError

Raised when using a feature which is not provided by libdarshanutil.

darshan.discover_darshan.check_version(ffi=None, libdutil=None)[source]

Get version from shared library or pkg-config and return info.

Returns:

Path to a darshan-util installation.

darshan.discover_darshan.discover_darshan_pkgconfig()[source]

Discovers an existing darshan-util installation and returns the appropriate path to a shared object for use with Python’s CFFI.

Returns:

Path to a darshan-util installation.

darshan.discover_darshan.discover_darshan_pyinstaller()[source]

Discovers darshan-util if installed as part of a PyInstaller bundle.

Returns:

Path to a darshan-util installation.

darshan.discover_darshan.discover_darshan_shutil()[source]

Discovers an existing darshan-util installation and returns the appropriate path to a shared object for use with Python’s CFFI.

Returns:

Path to a darshan-util installation.

darshan.discover_darshan.discover_darshan_wheel()[source]

Discovers darshan-util if installed as part of the wheel.

Returns:

Path to a darshan-util installation.

darshan.discover_darshan.find_utils(ffi, libdutil)[source]

Tries several methods to discover darshan-util, in the following order of precedence:

1) Check whether the current environment allows dlopen to load libdarshan-util.
2) Check whether darshan-parser is exposed via PATH, and attempt to load the library relative to it.
3) Check whether darshan is exposed via pkg-config.
4) Fall back on the binary distributed along with the Python package.

Parameters:
  • ffi – existing ffi instance to use

  • libdutil – reference to libdutil to populate
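
The precedence order above amounts to a fallback chain: each discovery method is tried in turn, and the first one that yields a library path wins. The following is a minimal sketch of that pattern, not PyDarshan's implementation. The four strategy functions and the paths they return are hypothetical stand-ins for the steps listed above.

```python
from typing import Callable, Optional, Sequence


def first_successful(strategies: Sequence[Callable[[], Optional[str]]]) -> Optional[str]:
    """Return the first non-None result, trying strategies in order."""
    for strategy in strategies:
        try:
            path = strategy()
        except (OSError, RuntimeError):
            # A failing strategy simply means "try the next one".
            continue
        if path is not None:
            return path
    return None


# Hypothetical stand-ins for the four discovery steps; paths are made up.
def via_dlopen() -> Optional[str]:
    raise OSError("libdarshan-util not loadable from the default search path")


def via_parser_on_path() -> Optional[str]:
    return None  # darshan-parser was not found on PATH


def via_pkgconfig() -> Optional[str]:
    return "/opt/darshan/lib/libdarshan-util.so"


def via_bundled_binary() -> Optional[str]:
    return "/site-packages/darshan/libdarshan-util.so"


found = first_successful(
    [via_dlopen, via_parser_on_path, via_pkgconfig, via_bundled_binary]
)
print(found)
```

Because the bundled binary is last in the list, it is only consulted when every system-level method has failed, which mirrors the stated precedence.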

darshan.discover_darshan.load_darshan_header()[source]

Returns a CFFI compatible header for darshan-util as a string.

Returns:

String with a CFFI compatible header for darshan-util.

darshan.report module

The darshan.report module provides the DarshanReport class for convenient interaction with and aggregation of Darshan logs using Python.

class darshan.report.DarshanRecordCollection(mod=None, report=None)[source]

Bases: MutableSequence

Darshan log records may nest various properties (e.g., DXT, Lustre). As such, they cannot be faithfully represented using only a single NumPy array or a pandas DataFrame.

The DarshanRecordCollection is used as a wrapper to offer users a stable API to DarshanReports and the records they contain in various popular formats, while allowing memory use and internal representations to be optimized as necessary.

append(val)[source]

S.append(value) – append value to the end of the sequence

info(describe=False, plot=False)[source]

Print information about the record for inspection.

Parameters:
  • describe (bool) – show detailed summary and statistics (default: False)

  • plot (bool) – show plots for quick value overview for counters and fcounters (default: False)

Returns:

None

insert(key, val)[source]

S.insert(index, value) – insert value before index

to_df(attach='default')[source]
to_dict()[source]
to_json()[source]
to_list()[source]
to_numpy()[source]
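
Since the class derives from MutableSequence, append() and the other list-style methods come from the abstract base class once a handful of primitives are defined. The sketch below illustrates that mechanism with a toy list-backed collection. It is not the PyDarshan class: the name RecordCollectionSketch, the record layout, and the shape returned by to_list() are assumptions made for illustration.

```python
from collections.abc import MutableSequence


class RecordCollectionSketch(MutableSequence):
    """Toy list-backed record collection (illustrative only)."""

    def __init__(self, records=None):
        self._records = list(records or [])

    # The five methods below are all that MutableSequence requires.
    def __getitem__(self, index):
        return self._records[index]

    def __setitem__(self, index, value):
        self._records[index] = value

    def __delitem__(self, index):
        del self._records[index]

    def __len__(self):
        return len(self._records)

    def insert(self, index, value):
        self._records.insert(index, value)

    # Export helper in the spirit of to_list(); the output shape is assumed.
    def to_list(self):
        return [dict(rec) for rec in self._records]


collection = RecordCollectionSketch()
collection.append({"id": 1, "rank": 0})     # append() is inherited from the mixin
collection.insert(0, {"id": 2, "rank": 1})  # insert value before index 0
ids = [rec["id"] for rec in collection]     # iteration is inherited as well
```

Keeping the storage private behind the sequence interface is what lets the internal representation change without affecting callers.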
class darshan.report.DarshanReport(filename=None, dtype='numpy', start_time=None, end_time=None, automatic_summary=False, read_all=True, lookup_name_records=True)[source]

Bases: object

The DarshanReport class provides a convenient wrapper to access Darshan logs, and caches information that has already been fetched. In addition, a number of common aggregations can be performed.

__add__(other)[source]

Allow reports to be merged using the addition operator.

__deepcopy__(memo)[source]

Creates a deepcopy of report.

Note

Needed to purge the reference to self.log, as CData objects cannot be pickled: TypeError: can’t pickle _cffi_backend.CData objects
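
The general technique described in the note can be sketched as follows: a custom __deepcopy__ copies every attribute except the native handle, which is reset on the clone. This is a minimal illustration under stated assumptions, not the library's code. ReportSketch is a hypothetical class, and a threading.Lock stands in for the CData handle because it likewise cannot be deep-copied.

```python
import copy
import threading


class ReportSketch:
    """Toy report; `log` stands in for the CFFI handle."""

    def __init__(self, records):
        self.records = records
        self.log = threading.Lock()  # like CData, a lock cannot be deep-copied

    def __deepcopy__(self, memo):
        cls = self.__class__
        clone = cls.__new__(cls)      # bypass __init__ so no new handle is created
        memo[id(self)] = clone        # register early to handle self-references
        for name, value in self.__dict__.items():
            if name == "log":
                clone.log = None      # purge the handle instead of copying it
            else:
                setattr(clone, name, copy.deepcopy(value, memo))
        return clone


original = ReportSketch({"POSIX": [1, 2, 3]})
duplicate = copy.deepcopy(original)
```

The clone owns independent copies of the record data but no open handle, so it cannot read further records until a log is opened again.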

__del__()[source]

Clean up when deleted or garbage collected (e.g., del-statement)

__enter__()[source]

Satisfy API for use with context manager (e.g., with-statement)

__exit__(type, value, traceback)[source]

Clean up when used by a context manager (e.g., with-statement)

__init__(filename=None, dtype='numpy', start_time=None, end_time=None, automatic_summary=False, read_all=True, lookup_name_records=True)[source]
Parameters:
  • filename (str) – filename to open (optional)

  • dtype (str) – default dtype for internal structures

  • automatic_summary (bool) – automatically generate summary after loading

  • read_all (bool) – whether to read all records for log

  • lookup_name_records (bool) – lookup and update name_records as records are loaded

Returns:

None
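
A short usage sketch for the constructor together with the context-manager methods documented above. It assumes that the with-statement binds the report itself, that report.modules can be iterated for module names, and that report.metadata holds a "job" dictionary with an "nprocs" entry. The helper names summarize and summarize_log are hypothetical.

```python
def summarize(report):
    """Collect a few headline facts from an already opened report."""
    job = report.metadata.get("job", {})
    return {
        "modules": sorted(report.modules),  # module names present in the log
        "nprocs": job.get("nprocs"),        # assumed metadata key
    }


def summarize_log(path):
    """Open `path` with PyDarshan and summarize it (requires PyDarshan)."""
    import darshan  # imported lazily so summarize() works without PyDarshan

    # __enter__/__exit__ take care of cleanup when the block ends.
    with darshan.DarshanReport(path, read_all=True) as report:
        return summarize(report)
```

Separating the summary logic from the file handling keeps the former easy to exercise with a stub object when no log file is at hand.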

_cleanup()[source]

Clean up when deleting the object.

property counters
property heatmaps
info(metadata=False)[source]

Print information about the report for inspection.

Parameters:

metadata (bool) – show detailed metadata (default: False)

Returns:

None

property metadata
mod_read_all_apmpi_records(mod='APMPI', dtype=None, warnings=True)[source]

Reads all APMPI records for provided module.

Parameters:
  • mod (str) – Identifier of module to fetch all records

  • dtype (str) – ‘numpy’ for ndarray (default), ‘dict’ for python dictionary

Returns:

None

mod_read_all_apxc_records(mod='APXC', dtype=None, warnings=True)[source]

Reads all APXC records for provided module.

Parameters:
  • mod (str) – Identifier of module to fetch all records

  • dtype (str) – ‘numpy’ for ndarray (default), ‘dict’ for python dictionary

Returns:

None

mod_read_all_dxt_records(mod, dtype=None, warnings=True, reads=True, writes=True)[source]

Reads all DXT records for provided module.

Parameters:
  • mod (str) – Identifier of module to fetch all records

  • dtype (str) – ‘numpy’ for ndarray (default), ‘dict’ for python dictionary

Returns:

None

mod_read_all_lustre_records(mod='LUSTRE', dtype=None, warnings=True)[source]

Reads all Lustre records for provided module.

Parameters:
  • mod (str) – Identifier of module to fetch all records

  • dtype (str) – ‘numpy’ for ndarray (default), ‘dict’ for python dictionary

Returns:

None

mod_read_all_records(mod, dtype=None, warnings=True)[source]

Reads all generic records for the provided module.

Parameters:
  • mod (str) – Identifier of module to fetch all records

  • dtype (str) – ‘numpy’ for ndarray (default), ‘dict’ for python dictionary, ‘pandas’

Returns:

None
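
Because the method returns None, the loaded records have to be picked up from the report afterwards. The sketch below assumes they are stored in a report.records dictionary keyed by module name. That attribute is not listed in this reference, so treat it as an assumption. The helper name records_for is hypothetical.

```python
def records_for(report, mod):
    """Load all records of `mod` and return the collection stored on the report."""
    if mod not in report.modules:
        raise ValueError(f"module {mod!r} is not present in this log")
    report.mod_read_all_records(mod)  # returns None; results are kept on the report
    return report.records[mod]        # assumed storage location
```

With a real report, the returned collection could then be converted with one of the DarshanRecordCollection methods, for example records_for(report, "POSIX").to_df().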

mod_records(mod, dtype='numpy', warnings=True)[source]

Return generator for lazy record loading and traversal.

Warning

Currently cannot be used when alternating between different modules. A temporary workaround is to open the same log multiple times, so that buffers are not shared between get_record invocations in the lower-level library.

Parameters:
  • mod (str) – Identifier of module to fetch records for

  • dtype (str) – ‘numpy’ for ndarray (default), ‘dict’ for python dictionary

Returns:

Generator yielding the records of the requested module.
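
The lazy traversal and the workaround from the warning can be sketched as below. Both helpers accept any object exposing mod_records, and open_report is a hypothetical factory that returns a fresh report for a path, so each module gets its own handle. The helper names are illustrative and not part of PyDarshan.

```python
from itertools import zip_longest


def count_records(report, mod):
    """Count records lazily, holding one record in memory at a time."""
    return sum(1 for _ in report.mod_records(mod))


def paired_records(open_report, path, mod_a, mod_b):
    """Walk two modules side by side, using one independent handle per module."""
    report_a = open_report(path)
    report_b = open_report(path)
    # zip_longest pads the shorter module with None instead of truncating.
    yield from zip_longest(
        report_a.mod_records(mod_a),
        report_b.mod_records(mod_b),
    )
```

With PyDarshan installed, open_report could be something like lambda p: darshan.DarshanReport(p, read_all=False), which opens the same log twice as the warning suggests.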

property modules
open(filename, read_all=False)[source]

Open log file via CFFI backend.

Parameters:
  • filename (str) – filename to open (optional)

  • read_all (bool) – whether to read all records for log

Returns:

None

read_all(dtype=None)[source]

Read all available records from darshan log and return as dictionary.

Parameters:

None

Returns:

None

read_all_dxt_records(reads=True, writes=True, dtype=None)[source]

Read all DXT records from darshan log and return as dictionary.

Parameters:

None

Returns:

None

read_all_generic_records(counters=True, fcounters=True, dtype=None)[source]

Read all generic records from darshan log and return as dictionary.

Parameters:

None

Returns:

None

read_all_heatmap_records()[source]

Read all heatmap records from darshan log and return as dictionary.

Note

As the module is encoded in a name_record, all heatmap data is read and then exposed through the report.heatmaps property.

Parameters:

None

Returns:

None

read_metadata(read_all=False)[source]

Read metadata such as the job, the executables and available modules.

Parameters:

None

Returns:

None

rebase_timestamps(inplace=False, timebase=False)[source]

Updates all records in the report to use timebase (default: start_time). This may help conserve memory when reports are merged.

Parameters:
  • records (dict, list) – records to rebase

  • inplace (bool) – whether to merely return a copy or to update records in place

  • timebase (datetime.datetime) – new timebase to use

Returns:

rebased_records (same type as provided to records)

to_dict()[source]

Return dictionary representation of report data.

Parameters:

None

Returns:

dict

to_json()[source]

Return JSON representation of report data as string.

Parameters:

None

Returns:

JSON String

update_name_records(mod=None)[source]

Update (and prune unused) name records from resolve table.

First reindexes all used name record identifiers, then queries the darshan-util library to compile a filtered list of name records.

Parameters:

None

Returns:

None

class darshan.report.DarshanReportJSONEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)[source]

Bases: JSONEncoder

Helper class for JSON serialization if the report contains, for example, NumPy or date records, which are not handled by the default JSON encoder.

default(obj)[source]

Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default like this:

def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return JSONEncoder.default(self, o)
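
Applied to the kinds of values mentioned for this class, a default override might look like the sketch below. It is an illustration rather than the encoder PyDarshan ships. ReportEncoderSketch and FakeArray are hypothetical names, and array-like values are detected by the presence of a tolist() method (which NumPy arrays and scalars provide) so that the example runs without NumPy.

```python
import datetime
import json


class ReportEncoderSketch(json.JSONEncoder):
    """Encode dates as ISO 8601 strings and array-likes as plain lists."""

    def default(self, o):
        if isinstance(o, (datetime.datetime, datetime.date)):
            return o.isoformat()
        if hasattr(o, "tolist"):
            return o.tolist()
        # Let the base class raise TypeError for anything else.
        return super().default(o)


class FakeArray:
    """Minimal stand-in for an array type exposing tolist()."""

    def __init__(self, values):
        self._values = list(values)

    def tolist(self):
        return list(self._values)


payload = {
    "start_time": datetime.datetime(2021, 1, 1, 12, 0, 0),
    "counters": FakeArray([1, 2, 3]),
}
encoded = json.dumps(payload, cls=ReportEncoderSketch)
```

Passing the class through the cls argument of json.dumps means default() is consulted only for objects the standard encoder cannot handle.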
exception darshan.report.ModuleNotInDarshanLog[source]

Bases: ValueError

Raised when module is not present in Darshan log.