darshan package

PyDarshan provides direct log access for reading binary Darshan logs. PyDarshan also provides a suite of analysis utilities.

darshan.enable_experimental(verbose=False)[source]

Enable experimental features such as aggregation methods for reports.

Parameters:: verbose (bool) – Display a log of enabled features. (Default: False)

Subpackages

Submodules

darshan.discover_darshan module

Auxiliary to discover a darshan-util installation.

exception darshan.discover_darshan.DarshanVersionError(target_version, provided_version, msg='Feature')[source]

Bases: NotImplementedError

Raised when using a feature which is not provided by libdarshanutil.

darshan.discover_darshan.check_version(ffi=None, libdutil=None)[source]

Get version from shared library or pkg-config and return info.

Returns:: Path to a darshan-util installation.

darshan.discover_darshan.discover_darshan_pkgconfig()[source]

Discovers an existing darshan-util installation and returns the appropriate path to a shared object for use with Python’s CFFI.

Returns:: Path to a darshan-util installation.

darshan.discover_darshan.discover_darshan_pyinstaller()[source]

Discovers darshan-util if installed as part of a pyinstaller bundle.

Returns:: Path to a darshan-util installation.

darshan.discover_darshan.discover_darshan_shutil()[source]

Discovers an existing darshan-util installation and returns the appropriate path to a shared object for use with Python’s CFFI.

Returns:: Path to a darshan-util installation.

darshan.discover_darshan.discover_darshan_wheel()[source]

Discovers darshan-util if installed as part of the wheel.

Returns:: Path to a darshan-util installation.

darshan.discover_darshan.find_utils(ffi, libdutil)[source]

Try different methods to discover darshan-util:

Precedence:

1) Check whether the current environment allows dlopen to load libdarshan-util.
2) Check whether darshan-parser is exposed via PATH, and attempt loading relative to it.
3) Check whether darshan is exposed via pkg-config.
4) Fall back on the binary distributed along with the Python package.

Parameters:

ffi – existing ffi instance to use
libdutil – reference to libdutil to populate
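The precedence order above amounts to a fallback chain: try each discovery strategy in turn and keep the first one that succeeds. The following is a minimal sketch of that pattern only, not PyDarshan's implementation. The four `via_*` functions and the paths they return are hypothetical stand-ins for the real discovery steps.

```python
from typing import Callable, Iterable, Optional


def first_successful(strategies: Iterable[Callable[[], str]]) -> Optional[str]:
    """Return the result of the first strategy that does not raise."""
    for strategy in strategies:
        try:
            return strategy()
        except (OSError, RuntimeError):
            # This strategy could not locate the library; try the next one.
            continue
    return None


# Hypothetical stand-ins for the four discovery steps listed above.
def via_dlopen() -> str:
    raise OSError("libdarshan-util not loadable from the environment")


def via_path() -> str:
    raise RuntimeError("darshan-parser not found on PATH")


def via_pkgconfig() -> str:
    return "/opt/darshan/lib/libdarshan-util.so"  # made-up path


def via_bundled() -> str:
    return "bundled/libdarshan-util.so"  # made-up path


found = first_successful([via_dlopen, via_path, via_pkgconfig, via_bundled])
print(found)
```

Because the chain stops at the first success, the bundled binary is only consulted when every earlier strategy has failed.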

darshan.discover_darshan.load_darshan_header()[source]

Returns a CFFI compatible header for darshan-util as a string.

Returns:: String with a CFFI compatible header for darshan-util.

darshan.log_utils module

Module for log handling utilities.

darshan.log_utils._locate_log(filename: str, project: str) → str | None[source]: Locates a log in a project.

darshan.log_utils.get_log_path(filename: str) → str[source]: Utility function for locating logs either locally or in the darshan-logs repo.

Parameters

filename: filename of Darshan log to locate in PyDarshan or darshan-logs repo.

Returns

log_path: absolute path to the darshan log matching the input filename.

Raises

FileNotFoundError: if a log cannot be found that matches the input filename.

Notes

If used in the context of a pytest run, pytest.skip() will be used if both A) a local log cannot be found, and B) the darshan-logs repo is unavailable.

This function should never be used in a pytest decorator/parametrization mark. While it is possible for the function to retrieve log file paths in such scenarios, the imperative skip is not tolerated at collection time when the logs repo is absent.

darshan.report module

The darshan.report module provides the DarshanReport class for convenient interaction with and aggregation of Darshan logs using Python.

class darshan.report.DarshanRecordCollection(mod=None, report=None)[source]

Bases: MutableSequence

Darshan log records may nest various properties (e.g., DXT, Lustre). As such, they cannot be faithfully represented using only a single NumPy array or a Pandas dataframe.

The DarshanRecordCollection is used as a wrapper to offer users a stable API to DarshanReports and the records they contain in various popular formats, while leaving room to optimize memory use and internal representations as necessary.
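The wrapper idea can be illustrated with a small MutableSequence subclass that keeps its storage private and offers export methods. This is a sketch under stated assumptions, not PyDarshan's actual class. `RecordList` and its internal `_records` list are hypothetical names, and only a `to_list` export is shown.

```python
from collections.abc import MutableSequence


class RecordList(MutableSequence):
    """Hypothetical stand-in; not PyDarshan's DarshanRecordCollection."""

    def __init__(self):
        self._records = []  # internal storage, free to change later

    def __getitem__(self, index):
        return self._records[index]

    def __setitem__(self, index, value):
        self._records[index] = value

    def __delitem__(self, index):
        del self._records[index]

    def __len__(self):
        return len(self._records)

    def insert(self, index, value):
        self._records.insert(index, value)

    def to_list(self):
        # Export a plain-list copy so callers cannot mutate internal storage.
        return list(self._records)


records = RecordList()
records.append({"id": 1})     # append() is inherited from MutableSequence
records.insert(0, {"id": 0})  # insert() is one of the required methods
print(len(records), records.to_list())
```

Implementing the five abstract methods is enough for the base class to supply append, extend, pop, and similar operations, which is why the append and insert entries in this reference carry the generic sequence docstrings.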

append(val)[source]: S.append(value) – append value to the end of the sequence

info(describe=False, plot=False)[source]

Print information about the record for inspection.

Parameters:

describe (bool) – show detailed summary and statistics (default: False)
plot (bool) – show plots for quick value overview for counters and fcounters (default: False)

Returns:

None

insert(key, val)[source]: S.insert(index, value) – insert value before index

to_df(attach='default')[source]

to_dict()[source]

to_json()[source]

to_list()[source]

to_numpy()[source]

class darshan.report.DarshanReport(filename=None, dtype='numpy', start_time=None, end_time=None, automatic_summary=False, read_all=True, lookup_name_records=True)[source]

Bases: object

The DarshanReport class provides a convenient wrapper to access Darshan logs, which also caches already fetched information. In addition, a number of common aggregations can be performed.

__add__(other)[source]: Allow reports to be merged using the addition operator.
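Since reports support the addition operator, several of them can be folded into one with a reduction. The helper below is a minimal sketch and `merge_all` is a hypothetical name. It assumes only that each object implements `__add__`. Whether merging first requires a call to darshan.enable_experimental() is an assumption to check against the installed version, since aggregation methods are described as experimental.

```python
from functools import reduce
import operator


def merge_all(reports):
    """Fold a non-empty sequence of reports into one using the + operator."""
    # reduce() applies report_a + report_b pairwise, left to right.
    return reduce(operator.add, reports)
```

An empty sequence raises a TypeError from reduce, so callers should check for that case beforehand.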

__deepcopy__(memo)[source]: Creates a deepcopy of report.

Note

Needed to purge the reference to self.log, as CData objects cannot be pickled: TypeError: can’t pickle _cffi_backend.CData objects

__del__()[source]: Clean up when deleted or garbage collected (e.g., del-statement)

__enter__()[source]: Satisfy API for use with context manager (e.g., with-statement)

__exit__(type, value, traceback)[source]: Cleanup when used by context manager (e.g., with-statement)
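Because the class implements the context manager protocol, a log can be opened in a with-statement so that cleanup runs on exit. The following is a minimal sketch. `list_modules` and its `report_cls` parameter are hypothetical and exist only so the helper can be exercised without PyDarshan. It also assumes that `__enter__` returns the report itself and that the `modules` property is keyed by module name.

```python
def list_modules(path, report_cls=None):
    """Open a log, return the sorted module names, and clean up on exit."""
    if report_cls is None:
        # Imported lazily so the helper can be defined without PyDarshan.
        import darshan
        report_cls = darshan.DarshanReport
    # read_all=False skips loading the records of every module.
    with report_cls(path, read_all=False) as report:
        return sorted(report.modules)


# Example call, assuming a log file exists at this made-up path:
# print(list_modules("example.darshan"))
```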

__init__(filename=None, dtype='numpy', start_time=None, end_time=None, automatic_summary=False, read_all=True, lookup_name_records=True)[source]

Parameters:

filename (str) – filename to open (optional)
dtype (str) – default dtype for internal structures
automatic_summary (bool) – automatically generate summary after loading
read_all (bool) – whether to read all records for log
lookup_name_records (bool) – lookup and update name_records as records are loaded

Returns:

None

_cleanup()[source]: Cleanup when deleting object.

property counters

property heatmaps

info(metadata=False)[source]

Print information about the record for inspection.

Parameters:: metadata (bool) – show detailed metadata (default: False)
Returns:: None

property metadata

mod_read_all_apmpi_records(mod='APMPI', dtype=None, warnings=True)[source]

Reads all APMPI records for provided module.

Parameters:

mod (str) – Identifier of module to fetch all records
dtype (str) – ‘numpy’ for ndarray (default), ‘dict’ for python dictionary

Returns:

None

mod_read_all_apxc_records(mod='APXC', dtype=None, warnings=True)[source]

Reads all APXC records for provided module.

Parameters:

mod (str) – Identifier of module to fetch all records
dtype (str) – ‘numpy’ for ndarray (default), ‘dict’ for python dictionary

Returns:

None

mod_read_all_dxt_records(mod, dtype=None, warnings=True, reads=True, writes=True)[source]

Reads all DXT records for provided module.

Parameters:

mod (str) – Identifier of module to fetch all records
dtype (str) – ‘numpy’ for ndarray (default), ‘dict’ for python dictionary

Returns:

None

mod_read_all_lustre_records(mod='LUSTRE', dtype=None, warnings=True)[source]

Reads all Lustre records for provided module.

Parameters:

mod (str) – Identifier of module to fetch all records
dtype (str) – ‘numpy’ for ndarray (default), ‘dict’ for python dictionary

Returns:

None

mod_read_all_records(mod, dtype=None, warnings=True)[source]

Reads all generic records for provided module.

Parameters:

mod (str) – Identifier of module to fetch all records
dtype (str) – ‘numpy’ for ndarray (default), ‘dict’ for python dictionary, ‘pandas’

Returns:

None

mod_records(mod, dtype='numpy', warnings=True)[source]

Return generator for lazy record loading and traversal.

Warning

Can’t be used for now when alternating between different modules. A temporary workaround can be to open the same log multiple times, as this way buffers are not shared between get_record invocations in the lower level library.

Parameters:

mod (str) – Identifier of module to fetch records for
dtype (str) – ‘numpy’ for ndarray (default), ‘dict’ for python dictionary

Returns:

generator – yields the records of the requested module one at a time
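Lazy traversal is useful when only an aggregate is needed and the full record list should not be held in memory. The helper below is a minimal sketch and `count_records` is a hypothetical name. It relies only on the `mod_records(mod, dtype=...)` signature documented above and makes no assumption about the layout of individual records.

```python
def count_records(report, mod):
    """Count the records of one module without materializing them all."""
    # Exhaust the generator for a single module before touching another,
    # since the warning above rules out alternating between modules.
    return sum(1 for _ in report.mod_records(mod, dtype="dict"))
```

With PyDarshan installed, `report` would typically be a DarshanReport opened with read_all=False so that records are not loaded up front.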

property modules

open(filename, read_all=False)[source]

Open log file via CFFI backend.

Parameters:

filename (str) – filename to open (optional)
read_all (bool) – whether to read all records for log

Returns:

None

read_all(dtype=None)[source]

Read all available records from darshan log and return as dictionary.

Parameters:: None –
Returns:: None

read_all_dxt_records(reads=True, writes=True, dtype=None)[source]

Read all DXT records from darshan log and return as dictionary.

Parameters:: None –
Returns:: None

read_all_generic_records(counters=True, fcounters=True, dtype=None)[source]

Read all generic records from darshan log and return as dictionary.

Parameters:: None –
Returns:: None

read_all_heatmap_records()[source]

Read all heatmap records from darshan log and return as dictionary.

Note

As the module is encoded in a name_record, all heatmap data is read and then exposed through the report.heatmaps property.

Parameters:: None –
Returns:: None

read_metadata(read_all=False)[source]

Read metadata such as the job, the executables and available modules.

Parameters:: None –
Returns:: None

rebase_timestamps(inplace=False, timebase=False)[source]

Updates all records in the report to use timebase (default: start_time). This may help conserve memory when reports are merged.

Parameters:

records (dict, list) – records to rebase
inplace (bool) – whether to merely return a copy or to update the records in place
timebase (datetime.datetime) – new timebase to use

Returns:

rebased_records (same type as provided to records)

to_dict()[source]

Return dictionary representation of report data.

Parameters:: None –
Returns:: dict

to_json()[source]

Return JSON representation of report data as string.

Parameters:: None –
Returns:: JSON String

update_name_records(mod=None)[source]

Update (and prune unused) name records from resolve table.

First reindexes all used name record identifiers and then queries the darshan-util library to compile a filtered list of name records.

Parameters:: None –
Returns:: None

class darshan.report.DarshanReportJSONEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)[source]

Bases: JSONEncoder

Helper class for JSON serialization if the report contains, for example, NumPy or date records, which are not handled by the default JSON encoder.

default(obj)[source]

Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default like this:

def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return JSONEncoder.default(self, o)
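Applied to the date and array values mentioned for this class, the same pattern might look like the sketch below. It is not the body of DarshanReportJSONEncoder. `TimestampEncoder` is a hypothetical name, and treating any object with a `tolist()` method as array-like is an assumption made to avoid importing NumPy.

```python
import datetime
import json


class TimestampEncoder(json.JSONEncoder):
    """Hypothetical encoder; serializes dates and array-like values."""

    def default(self, obj):
        if isinstance(obj, (datetime.datetime, datetime.date)):
            return obj.isoformat()
        # Objects exposing tolist(), such as NumPy arrays, become plain lists.
        if hasattr(obj, "tolist"):
            return obj.tolist()
        # Anything else falls through to the base class, which raises TypeError.
        return super().default(obj)


payload = {"start": datetime.datetime(2020, 1, 1, 12, 0, 0)}
encoded = json.dumps(payload, cls=TimestampEncoder)
print(encoded)
```

Passing the class through the `cls` argument of json.dumps keeps the conversion rules in one place instead of converting values by hand before serialization.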

exception darshan.report.ModuleNotInDarshanLog[source]

Bases: ValueError

Raised when module is not present in Darshan log.
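One way to avoid this exception is to check the report's `modules` property before requesting records. The helper below is a minimal sketch. `read_module_if_present` is a hypothetical name, and it assumes that `modules` supports membership tests by module name. It uses the generic reader, so modules with dedicated readers (DXT, Lustre, APMPI, APXC) would need the matching method instead.

```python
def read_module_if_present(report, mod):
    """Load all generic records for mod; return False if the log lacks it."""
    if mod not in report.modules:
        return False
    report.mod_read_all_records(mod)
    return True
```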
