A family of tools for helping with this problem is just beginning to evolve.
Unlike profiling tools such as Gauge[5], these
tools try to capture the precise sequence of events occurring during program
execution as opposed to counting those events. A minimal amount of data about
each event is captured in a log file of some kind, and the log is then
examined in post-mortem fashion. Real-time display of events as they are logged
is possible, but usually unproductive, because the subsequences of interest
occur too rapidly to follow.
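The difference between counting events and capturing their sequence can be illustrated with a minimal sketch. The trace contents, the event names, and the helper `elapsed` are hypothetical and are not taken from Gauge or from any tool described here.

```python
from collections import Counter

# Hypothetical trace: (timestamp in seconds, process id, event name),
# listed in the order the events occurred.
trace = [
    (0.001, 0, "send"),
    (0.002, 1, "recv"),
    (0.004, 1, "send"),
    (0.007, 0, "recv"),
]

# A profile keeps only a total per event type; the ordering is lost.
profile = Counter(name for _, _, name in trace)


def elapsed(trace, pid, first, second):
    """Time between the first `first` event on process `pid` and the
    next `second` event on the same process."""
    start = next(t for t, p, n in trace if p == pid and n == first)
    end = next(t for t, p, n in trace
               if p == pid and n == second and t >= start)
    return end - start


# The log keeps the sequence, so ordering questions can be answered,
# for example how long process 0 waited between its send and its recv.
wait = elapsed(trace, 0, "send", "recv")
```

The profile can only report that two sends and two receives happened. The log can also say which process waited, and for how long.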
Implementation of such tools raises a number of issues:
-
The log must be captured with an absolutely minimal impact on the
performance of the program. Otherwise the insights gathered by examining
the log will not really apply to the production version of the program.
This means buffering of events in memory, dumping to external storage
without stopping execution, and little or no forced synchronization among
multiple processes.
-
The precise nature of the events to be logged is still very much a matter
for debate. Different systems will have different "critical" system
events, and of course different user programs will have different events
altogether. Logging of all low-level events, such as all locking and
unlocking operations, or all messages sent and received, while sometimes
useful, may swamp the logging mechanism without providing real insight
into parallel program behavior. Flexibility in specifying which events
are to be logged is crucial.
-
Given a mechanism for efficient logging and a decision about just what
events should be logged, it remains to find a display mechanism that
promotes reconstruction of the sequence of events and an understanding of
how it was caused by the program specification. Both static and dynamic
displays have been used, and each approach has its advantages. The fact
that we are focusing here on parallel programs almost mandates a graphical
rather than a textual display. Of course, implementing such a display program
currently involves decisions about graphics languages, window systems, etc.
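The first two issues, low-overhead capture and selective logging, can be sketched as follows. This is an illustration under stated assumptions, not the implementation of any tool discussed here: the class `EventLog`, the record layout, and the event-type constants are all hypothetical. For simplicity the buffer is flushed synchronously. A production logger would try to overlap the write with program execution.

```python
import os
import struct
import tempfile
import time

# Hypothetical fixed-size record: timestamp (double), process id, event type.
RECORD = struct.Struct("<dii")


class EventLog:
    def __init__(self, path, enabled_types, capacity=1024):
        self.path = path
        self.enabled = frozenset(enabled_types)  # only these types are logged
        self.capacity = capacity                 # records buffered before a flush
        self.buffer = bytearray()
        open(path, "wb").close()                 # start with an empty log file

    def log(self, pid, event_type):
        if event_type not in self.enabled:       # cheap filter on the hot path
            return
        self.buffer += RECORD.pack(time.perf_counter(), pid, event_type)
        if len(self.buffer) >= self.capacity * RECORD.size:
            self.flush()

    def flush(self):
        # Append the buffered records to external storage (synchronous here).
        with open(self.path, "ab") as f:
            f.write(self.buffer)
        self.buffer.clear()


def read_log(path):
    """Post-mortem step: read the records back in the order they were logged."""
    with open(path, "rb") as f:
        return list(RECORD.iter_unpack(f.read()))


# Usage: log sends and receives but not lock operations.
SEND, RECV, LOCK = 1, 2, 3
path = os.path.join(tempfile.mkdtemp(), "events.log")
log = EventLog(path, enabled_types={SEND, RECV}, capacity=2)
for pid, event_type in [(0, SEND), (0, LOCK), (1, RECV), (1, SEND)]:
    log.log(pid, event_type)
log.flush()
events = read_log(path)
```

Each process keeps its own buffer in this sketch, so no synchronization among processes is forced while logging. Merging the per-process logs by timestamp would be a separate post-mortem step.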
Many researchers are taking up these challenges. One of the most advanced
systems in this category is Paragraph[6], a logfile
display program developed at Oak Ridge National Laboratory. In general
Paragraph provides more views of logfile information than the tools described
here, although these tools provide more depth in the views they do offer.
Karen D. Toonen
1998-11-19