Next: Upshot
Up: VISUALIZING PARALLEL PROGRAM BEHAVIOR
Previous: Event-Based Logging
There is no need for a particularly tight coupling between the
logfile-collection mechanism and the display mechanism. In fact, a
standardized logfile format was recently proposed so that users can view
the same logfile with quite different display programs
[Worley 1992]. Although the main focus of this paper is on
display programs, for concreteness we describe alog, the mechanism
currently being used to collect logfiles for upshot and PADL.
Alog is a macro package for C (and a subroutine package for Fortran)
designed for the efficient collection of an event log. Its main features
are:
- simplicity
- Each log record has a fixed format consisting of six integers (event
type, process number, user task number, user data item, time cycle (since
timers roll over), and time stamp) and a short (12-character) user string.
Details can be found in [Herrarte and Lusk 1991].
- speed
- Logging an event consists of placing the above data in an
in-memory buffer. In parallel jobs, each process uses its own buffer, even on a
shared-memory machine, so no locking is necessary. Logs are written out to
separate files on disk only at the end of the run.
- portability
- Alog is built on a portable microsecond clock
package (distributed with alog) that takes advantage of
machine-specific high-resolution timers on various machines to produce a
timestamp with as close to microsecond resolution as possible.
- automatic alignment
- Not only on workstation networks, but also on
distributed-memory machines, the clocks read by individual processes are not
synchronized. We use logged synchronization events to place data in the
logfile that can be used by a post-processing program to align the separate
log files into a single file of events in time order.
- uninterpreted events
- Although several systems (p4, PCN, Strand) have built alog
event logging into the system itself, the user is
free to log any events desired. We have found logging most useful when the
events are high-level and application-dependent.
- states are secondary
- While many display mechanisms include the notion
of an event duration, in alog events are atomic. In a separate
step, the user assigns pairs of event types to the entry to and exit from a
user-specified state.
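
The record layout, the lock-free per-process buffer, and the pairing of event types into states can be illustrated with a short C sketch. This is not alog's actual interface: the names Event, EventBuf, log_event, and state_time, the field names, and the buffer capacity are all hypothetical, chosen only to mirror the features listed above. The sketch also assumes that timestamps within one buffer are already in order and that a state is never nested inside itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EV_STR_LEN 12    /* short user string, as described above */
#define BUF_CAP    1024  /* hypothetical per-process buffer capacity */

/* One fixed-format record: six integers plus a short string.
   Field names are illustrative, not alog's own. */
typedef struct {
    int  type;                  /* event type */
    int  proc;                  /* process number */
    int  task;                  /* user task number */
    int  data;                  /* user data item */
    int  cycle;                 /* time cycle (timer rollover count) */
    long stamp;                 /* time stamp */
    char text[EV_STR_LEN + 1];  /* user string, NUL-terminated here */
} Event;

/* One buffer per process, so appending needs no lock. */
typedef struct {
    Event  ev[BUF_CAP];
    size_t n;
} EventBuf;

/* Append an event to an in-memory buffer.
   Returns 0 on success, -1 if the buffer is full. */
static int log_event(EventBuf *b, int type, int proc, int task, int data,
                     int cycle, long stamp, const char *text)
{
    assert(b != NULL);
    if (b->n >= BUF_CAP)
        return -1;
    Event *e = &b->ev[b->n++];
    e->type = type;
    e->proc = proc;
    e->task = task;
    e->data = data;
    e->cycle = cycle;
    e->stamp = stamp;
    /* Truncate long strings to the fixed width. */
    strncpy(e->text, text ? text : "", EV_STR_LEN);
    e->text[EV_STR_LEN] = '\0';
    return 0;
}

/* States are secondary: total the time spent in a state that the user
   has defined as the interval between an entry and an exit event type. */
static long state_time(const EventBuf *b, int enter_type, int exit_type)
{
    long total = 0, start = 0;
    int inside = 0;
    for (size_t i = 0; i < b->n; i++) {
        const Event *e = &b->ev[i];
        if (e->type == enter_type && !inside) {
            start = e->stamp;
            inside = 1;
        } else if (e->type == exit_type && inside) {
            total += e->stamp - start;
            inside = 0;
        }
    }
    return total;
}
```

In this sketch a process would call log_event at each point of interest and write its buffer to its own file at the end of the run; state_time corresponds to the later, separate step in which atomic events are interpreted as state boundaries.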
The post-processing step of alog is easy to adapt so that it adds
information to the logfile to aid the display program that will read the
file. This is done for both upshot and PADL.
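
As a rough illustration of what such a post-processing step involves, the following C sketch aligns per-process timestamps and merges the records into time order. It is a simplified model, not alog's actual algorithm: it assumes a single synchronization event that every process logs at nominally the same instant, takes process 0 as the reference clock, and ignores clock drift and timer rollover. The names Rec, offsets_from_sync, align, and merge_logs are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Reduced record for the sketch: only the fields needed for merging. */
typedef struct {
    int  proc;   /* process number */
    int  type;   /* event type */
    long stamp;  /* local time stamp */
} Rec;

/* Estimate one clock offset per process from a synchronization event.
   sync[p] is the local stamp process p recorded for that event;
   process 0 is (by assumption) the reference clock. */
static void offsets_from_sync(const long *sync, int nproc, long *offset)
{
    for (int p = 0; p < nproc; p++)
        offset[p] = sync[0] - sync[p];
}

/* Shift every record onto the reference clock. */
static void align(Rec *r, size_t n, const long *offset, int nproc)
{
    for (size_t i = 0; i < n; i++) {
        assert(r[i].proc >= 0 && r[i].proc < nproc);
        r[i].stamp += offset[r[i].proc];
    }
}

/* Order by aligned time stamp, breaking ties by process number. */
static int by_time(const void *a, const void *b)
{
    const Rec *x = a, *y = b;
    if (x->stamp != y->stamp)
        return (x->stamp > y->stamp) - (x->stamp < y->stamp);
    return (x->proc > y->proc) - (x->proc < y->proc);
}

/* Align the concatenated per-process logs and sort them into a
   single sequence of events in time order. */
static void merge_logs(Rec *r, size_t n, const long *offset, int nproc)
{
    align(r, n, offset, nproc);
    qsort(r, n, sizeof *r, by_time);
}
```

A real post-processor could add further fields at this stage for the benefit of the display program; the sketch stops at producing the single time-ordered sequence.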
Karen D. Toonen
1998-11-19