Support for Performance Analysis and Debugging



The MPI profiling interface allows the convenient construction of portable tools that rely on intercepting calls to the MPI library. Such tools are ``ultra portable'' in the sense that they can be used with any MPI implementation, not just a specific portable MPI implementation.





Profiling Libraries



The MPI specification makes it possible, but not particularly convenient, for users to build their own ``profiling libraries,'' which intercept all MPI library calls. MPICH comes with three profiling libraries already constructed; we have found them useful in debugging and in performance analysis.

tracing
The tracing library simply prints (on stdout) a trace of each MPI library call. Each line is identified with its process number (rank in MPI_COMM_WORLD). Since stdout from all processes is collected, even on a network of workstations, all output comes out on the console. A sample is shown here.
    ...
    [1] Starting MPI_Bcast...
    [0] Starting MPI_Bcast...
    [0] Ending MPI_Bcast
    [2] Starting MPI_Bcast...
    [2] Ending MPI_Bcast
    [1] Ending MPI_Bcast
    ...
  
logging
The logging library uses the mpe logging routines (described in the section on the MPE extension library) to write a logfile containing an event for the entry to and the exit from each MPI function. The upshot tool (see the Upshot section) can then display the computation; its colored bars show the frequency and duration of each MPI call (see Figure 9).
animation
The animation library uses the mpe graphics routines to provide a simple animation of the message passing that occurs in an application, via a shared X display.

Further description of these libraries can be found in [34].





Upshot



One of the most useful tools for understanding parallel program behavior is a graphical display of parallel timelines, with colored bars indicating the state of each process at any given time. A number of tools developed by various groups do this. One of the earliest was upshot [33]. Since then upshot has been reimplemented in Tcl/Tk, and this version [34] is distributed with MPICH. It can read log files generated either by Paragraph [32] or by the mpe logging routines, which are in turn used by the logging profiling library. A sample screen dump is shown in Figure 9.


Figure 9:  Upshot output





Support for Adding New Profiling Libraries



The most obvious way to use the profiling library is to choose some family of calls to intercept, and then treat each of them in a special way. Typically, one performs some action (adds to a counter, prints a message, writes a log record), calls the ``real'' MPI function using its alternate name PMPI_Xxxx, perhaps performs another action (e.g., writes another log record), and then returns to the application, propagating the return code from the PMPI routine.
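The pattern just described can be sketched in C as follows. This is a minimal illustration rather than code taken from MPICH: the counter send_calls and the message format are hypothetical, and the stand-in definitions in the #else branch exist only so that the sketch compiles without an MPI installation. In a real profiling library one would define USE_REAL_MPI (a macro name assumed here) so that <mpi.h> supplies the declarations. The const-qualified buffer argument follows MPI-3 and later; older versions of the standard declare it as void *.

```c
#include <stdio.h>

#ifdef USE_REAL_MPI
#include <mpi.h>   /* real declarations of MPI_Send and PMPI_Send */
#else
/* Stand-ins for illustration only, so the sketch builds without MPI. */
typedef int MPI_Datatype;
typedef int MPI_Comm;
#define MPI_SUCCESS 0
static int PMPI_Send(const void *buf, int count, MPI_Datatype type,
                     int dest, int tag, MPI_Comm comm)
{
    (void)buf; (void)count; (void)type;
    (void)dest; (void)tag; (void)comm;
    return MPI_SUCCESS;   /* pretend the message was sent */
}
#endif

/* Hypothetical profiling state: number of MPI_Send calls intercepted. */
static long send_calls = 0;

/* Profiling version of MPI_Send. The application calls this routine
   because it is linked ahead of the MPI library's own MPI_Send. */
int MPI_Send(const void *buf, int count, MPI_Datatype type,
             int dest, int tag, MPI_Comm comm)
{
    int rc;

    send_calls++;                                       /* action before the call */
    rc = PMPI_Send(buf, count, type, dest, tag, comm);  /* the "real" routine */
    fprintf(stderr, "MPI_Send to %d returned %d\n",     /* action after the call */
            dest, rc);
    return rc;                                          /* propagate the return code */
}
```

Because the wrapper returns exactly what PMPI_Send returned, the application sees the same error behavior with or without the profiling library linked in.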

MPICH includes a utility called wrappergen that lets a user specify ``templates'' for profiling routines, together with a list of routines to create, and then automatically generates the profiling versions of those routines. The work required to add a new profiling library is thus reduced to writing individual MPI_Init and MPI_Finalize routines and one template routine. The libraries described above in the Profiling Libraries section are all produced in this way. Details of how to use wrappergen can be found in [27].


