Performance Analysis and Engineering in TOPS

The purpose of this page is to provide information on TOPS for use by the PERC team.   Please send corrections and comments to

Linear Solvers

The TOPS linear solvers will be based primarily on next-generation implementations of PETSc and hypre.  PETSc has extensive instrumentation for performance analysis.  See the PETSc documentation, especially the section on profiling.  There is less support for profiling in hypre.  One PERC research opportunity is automatic instrumentation of hypre with profiling capabilities that can interoperate with those in PETSc (the TOPS team is also evaluating the possibility of generating self-monitoring capabilities using Babel, a CCA project).  PETSc is implemented in C, but uses its own function tables and dispatch mechanism to provide a complete OO programming model.  PETSc 3.0 will be written in C++.

Test Problems

The TOPS team is working with several SciDAC applications.  In addition, they are developing several implementations of a model problem, nonlinear heat transfer.  Links to these implementations will be posted here as they become available.  These test problems will initially use the PETSc linear and nonlinear solvers.  Another test problem, recommended by David Keyes, is PETSc nonlinear solver example 19 (ex19), a 2-D driven cavity code that uses a velocity-vorticity formulation and a finite difference discretization on a structured grid.

Using PETSc

Assuming PETSc 2.1.1 is already installed at your site (if not, see the PETSc website or e-mail Paul Hovland for assistance), building and executing the test problems is a simple process.

  1. Make sure the PETSC_DIR and PETSC_ARCH environment variables are set correctly (there's a utility program in $PETSC_DIR/bin called petscarch to assist with setting the latter).
  2. Copy ex19.c and makefile from $PETSC_DIR/src/snes/examples/tutorials (or clone the entire directory).
  3. Build an executable with debug info using 'make BOPT=g ex19' or with optimization using 'make BOPT=O ex19'.  Note that if you want to use the C++ compiler to compile C, you may need to use BOPT=g_c++ or BOPT=O_c++ instead.  You can check which libraries have been built at your site by looking for $PETSC_DIR/lib/lib$BOPT/$PETSC_ARCH/.
  4. Run the executable as you would any other MPI program.  For example, if you're using MPICH:

            mpirun -np 2 ex19

This should produce output of the form:

lid velocity = 0.0204082, prandtl # = 1, grashof # = 1
Number of Newton iterations = 2
lid velocity = 0.0204082, prandtl # = 1, grashof # = 1
Number of Newton iterations = 2
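
If the build or run fails, the environment settings from steps 1 and 3 are the first thing to check.  The following is a minimal Python sketch of that check, not part of PETSc: the helper name petsc_lib_dir is hypothetical, and the only thing it relies on is the $PETSC_DIR/lib/lib$BOPT/$PETSC_ARCH/ layout described in step 3.

```python
import os


def petsc_lib_dir(env, bopt="g"):
    """Return the library directory implied by PETSC_DIR, PETSC_ARCH and BOPT.

    `env` is any mapping of environment variables (for example os.environ).
    The function name is hypothetical; the path layout follows step 3 above:
    $PETSC_DIR/lib/lib$BOPT/$PETSC_ARCH/
    """
    for name in ("PETSC_DIR", "PETSC_ARCH"):
        if not env.get(name):
            raise KeyError(name + " is not set")
    return os.path.join(env["PETSC_DIR"], "lib", "lib" + bopt, env["PETSC_ARCH"])


if __name__ == "__main__":
    try:
        path = petsc_lib_dir(os.environ, bopt="g")
    except KeyError as err:
        print("Environment problem:", err)
    else:
        # Report whether libraries for this BOPT appear to have been built.
        status = "found" if os.path.isdir(path) else "not found"
        print(path, status)
```

Pass bopt="O", "g_c++" or "O_c++" to check the other build variants mentioned in step 3.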

There are hundreds of command line options available for any PETSc application.  You can get a context-sensitive list of available options using the -help option.  You may wish to pipe the output into your favorite pager.  Note that because the help is context-sensitive, 'ex19 -pc_type none -help' and 'ex19 -help' will return different option lists.

We recommend using the options

    ex19 -grashof 500 -da_grid_x 50 -da_grid_y 50 -snes_monitor

This should produce output of the form:

lid velocity = 0.00010203, prandtl # = 1, grashof # = 500
0 SNES Function norm 5.049979275323e+00 
1 SNES Function norm 2.260377609539e-01 
2 SNES Function norm 2.859555935630e-04 
3 SNES Function norm 1.886111062771e-09 
Number of Newton iterations = 3
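
For performance studies it can be handy to pull the residual norms out of the -snes_monitor output and look at how quickly they shrink.  The sketch below is an illustration only: it assumes the monitor lines have exactly the form shown above, and the function names snes_norms and reduction_factors are hypothetical helpers, not PETSc functions.

```python
import re

# Matches lines such as "1 SNES Function norm 2.260377609539e-01".
MONITOR_LINE = re.compile(r"^\s*(\d+)\s+SNES Function norm\s+(\S+)")


def snes_norms(text):
    """Return the function norms, in iteration order, found in monitor output."""
    norms = []
    for line in text.splitlines():
        match = MONITOR_LINE.match(line)
        if match:
            norms.append(float(match.group(2)))
    return norms


def reduction_factors(norms):
    """Return the ratio of each norm to the one before it."""
    return [later / earlier for earlier, later in zip(norms, norms[1:])]


# Sample text copied from the output shown above.
SAMPLE = """\
lid velocity = 0.00010203, prandtl # = 1, grashof # = 500
0 SNES Function norm 5.049979275323e+00
1 SNES Function norm 2.260377609539e-01
2 SNES Function norm 2.859555935630e-04
3 SNES Function norm 1.886111062771e-09
Number of Newton iterations = 3
"""

if __name__ == "__main__":
    factors = reduction_factors(snes_norms(SAMPLE))
    for iteration, factor in enumerate(factors, start=1):
        print("iteration %d: norm reduced by factor %.3e" % (iteration, factor))
```

For the sample above, each reduction factor is much smaller than the previous one, which is consistent with the rapid convergence expected of Newton's method near a solution.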

To obtain a performance summary, use the -log_summary option.  Note that the output assumes a wide (120 column) terminal.   The output for the options above should look something like this

Increasing -grashof or -lidvelocity will make the problem harder.  David Keyes indicates that up to 10^5 is realistic for -grashof and up to 10^2 is realistic for -lidvelocity (10^4 if you don't mind the problem being nonphysical).  Grid size (-da_grid_x and -da_grid_y) can go up to 100 x 100 or so, especially if you're using many processors.
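
A sweep over problem difficulty and grid size can be scripted by generating the command lines.  The sketch below is one possible way to do this: it uses only the options mentioned on this page and the MPICH-style mpirun invocation shown earlier, while the helper name sweep_commands and the particular parameter values are assumptions chosen for illustration within the ranges given above.

```python
import itertools


def sweep_commands(grashof_values, grid_sizes, nprocs=2, executable="ex19"):
    """Build one mpirun command line per (grashof, grid size) combination.

    Hypothetical helper: it only formats strings and does not run anything.
    Square grids are assumed, so -da_grid_x and -da_grid_y get the same value.
    """
    commands = []
    for grashof, n in itertools.product(grashof_values, grid_sizes):
        commands.append(
            "mpirun -np %d %s -grashof %d -da_grid_x %d -da_grid_y %d "
            "-snes_monitor -log_summary" % (nprocs, executable, grashof, n, n)
        )
    return commands


if __name__ == "__main__":
    # Illustrative values only, within the ranges suggested above.
    for command in sweep_commands([500, 10000, 100000], [50, 100]):
        print(command)
```

The printed lines can be pasted into a shell or a batch script.  Harder settings may need more Newton iterations or may fail to converge, so it is worth checking the -snes_monitor output of each run before comparing timings.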

PETSc is a large and complex library.  Feel free to e-mail Paul Hovland or Boyana Norris with PERC-related questions.  The PETSc team may be able to assist with general configuration and usage questions.