AdjoinableMPI
User Guide

Introduction

The Adjoinable MPI (AMPI) library provides a modified set of MPI subroutines that are constructed such that:

  • an adjoint in the context of algorithmic differentiation (AD) can be computed,
  • it can be supported by a variety of AD tools,
  • it also enables the computation of (higher-order) forward derivatives,
  • it provides an implementation for a straight pass-through to MPI such that the switch to AMPI can be made permanent without forcing compile dependencies on any AD tool.

There are principal recipes for the construction of the adjoint of a given communication, see [1]. The practical implementation of these recipes, however, faces the following challenges:

  • the target language may prevent some implementation options
    • exposing an MPI_Request augmented with extra information as a structured type (not supported by Fortran 77)
    • passing an array of buffers (of different length), e.g. to AMPI_Waitall, as an additional argument (not supported in any Fortran version)
  • the AD tool implementation could be based on
    • operator overloading
      • original data and (forward) derivatives co-located (e.g. Rapsodia, dco)
      • original data and (forward) derivatives referenced (e.g. ADOL-C)
    • source transformation
      • association by address (e.g. OpenAD)
      • association by name (e.g. Tapenade)

The above choices have consequences for the complexity of implementing the adjoint (and forward derivative) action, and this could imply differences in the AMPI design. However, from a user's perspective it is a clear advantage to present a single AMPI library that is independent of the AD tool implementation, such that switching AD tools is not hindered by AMPI, while also promoting a common understanding of the differentiation through MPI calls. We assume the reader is familiar with MPI and AD concepts.

Getting the library sources

The sources can be accessed through the AdjoinableMPI mercurial repository. Bug tracking, feature requests etc. are handled via trac. In the following we assume the sources are cloned (cf. the mercurial web site for details about mercurial) into a directory AdjoinableMPI by invoking

hg clone http://mercurial.mcs.anl.gov/ad/AdjoinableMPI

Library - Configure, Build, and Install

Configuration, build, and install follow the typical GNU autotools chain. Go to the source directory

cd AdjoinableMPI

If the sources were obtained from the mercurial repository, then one first needs to run the autotools via invoking

./autogen.sh

In the typical autoconf fashion invoke

configure --prefix=<installation directory> ...

in or outside the source tree. The AD tool supporting AMPI should provide information on which detailed AMPI configure settings are required, if any. Build the libraries with

make

Optionally, before installing, one can do a sanity check by running make check.

To install the header files and compiled libraries follow with

make install

after which one should find the following under <installation directory>:

  • header files: see also Directory and File Structure
  • libraries:
    • libampiPlainC - for pass through to MPI, no AD functionality
    • libampiCommon - implementation of AD functionality shared between all AD tools supporting AMPI
    • libampiBookkeeping - implementation of AD functionality needed by some AD tools (see the AD tool documentation)
    • libampiTape - implementation of AD functionality needed by some AD tools (see the AD tool documentation)

Note, the following libraries are AMPI internal:

  • libampiADtoolStubsOO - stubs for operator overloading AD tools not needed by the user
  • libampiADtoolStubsST - stubs for source transformation AD tools not needed by the user

Switching from MPI to Adjoinable MPI

For a given MPI-parallelized source code the user will replace all calls to MPI_... routines with the respective AMPI_... equivalent provided in User-Interface declarations. To include the declarations replace

  • in C/C++: includes of mpi.h with
    #include <ampi/ampi.h>
  • in Fortran: includes of mpif.h with
    #include <ampi/ampif.h>

respectively.

In many cases certain MPI calls (e.g. for initialization and finalization) take place outside the scope of the original computation and its AD derivatives and therefore do not themselves become part of the AD process; see the explanations in Using subroutine variants NT vs non-NT relative to the differentiable section. Each routine in this documentation lists the changes to the parameters relative to the MPI standard. These changes impact, for example, the datatype and request parameters discussed in Datatype consistency and Request Type.

Some routines require new parameters specifying the pairing of two-sided communications, see Pairings. Similarly to the various approaches (preprocessing, templating, using typedef) employed to effect a change to an active type for overloading-based AD tools, this switch from MPI to AMPI routines should be done as a one-time effort. Because AMPI provides an implementation for a straight pass-through to MPI, it is possible to make this switch permanent, retain builds that are completely independent of any AD tool, and use AMPI as a thin wrapper library to MPI.

Application - compile and link

After the switch described in Switching from MPI to Adjoinable MPI is done, the application should be recompiled with the include path addition

-I<installation directory>/include

and linked with the link path extension

-L<installation directory>/lib[64]

Note that the name of the subdirectory (lib or lib64) depends on the system. In addition, link the appropriate set of libraries, see Library - Configure, Build, and Install; the optional ones in square brackets depend on the AD tool:

-lampiCommon [ -lampiBookkeeping -lampiTape ]

or, if a straight pass-through to MPI is desired instead of differentiation by AD, link with

-lampiPlainC

instead.

Directory and File Structure

All locations discussed below are relative to the top level source directory. The top level header file to be included in place of the usual "mpi.h" is located in ampi/ampi.h.

It references the header files in ampi/userIF, see also User-Interface header files, which are organized to contain

  • unmodified pass through to MPI in ampi/userIF/passThrough.h which exists to give the extent of the original MPI we cover
  • variants of routines that in principle need adjoint logic but happen to be called outside of the code section that is adjoined and therefore are not transformed / not traced (NT) in ampi/userIF/nt.h
  • routines that are modified from the original MPI counterparts because their behavior in the reverse sweep differs from their behavior in the forward sweep and they also may have a modified signature; in ampi/userIF/modified.h
  • routines that are specific for some variants of source transformation (ST) approaches in ampi/userIF/st.h; while these impose a larger burden for moving from MPI to AMPI on the user, they also enable a wider variety of transformations currently supported by the tools; we anticipate that the ST specific versions may become obsolete as the source transformation tools evolve to support all transformations via the routines in ampi/userIF/modified.h

Additional header files contain enumerations used as arguments to AMPI routines. All declarations that are part of the user interface are grouped in User-Interface declarations. All other declarations in header files in the library are not to be used directly in the user code.

A library that simply passes through all AMPI calls to their MPI counterparts for a test compilation and execution without any involvement of an AD tool is implemented in the source files in the PlainC directory.

Using subroutine variants NT vs non-NT relative to the differentiable section

The typical assumption for a program to be differentiated is that there is some top level routine head which does the numerical computation and communication and which is called from some main driver routine. The driver routine would have to be manually adjusted to initiate the derivative computation and to retrieve and use the derivative values. Therefore only head and everything it references would be adjoined while driver would not. Typically, the driver routine also includes the basic setup and teardown with MPI_Init and MPI_Finalize, and consequently these calls (for consistency) should be replaced with their AMPI "no trace/transformation" (NT) counterparts AMPI_Init_NT and AMPI_Finalize_NT. The same approach should be taken for all resource allocations/deallocations (e.g. AMPI_Buffer_attach_NT and AMPI_Buffer_detach_NT) that can exist in the scope enclosing the adjoined section, which alleviates the need for the AD tool implementation to tackle them. For cases where these routines have to be called within the adjoined code section, the variants without the _NT suffix will ensure the correct adjoint behavior.

General Assumptions on types and Communication Patterns

Datatype consistency

Because the MPI standard passes buffers as void* (aka choice) the information about the type of the buffer and in particular the distinction between active and passive data (in the AD sense) must be conveyed via the datatype parameters and be consistent with the type of the buffer. To indicate buffers of active type the library predefines the following

  • for C/C++
    • AMPI_ADOUBLE as the active variant of the passive MPI_DOUBLE
    • AMPI_AFLOAT as the active variant of the passive MPI_FLOAT
  • for Fortran

Passive buffers can be used as parameters to the AMPI interfaces with respective passive data type values.
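The role of the datatype parameter can be illustrated with a small sketch that uses neither MPI nor AMPI. The type active_double, the enumeration sketch_datatype, and the function sketch_pack_values are hypothetical stand-ins introduced only for this illustration; they are not AMPI declarations, and the sketch assumes an overloading-style active type with co-located value and derivative.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical active type: value and derivative stored side by side.
   Illustration only, not an AMPI or AD tool declaration. */
typedef struct {
    double value;
    double deriv;
} active_double;

/* Hypothetical stand-ins for a passive and an active datatype tag. */
typedef enum { SKETCH_DOUBLE, SKETCH_ADOUBLE } sketch_datatype;

/* Copy the plain values out of a choice (void*) buffer.
   The layout of buf is known only through the datatype tag, so the tag
   must agree with the actual type of the buffer.
   Returns 0 on success and 1 for an unknown tag. */
int sketch_pack_values(const void *buf, size_t count,
                       sketch_datatype type, double *out)
{
    assert(buf != NULL && out != NULL);
    switch (type) {
    case SKETCH_DOUBLE: {
        const double *p = (const double *)buf;
        for (size_t i = 0; i < count; ++i)
            out[i] = p[i];
        return 0;
    }
    case SKETCH_ADOUBLE: {
        const active_double *p = (const active_double *)buf;
        for (size_t i = 0; i < count; ++i)
            out[i] = p[i].value; /* skip the derivative component */
        return 0;
    }
    }
    return 1;
}
```

If an active_double array were passed together with the passive tag, the routine would read derivative components as if they were values, which is the kind of inconsistency the datatype parameter has to rule out.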

Request Type

Because additional information has to be attached to the MPI_Request instances used in nonblocking communications, there is an expanded data structure to hold this information. Even though in some contexts (F77) this structure cannot be exposed to the user code the general approach is to declare variables that are to hold requests as AMPI_Request (instead of MPI_Request).
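What kind of information an expanded request might carry can be sketched as follows. This is not the layout of AMPI_Request; the structure sketch_request, its fields, and the helper functions are hypothetical and chosen only for illustration. The sketch assumes the common recipe in which the adjoint of a wait starts the reversed nonblocking communication, so the request must remember the buffer and whether it originated from a nonblocking send or receive.

```c
#include <assert.h>
#include <stddef.h>

/* Which nonblocking call created the request (hypothetical names). */
typedef enum { SKETCH_ORIGIN_ISEND, SKETCH_ORIGIN_IRECV } sketch_origin;

/* Which operation the reverse sweep starts at the adjoint of the wait. */
typedef enum { SKETCH_START_IRECV, SKETCH_START_ISEND } sketch_adjoint_op;

/* Hypothetical expanded request: a plain handle plus the context that
   the reverse sweep needs. The int handle stands in for an MPI_Request. */
typedef struct {
    int handle;
    void *buf;            /* buffer of the original communication */
    int count;            /* number of elements in buf */
    sketch_origin origin; /* isend or irecv */
} sketch_request;

/* Bundle the handle with the context known at the nonblocking call. */
sketch_request sketch_make_request(int handle, void *buf, int count,
                                   sketch_origin origin)
{
    sketch_request req;
    req.handle = handle;
    req.buf = buf;
    req.count = count;
    req.origin = origin;
    return req;
}

/* Decide what the adjoint of the wait has to start: the adjoint of a
   send direction is a receive and vice versa. */
sketch_adjoint_op sketch_adjoint_of_wait(const sketch_request *req)
{
    assert(req != NULL);
    return req->origin == SKETCH_ORIGIN_ISEND ? SKETCH_START_IRECV
                                              : SKETCH_START_ISEND;
}
```

A plain handle alone would not allow this decision at the wait, because the wait call itself receives neither the buffer nor the direction of the communication.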

Pairings

Following the explanations in [1], it is clear that context information about the communication pattern, that is the pairing of MPI calls, is needed to

  1. obtain correct adjoints, i.e. correct send and receive end points and freedom from deadlocks,
  2. retain, if possible, the efficiency advantages present in the original MPI communication for the adjoint.

In AMPI, pairings are conveyed via additional pairedWith parameters which may be set to AMPI_PairedWith enumeration values, see e.g. AMPI_Send or AMPI_Recv. The need to convey the pairing imposes restrictions because in a given code the pairing may not be static. For example, a given MPI_Recv may be paired with

if (doBufferedSends)
  MPI_Bsend(...);
else
  MPI_Ssend(...);

but the AD tool has to decide on the send mode once the reverse sweep needs to adjoin the original MPI_Recv. Tracing such information in a global data structure is not scalable, and piggybacking the send type onto the message so it can be traced on the receiving side is conceivable but not trivial and currently not implemented.

Restriction:
Pairing of send and receive modes must be static.

Note that this does not prevent the use of wildcards for source or tag.
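Why the receiving side has to know its counterpart can be seen from the basic recipe in which the adjoint of a receive acts as a send and the adjoint of a send acts as a receive that increments. The following is a minimal single-process sketch of that reversal; the one-slot channel and all sketch_ names are hypothetical and replace actual MPI communication, so send modes and the pairedWith parameters are not modeled.

```c
#include <assert.h>

/* Hypothetical one-slot channel standing in for a message in flight. */
typedef struct {
    double payload;
    int full;
} sketch_channel;

/* Forward sweep, sending side: put x into the channel. */
void sketch_send(sketch_channel *ch, double x)
{
    assert(!ch->full);
    ch->payload = x;
    ch->full = 1;
}

/* Forward sweep, receiving side: take the value out of the channel. */
double sketch_recv(sketch_channel *ch)
{
    assert(ch->full);
    ch->full = 0;
    return ch->payload;
}

/* Reverse sweep, receiving side: the adjoint of the receive sends the
   adjoint y_bar back and then resets it. */
void sketch_adjoint_of_recv(sketch_channel *ch, double *y_bar)
{
    sketch_send(ch, *y_bar);
    *y_bar = 0.0;
}

/* Reverse sweep, sending side: the adjoint of the send receives the
   incoming adjoint and increments x_bar with it. */
void sketch_adjoint_of_send(sketch_channel *ch, double *x_bar)
{
    *x_bar += sketch_recv(ch);
}
```

In the reverse sweep the former receiver initiates the communication, so it has to select a send operation without being able to inspect how the original sender behaved; this is the decision that a static pairing makes possible.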

Examples

A set of examples that illustrate the uses of AMPI together with setups for AD tools, and that also serve as regression tests, is collected in AdjoinableMPIexamples, which can be obtained similarly to the AMPI sources themselves by cloning

hg clone http://mercurial.mcs.anl.gov/ad/AdjoinableMPIexamples

The daily regression tests based on these examples report the results on the page linked via the main page of this documentation.