At the lowest level, what is really needed is just a way to transfer data, possibly in small amounts, from one process's address space to another's. Although many implementations are possible, the specification requires only a small number of definitions. The channel interface, described in more detail in [28], consists of only five required functions. Three routines send and receive envelope (or control) information: MPID_SendControl, MPID_RecvAnyControl, and MPID_ControlMsgAvail; two routines send and receive data: MPID_SendChannel and MPID_RecvFromChannel. (One can use MPID_SendControlBlock instead of or along with MPID_SendControl; it can be more efficient to use the blocking version for implementing blocking calls.) Other routines, which might be available in specially optimized implementations, are used only when the macros that signal their availability are defined. These include various forms of blocking and nonblocking operations for both envelopes and data.
These operations are based on a simple capability to send data from one process to another process. No more functionality is required than what is provided by Unix in the select, read, and write operations. The ADI code uses these simple operations to provide the operations, such as MPID_Post_recv, that are used by the MPI implementation.
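As a rough illustration of how little is needed, the following sketch moves an envelope and then its data through a Unix pipe using only select, read, and write. The envelope_t layout and the helper names (write_all, read_all, control_msg_avail, channel_demo) are hypothetical and chosen for this example; they are not the MPICH routines or their signatures.

```c
#include <assert.h>
#include <string.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical envelope layout; the real control packet is not shown in the text. */
typedef struct {
    int tag;
    int src;
    int len;   /* number of data bytes that follow the envelope */
} envelope_t;

/* Write exactly n bytes; returns 0 on success, -1 on error. */
static int write_all(int fd, const void *buf, size_t n) {
    const char *p = buf;
    assert(buf != NULL);
    while (n > 0) {
        ssize_t k = write(fd, p, n);
        if (k <= 0) return -1;
        p += k;
        n -= (size_t)k;
    }
    return 0;
}

/* Read exactly n bytes; returns 0 on success, -1 on error or end of file. */
static int read_all(int fd, void *buf, size_t n) {
    char *p = buf;
    while (n > 0) {
        ssize_t k = read(fd, p, n);
        if (k <= 0) return -1;
        p += k;
        n -= (size_t)k;
    }
    return 0;
}

/* Nonblocking "is a control message available?" test built on select. */
static int control_msg_avail(int fd) {
    fd_set rd;
    struct timeval tv = {0, 0};   /* zero timeout: poll and return at once */
    FD_ZERO(&rd);
    FD_SET(fd, &rd);
    return select(fd + 1, &rd, NULL, NULL, &tv) > 0;
}

/* Round trip through a pipe inside one process, only to exercise the calls.
   Returns 0 if the envelope and data arrive intact, -1 otherwise. */
static int channel_demo(void) {
    int fds[2];
    const char payload[] = "hello";
    envelope_t out = { 7, 0, (int)sizeof payload }, in;
    char data[sizeof payload];

    if (pipe(fds) != 0) return -1;
    int ok = !control_msg_avail(fds[0])                        /* nothing sent yet */
          && write_all(fds[1], &out, sizeof out) == 0          /* envelope first */
          && write_all(fds[1], payload, sizeof payload) == 0   /* then the data */
          && control_msg_avail(fds[0])
          && read_all(fds[0], &in, sizeof in) == 0
          && in.len == (int)sizeof data
          && read_all(fds[0], data, sizeof data) == 0
          && in.tag == 7 && strcmp(data, "hello") == 0;
    close(fds[0]);
    close(fds[1]);
    return ok ? 0 : -1;
}
```

Calling channel_demo() from a main function should return 0 on a POSIX system. Between two real processes the same helpers would run over a socket or pipe pair rather than a single in-process pipe.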
The issue of buffering is a difficult one. We could have defined an interface that assumed no buffering, requiring the ADI that calls this interface to perform the necessary buffer management and flow control. The rationale for not making this choice is that many of the systems used for implementing the interface defined here do maintain their own internal buffers and flow controls, and implementing another layer of buffer management would impose an unnecessary performance penalty.
The channel interface implements three different data exchange mechanisms.
Eager. In the eager protocol, data is sent to the destination immediately; if no matching receive has been posted, the receiver must buffer the data. This choice often offers the highest performance, particularly when the underlying implementation provides suitable buffering and handshakes. However, it can cause problems when large amounts of data are sent before their matching receives are posted, causing memory to be exhausted on the receiving processors.
This is the default choice in MPICH.
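The receiver-side buffering that the eager protocol relies on, and the memory growth it risks, can be sketched as a queue of unexpected messages. This is a minimal single-process sketch under stated assumptions: the types and functions (msg_t, unexpected_queue_t, eager_arrive, eager_recv) are hypothetical, matching is by tag only, and none of this is MPICH's actual data structure.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One message that arrived before its receive was posted (hypothetical). */
typedef struct msg {
    int tag;
    size_t len;
    char *data;            /* heap copy held until a matching receive arrives */
    struct msg *next;
} msg_t;

typedef struct {
    msg_t *head, *tail;
    size_t bytes_buffered; /* grows with every unmatched eager message */
} unexpected_queue_t;

/* Eager arrival with no posted receive: copy the data and enqueue it.
   Returns 0 on success, -1 if the receiver is out of memory. */
static int eager_arrive(unexpected_queue_t *q, int tag,
                        const void *data, size_t len) {
    msg_t *m = malloc(sizeof *m);
    if (!m) return -1;
    m->data = malloc(len ? len : 1);
    if (!m->data) { free(m); return -1; }
    memcpy(m->data, data, len);
    m->tag = tag;
    m->len = len;
    m->next = NULL;
    if (q->tail) q->tail->next = m; else q->head = m;
    q->tail = m;
    q->bytes_buffered += len;
    return 0;
}

/* A receive is posted: take the first buffered message with this tag.
   Returns the number of bytes copied, or -1 if nothing matches. */
static long eager_recv(unexpected_queue_t *q, int tag, void *buf, size_t cap) {
    msg_t **pp = &q->head, *prev = NULL;
    for (msg_t *m = q->head; m; prev = m, pp = &m->next, m = m->next) {
        if (m->tag != tag) continue;
        size_t n = m->len < cap ? m->len : cap;
        memcpy(buf, m->data, n);           /* the extra copy eager may pay for */
        *pp = m->next;                     /* unlink from the queue */
        if (q->tail == m) q->tail = prev;
        q->bytes_buffered -= m->len;
        free(m->data);
        free(m);
        return (long)n;
    }
    return -1;
}
```

The bytes_buffered counter makes the failure mode visible: if senders run far ahead of the matching receives, it grows without bound until malloc fails.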
Rendezvous. In the rendezvous protocol, only the control information describing the message is sent immediately; the data is sent once the destination has posted a matching receive and requested it. This choice is the most robust but, depending on the underlying system software, may be less efficient than the eager protocol. Some legacy programs may fail when run using a rendezvous protocol if an algorithm is unsafely expressed in terms of MPI_Send. Such a program can be safely expressed in terms of MPI_Bsend, but at a possible cost in efficiency. That is, the user may desire the semantics of an eager protocol (messages are buffered on the receiver) with the performance of the rendezvous protocol (no copying), but since buffer space is exhaustible and MPI_Bsend may have to copy, the user may not always be satisfied.
MPICH can be configured to use this protocol by specifying -use_rndv during configuration.
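The handshake behind a rendezvous-style exchange can be sketched as a small state machine: the envelope travels first, and the payload moves only after the receiver asks for it. The sketch below runs in a single process and is illustrative only; rndv_send_t and the rndv_* functions are hypothetical names, not part of MPICH.

```c
#include <assert.h>
#include <string.h>

typedef enum {
    RNDV_IDLE,
    RNDV_ENVELOPE_SENT,    /* receiver knows about the message, has no data */
    RNDV_DATA_REQUESTED,   /* a matching receive was posted */
    RNDV_DONE
} rndv_state_t;

/* Sender-side record of one pending message (hypothetical). */
typedef struct {
    rndv_state_t state;
    int tag;
    const char *send_buf;  /* payload stays in the sender's memory */
    size_t len;
} rndv_send_t;

/* Step 1: the sender announces the message; no payload moves yet. */
static void rndv_start(rndv_send_t *s, int tag, const void *buf, size_t len) {
    assert(s != NULL);
    s->state = RNDV_ENVELOPE_SENT;
    s->tag = tag;
    s->send_buf = buf;
    s->len = len;
}

/* Step 2: the receiver posts a receive and requests the data.
   Returns 1 if the request matches the announced message, else 0. */
static int rndv_request(rndv_send_t *s, int tag) {
    if (s->state != RNDV_ENVELOPE_SENT || s->tag != tag) return 0;
    s->state = RNDV_DATA_REQUESTED;
    return 1;
}

/* Step 3: the payload goes straight into the user's buffer, so no
   intermediate receiver-side copy is needed. Returns bytes delivered,
   or 0 if the data has not been requested or the buffer is too small. */
static size_t rndv_deliver(rndv_send_t *s, void *user_buf, size_t cap) {
    if (s->state != RNDV_DATA_REQUESTED || cap < s->len) return 0;
    memcpy(user_buf, s->send_buf, s->len);
    s->state = RNDV_DONE;
    return s->len;
}
```

Because rndv_deliver refuses to run before rndv_request succeeds, a send cannot complete until its receive is posted, which is why a program that depends on MPI_Send being buffered can deadlock under this protocol.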
Get. In the get protocol, the receiver reads the data directly from the sender. This choice offers the highest performance but requires special hardware support such as shared memory or remote memory operations. In many ways, it functions like the rendezvous protocol, but uses a different set of routines to transfer the data.
To implement this protocol, special routines must be provided to prepare the address for remote access and to perform the transfer. The implementation of this protocol allows data to be transferred in several pieces, so that, for example, arbitrarily sized messages can be transferred using a limited amount of shared memory. The routine MPID_SetupGetAddress is called by the sender to determine the address to send to the destination. In shared-memory systems, this may simply be the address of the data (if all memory is visible to all processes) or the address in shared memory where all (or some) of the data has been copied. In systems with special hardware for moving data between processors, it may be the appropriate handle or object.
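The piecewise transfer through a limited shared region can be sketched as follows. This is a single-process illustration under stated assumptions: WINDOW_BYTES, shared_window_t, stage_piece, get_piece, and get_transfer are hypothetical names, the "shared" region is an ordinary struct, and the synchronization that two real processes would need between staging and reading is omitted.

```c
#include <assert.h>
#include <string.h>

#define WINDOW_BYTES 8   /* hypothetical size of the shared region */

typedef struct {
    char window[WINDOW_BYTES];
    size_t valid;        /* bytes currently staged in the window */
} shared_window_t;

/* Sender side: copy the next piece of the message into the window.
   Returns the number of bytes staged. */
static size_t stage_piece(shared_window_t *w, const char *src,
                          size_t len, size_t offset) {
    size_t n = len - offset;
    if (n > WINDOW_BYTES) n = WINDOW_BYTES;
    memcpy(w->window, src + offset, n);
    w->valid = n;
    return n;
}

/* Receiver side: read the staged piece directly out of the window. */
static size_t get_piece(shared_window_t *w, char *dst, size_t offset) {
    size_t n = w->valid;
    memcpy(dst + offset, w->window, n);
    w->valid = 0;        /* window is free for the next piece */
    return n;
}

/* Move a message of any length through the fixed-size window.
   Returns the bytes transferred and stores the piece count in *pieces. */
static size_t get_transfer(const char *src, char *dst, size_t len,
                           size_t *pieces) {
    shared_window_t w = { {0}, 0 };
    size_t done = 0;
    *pieces = 0;
    while (done < len) {
        size_t n = stage_piece(&w, src, len, done);
        size_t got = get_piece(&w, dst, done);
        assert(got == n);
        done += got;
        ++*pieces;
    }
    return done;
}
```

With an 8-byte window, a message of len bytes takes ceil(len / 8) pieces; a larger window trades shared memory for fewer handshakes.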

Figure 8: Lower layers of MPICH