
   
Message Passing

A small amount of performance has been sacrificed to make most of p4's implementation code portable. During a p4_send, the user's data is copied into a p4 buffer, which contains a 40-byte header. (This step is bypassed if the user obtains the buffer, complete with header, by means of p4_msg_alloc and builds the message in it.) Once the buffer has been packed with the message and the header information (destination, sender, message type, length, data type, and acknowledge-request flag), p4 looks up the destination in a table to determine how to deliver the message.

If the sending and receiving processes share memory, the buffer is simply placed in the destination process's queue. If a machine-specific send operation is available (e.g., the two processes are on an iPSC/860), the vendor-specific send operation is used. If the message must travel over a TCP/IP network, p4 uses the socket already open to the destination process if there is one; otherwise it opens a socket first. Thus only connections that will actually be used are opened. Currently such sockets are left open, so a program can run out of them, although this seldom happens because modern workstations support many open sockets. A more sophisticated p4 implementation might close sockets to reuse their file descriptors.

Messages between machines with different data formats are translated with xdr. Translation is done only when absolutely necessary; p4 contains a table of those pairs of machines that require it.
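The delivery decision described above can be illustrated with a minimal sketch. It is not p4's actual code: the struct layouts, the names choose_route and socket_for, and the stand-in fake_open_connection are all hypothetical, and the header shown only lists the fields named in the text without reproducing the real 40-byte layout.

```c
#include <stddef.h>

/* Hypothetical transport kinds; p4's real table format is not shown in the text. */
enum route { ROUTE_SHMEM, ROUTE_VENDOR, ROUTE_SOCKET };

/* Hypothetical header carrying the fields listed in the text. */
struct msg_header {
    int dest;       /* destination process */
    int from;       /* sending process */
    int type;       /* message type */
    int len;        /* length of the data */
    int data_type;  /* data type, used to decide on translation */
    int ack_req;    /* acknowledge-request flag */
};

/* One illustrative table entry per destination process. */
struct dest_entry {
    int shares_memory;  /* nonzero if sender and receiver share memory */
    int vendor_send;    /* nonzero if a machine-specific send exists */
    int sock_fd;        /* open socket descriptor, or -1 if none yet */
};

/* Stand-in for really connecting to a remote process (assumption). */
static int fake_open_connection(int dest) { return 100 + dest; }

/* Pick the delivery path in the order the text describes. */
enum route choose_route(const struct dest_entry *e) {
    if (e->shares_memory) return ROUTE_SHMEM;
    if (e->vendor_send)   return ROUTE_VENDOR;
    return ROUTE_SOCKET;
}

/* Return the socket for dest, opening it only on first use.
   *opens counts how many connections were actually opened. */
int socket_for(struct dest_entry *table, int dest, int *opens) {
    if (table[dest].sock_fd < 0) {
        table[dest].sock_fd = fake_open_connection(dest);
        (*opens)++;
    }
    return table[dest].sock_fd;
}
```

Calling socket_for twice for the same destination opens only one connection, which mirrors the statement that only connections actually used are opened and that they are then left open.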

In some cases the copying of messages into p4-maintained buffers can speed things up. On the Intel iPSC/860 and DELTA, p4 can call Intel's isend and return immediately to the user, rather than calling csend, which blocks until the message has been sent. The p4 buffer is flagged as "in use," and the isend is waited on only when the buffer is actually needed again, by which time the isend will probably have completed.
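The deferred wait can be sketched as follows. This is an illustration only: fake_async_send and fake_wait are hypothetical stand-ins for a vendor's asynchronous send and its completion wait, and the send_buf struct is not p4's buffer type.

```c
#include <stddef.h>

/* Counts how often the code actually had to block (for illustration). */
static int waits_performed = 0;

/* Hypothetical stand-ins for an asynchronous send and its completion wait. */
static int fake_async_send(const void *data, size_t len) {
    (void)data; (void)len;
    return 7;                      /* pretend message handle */
}
static void fake_wait(int handle) {
    (void)handle;
    waits_performed++;
}

struct send_buf {
    char data[256];
    int in_use;   /* set while an asynchronous send may still read data */
    int handle;   /* handle returned by the asynchronous send */
};

/* Start the send and return at once, flagging the buffer as in use. */
void start_send(struct send_buf *b, size_t len) {
    b->handle = fake_async_send(b->data, len);
    b->in_use = 1;
}

/* Called only when the buffer is needed again: wait if still flagged. */
char *reclaim(struct send_buf *b) {
    if (b->in_use) {
        fake_wait(b->handle);
        b->in_use = 0;
    }
    return b->data;
}
```

The sender pays for the wait only in reclaim, and only if the buffer is still flagged; a second call to reclaim on the same buffer does not wait again.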

During a receive operation, all possible sources of incoming messages are checked until the criteria (source and type) specified on the p4_recv are satisfied. For transmission layers where the size of a message is available before the message is read (on TCP/IP networks this is done by reading the header before the rest of the message), a buffer is allocated for the message to be read into, and p4_recv returns a pointer to this buffer. Thus the user need not know the size of the message ahead of time. Alternatively, the user can allocate a buffer ahead of time, which allows the same area to be reused for multiple messages.
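The header-first receive can be sketched against an in-memory byte stream instead of a real socket. Everything here is an assumption made for illustration: the wire_header layout, the stream type, and the function names are hypothetical and do not reflect p4's wire format.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed-size header sent ahead of every message. */
struct wire_header { int type; int from; int len; };

/* In-memory stand-in for a socket: bytes are consumed from pos onward. */
struct stream { const unsigned char *bytes; size_t size; size_t pos; };

/* Read exactly n bytes, or fail with -1 if the stream is too short. */
static int stream_read(struct stream *s, void *out, size_t n) {
    if (s->size - s->pos < n) return -1;
    memcpy(out, s->bytes + s->pos, n);
    s->pos += n;
    return 0;
}

/* Read the header first, allocate a buffer of the advertised length,
   then read the payload into it. The caller frees the result.
   Returns NULL on a short stream, bad length, or allocation failure. */
void *recv_sized(struct stream *s, struct wire_header *h) {
    void *buf;
    if (stream_read(s, h, sizeof *h) != 0) return NULL;
    if (h->len < 0) return NULL;
    buf = malloc(h->len > 0 ? (size_t)h->len : 1);
    if (buf == NULL) return NULL;
    if (stream_read(s, buf, (size_t)h->len) != 0) {
        free(buf);
        return NULL;
    }
    return buf;
}
```

Because the length arrives in the header, the caller of recv_sized never has to guess a buffer size; it learns the length from the filled-in header after the call returns.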

Allocation and deallocation of buffers are optimized by maintaining a pool of available buffers of varying sizes. The sizes of these buffer pools can be set by the user, using p4_set_buf. The default is to maintain pools for messages of sizes 64, 256, 1K, 4K, 16K, 256K, and 1M.
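One plausible way to map a message to a pool is to pick the smallest size class that fits. The sketch below assumes this policy, and also assumes that the listed sizes are in bytes with 1K = 1024; the text states neither, and pool_index_for is a hypothetical name.

```c
#include <stddef.h>

/* Default size classes from the text, assumed to be bytes (1K = 1024). */
static const size_t pool_sizes[] = {
    64, 256, 1024, 4096, 16384, 262144, 1048576
};
#define NUM_POOLS (sizeof pool_sizes / sizeof pool_sizes[0])

/* Return the index of the smallest pool whose buffers can hold msg_len
   bytes, or -1 if the message is larger than the largest size class. */
int pool_index_for(size_t msg_len) {
    size_t i;
    for (i = 0; i < NUM_POOLS; i++) {
        if (msg_len <= pool_sizes[i]) return (int)i;
    }
    return -1;
}
```

Under these assumptions a 65-byte message would come from the 256-byte pool, and a message larger than 1M would need a buffer allocated outside the pools.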


Karen D. Toonen
1998-11-19