G. Almasi, C. Archer, J. G. Castanos, J. Gunnels, C. C. Erway, P. Heidelberger, X. Martorell, J. E. Moreira, K. Pinnow, J. Ratterman, B. Steinmacher-Burow, W. Gropp, B. Toonen, "The Design and Implementation of Message Passing Services for the BlueGene/L Supercomputer," Preprint ANL/MCS-P1183-0604, June 2004. [pdf]
The BlueGene/L supercomputer, with 65,536 dual-processor compute nodes, was designed from the ground up to support efficient execution of massively parallel message-passing programs. Part of this support is an optimized implementation of MPI that leverages the hardware features of BlueGene/L. MPI for BlueGene/L is implemented on top of a more basic message-passing infrastructure called the message layer, which can be used both to implement other higher-level libraries and directly by applications. MPI and the message layer support both modes of operation of BlueGene/L: coprocessor mode and virtual node mode. Performance measurements show that these message-passing services deliver performance close to the hardware limits of the machine. They also show that dedicating one processor of each node to communication (coprocessor mode) greatly improves message-passing bandwidth, whereas running two processes per compute node (virtual node mode) can improve application performance.