Tuning MPI Programs for Peak Performance
Short Version

William Gropp

Overview

Background and Models

Background

What is message passing?

Quick review of MPI message passing

Abstract Model of MPI Implementation

The MPI Automaton

Message protocols

Special Protocols for DSM

Message Protocol Details

Eager Protocol

Eager Features

How Scalable is Eager Delivery?

Rendezvous Protocol

Rendezvous Features

Short Protocol

User and System Buffering

Packetization

Non-contiguous Datatypes

Synchronization Delays

Polling Mode MPI

Interrupt Mode MPI

Example of the effect of Polling

More on Synchronization Delays

Related effects

Contention

Effect of contention

Memory copies

Performance of MPI Datatypes

Packet sizes/stepping

Example of Packetization

Correlating processes for synchronization delays

MPI-Specific Tuning

Constant stride datatypes

Contiguous Structures

Improving structure performance

Tuning for MPI protocols

Aggressive Eager

Tuning for Aggressive Eager

Rendezvous with Sender Push

Rendezvous Blocking

Tuning for Rendezvous with Sender Push

Rendezvous with Receiver Pull

Tuning for Rendezvous with Receiver Pull

Scheduling for contention

Some Example Results

Summary of Results

Pitfalls I

Pitfalls II

Review of Techniques I

Review of Techniques II

Review of Techniques III