Tuning MPI Programs for Peak Performance
Short Version
William Gropp
Overview
Background and Models
Background
What is message passing?
Quick review of MPI message passing
Abstract Model of MPI Implementation
The MPI Automaton
Message protocols
Special Protocols for DSM
Message Protocol Details
Eager Protocol
Eager Features
How Scalable is Eager Delivery?
Rendezvous Protocol
Rendezvous Features
Short Protocol
User and System Buffering
Packetization
Non-contiguous Datatypes
Synchronization Delays
Polling Mode MPI
Interrupt Mode MPI
Example of the effect of Polling
More on Synchronization Delays
Related effects
Contention
Effect of contention
Memory copies
Performance of MPI Datatypes
Packet sizes/stepping
Example of Packetization
Correlating processes for synchronization delays
MPI-Specific Tuning
Constant stride datatypes
Contiguous Structures
Improving structure performance
Tuning for MPI protocols
Aggressive Eager
Tuning for Aggressive Eager
Rendezvous with Sender Push
Rendezvous Blocking
Tuning for Rendezvous with Sender Push
Rendezvous with Receiver Pull
Tuning for Rendezvous with Receiver Pull
Scheduling for contention
Some Example Results
Summary of Results
Pitfalls I
Pitfalls II
Review of Techniques I
Review of Techniques II
Review of Techniques III