Argonne National Laboratory Mathematics and Computer Science Division

Seminars & Events


"Can MPI-2 RMA Operations Benefit Real World Applications?"

DATE: May 5, 2010
TIME: 2:00 PM - 3:00 PM
SPEAKER: Dr. Sayantan Sur, Research Scientist at the Department of Computer Science at The Ohio State University
LOCATION: Building 240, Conference Room 4301, Argonne National Laboratory
HOST: Pavan Balaji

Description:
The MPI-2 specification introduced Remote Memory Access (RMA) semantics in the late 1990s. RMA semantics hold the promise of effectively leveraging the Remote Direct Memory Access (RDMA) features provided by advanced network adapters and of helping scientific applications achieve good overlap of communication and computation. However, only a relatively small number of real applications currently take advantage of RMA to hide communication latency. While some researchers have raised questions about the usability of this interface, others have questioned the performance and scalability offered by RMA operations.

In this work, we demonstrate how a real NSF TeraGrid application, AWM-Olsen (recently renamed AWM-ODC), can be modified to expose overlap between computation and communication. This application runs on tens of thousands of cores and consumes several million CPU hours on the TeraGrid clusters every year. Some of the most detailed simulations to date of earthquakes along the San Andreas fault were carried out using this code, including the well-known TeraShake and SCEC ShakeOut simulations. A significant portion of its run time (37% in a 4K-process run) is spent in MPI communication routines. With our modified AWM-ODC application, which leverages MPI-2 Active Target Synchronization semantics, we show that the performance of the application can be boosted by 12% on 4K processors and by 10% on 8K processors.

BIO: Dr. Sayantan Sur is a Research Scientist in the Department of Computer Science at The Ohio State University. His research interests include high-speed interconnection networks, high-performance computing, fault tolerance, and parallel computer architecture. He has published more than 15 papers in major conferences and journals related to these research areas. He is a member of the Network-Based Computing Laboratory led by Dr. D. K. Panda. He is currently collaborating with national laboratories, supercomputer centers, and leading InfiniBand companies on designing various subsystems of next-generation high-performance computing platforms. He has contributed significantly to the MVAPICH/MVAPICH2 (High Performance MPI over InfiniBand and 10GigE/iWARP) open-source software packages. The software developed as part of this effort is currently used by over 1,110 organizations in 56 countries. In the past, he has held the positions of Postdoctoral Researcher at the IBM T. J. Watson Research Center, Hawthorne, and Member Technical Staff at Sun Microsystems. Dr. Sur received his Ph.D. degree from The Ohio State University in 2007.



