Celebrating 25 Years of MPI

This symposium brings together many of those who have participated in the development of the Message Passing Interface, from those involved before it even had a name to active members of the current (MPI-3) Forum. We will review the historical beginnings of MPI, discuss its strengths and weaknesses, and try to predict the future.

-- Rusty Lusk and Jesper Larsson Träff, Organizers

Venue

The symposium will take place all day on September 25th, 2017 at Argonne National Laboratory, near Chicago, as part of EuroMPI/USA 2017.

Agenda

Time Event

08:20-08:30

Rusty Lusk: Opening Remarks

08:30-09:40

The Prehistory

Jack Dongarra: In the Beginning
David Walker: Some Reflections on the MPI Forum 1993-5
Tony Hey: Early Days of Message-Passing Computing: Transputers, Occam and All That
Rolf Hempel: Suprenum, PARMACS and Transputers: A Personal View on the European way to MPI

09:40-10:00

The Beginning

Steve Huss-Lederman: Reflections on the MPI Process

10:00-10:20

Morning Break

10:20-11:00

The Beginning (cont.)

Al Geist: Birth of a De Facto Standard -- Message Passing Interface
Tony Skjellum: MPI: 25 Years of Progress

11:00-12:00

The Standard

Bill Gropp: The Grass is Always Greener: Reflections on the Success of MPI and What May Come After
Rajeev Thakur: MPI-IO: A Retrospective
Rolf Rabenseifner: From MPI-1.1 to MPI-3.1, Publishing and Teaching, With a Special Focus on MPI-3 Shared Memory and the Fortran Nightmare

12:00-13:00

Lunch Break

13:00-14:40

Problems

Jim Dinan: Communicators and Windows and Threads, Oh My!
Geoffrey Fox: MPI, Dataflow, Streaming: Messaging for Diverse Requirements
Rich Graham: MPI and Modern Network Hardware

14:40-15:00

Tools

Martin Schulz: MPI Tool Interfaces: A Role Model for Other Standards !?
Hans-Christian Hoppe: MPI as a Foundation for Scalable Parallel Performance Analysis

15:00-15:20

Afternoon Break

15:20-16:20

The Future

Dan Holmes: MPI @ 35
Marc Snir: MPI is too High-Level; MPI is too Low-Level
Torsten Hoefler: A View on MPI's Recent Past, Present, and Future

16:20-17:00

Discussion

17:00

Adjourn

Abstracts

Jim Dinan: Communicators and Windows and Threads, Oh My!
How MPI will (or should) adapt to accommodate evolving HPC node architectures remains an open question. Much of the discussion around this issue has focused on the semantics of core MPI constructs, including communicators, windows, and MPI processes. This presentation will discuss recent changes to the MPI specification, as well as proposed extensions, and analyze their impact on this key challenge.

Jack Dongarra: In the Beginning
TBD

Geoffrey Fox: MPI, Dataflow, Streaming: Messaging for Diverse Requirements
We look at messaging needed in a variety of parallel, distributed, cloud and edge computing applications. We compare technology approaches in MPI, Asynchronous Many-Task systems, Apache NiFi, Heron, Kafka, OpenWhisk, Pregel, Spark and Flink, event-driven simulations (HLA) and Microsoft Naiad. We suggest an event-triggered dataflow polymorphic runtime with implementations that trade off performance, fault tolerance, and usability.

Al Geist: Birth of a De Facto Standard -- Message Passing Interface
This talk goes back to the very start when Jack Dongarra and Tony Hey first decided that a message passing standard was needed, through the early meetings with vendors, the start of the MPI Forum, and the growth over several years of MPI into a de facto standard. The talk will end with a peek at exascale and how MPI will play a key role in this era.

Rich Graham: MPI and Modern Network Hardware
The MPI API provides a rich set of interface functions that may be used to express a wide range of user-facing communication needs. However, the API as a whole lacks the ability to pass library-level information between calls. This talk will suggest several API changes aimed at reducing some of the software overheads in the data path.

Bill Gropp: The Grass is Always Greener: Reflections on the Success of MPI and What May Come After
Despite MPI's long and successful history, new programming systems intended to address MPI's many perceived problems continue to be proposed. Yet MPI remains the programming system of choice for tightly-coupled, distributed memory computing. This talk will discuss a few of the reasons for MPI's longevity and what needs to be done to get to a post-MPI era.

Rolf Hempel: Suprenum, PARMACS and Transputers: A Personal View on the European way to MPI
In the early days of parallel computing, application programming was a tough job. In the absence of a programming standard, people had to rewrite their codes whenever a new machine arrived. The idea came up to establish a standard interface, but what should it look like? In Europe two candidates emerged: a channel-based model favored in particular by the British Transputer community, and an endpoint-based one à la PARMACS which the author had developed together with friends at Argonne National Laboratory. The author looks back on the difficult process towards a common European standard which eventually became one of the roots of the MPI initiative.

Tony Hey: Early Days of Message-Passing Computing: Transputers, Occam and All That
Inspired by the Caltech Cosmic Cube project of Geoffrey Fox and Chuck Seitz, I initiated the transputer ‘Supernode project’ in Southampton. This parallel system used a revolutionary new chip, the T800 transputer from Inmos, and led to commercial systems from Telmat and Parsys. Programming in the occam programming language was an interesting experience, but as my research group moved to more conventional parallel systems such as the Meiko CS2, the need for a common message-passing standard for portability became apparent. It was important that Europe and the US did not adopt different systems, so Rolf Hempel and I collaborated with Jack Dongarra and David Walker to produce the first draft for the MPI standard. We organized a BOF at Supercomputing, which eventually led to the process that developed the first MPI standard.

Torsten Hoefler: A View on MPI's Recent Past, Present, and Future
MPI was designed 25 years ago to standardize data movement in a distributed memory context. Its wide success made it obvious that it was and remains a great match for this setting. Its library interface makes it simple to implement on many architectures and simple to use from many languages. Yet, it hinders adoption on new architectures. Shared memory systems are often programmed with a mix of OpenMP and MPI. The main reason seems to be memory overheads and a second reason is performance. Yet, performance has been a long struggle on both sides, especially for producer-consumer patterns. The recent advance of accelerators that necessitate specialized programming languages and concepts makes it even harder for a library to fit in. We will discuss some problems when making MPI fit on accelerator-only systems and present a particular solution. We conclude with a summary of MPI's core principles and how they could be used in a data-centric programming framework.

Dan Holmes: MPI @ 35
It is not yet known how supercomputers will be programmed in the future, but it will be called MPI. This symposium is a good opportunity to look back at the achievements of MPI, but we should also look forward to see what exciting new innovations are coming.

Hans-Christian Hoppe: MPI as a Foundation for Scalable Parallel Performance Analysis
From the beginning, MPI has provided critical support for performance analysis and debugging tools: the profiling interface enabled portable and reliable monitoring of MPI use, and the communicator abstraction provided isolation between tools and the application code under analysis. This talk takes a look at the rapid evolution of scalable performance analysis tools for MPI, from humble beginnings to the highly sophisticated and scalable tool suites of today. It also discusses the MPI-2 and later activities on standardizing introspection interfaces, and philosophizes on how to identify the root causes of (MPI-related) performance problems and advise the end-user on addressing them.

Steve Huss-Lederman: Reflections on the MPI Process
MPI developed as a de facto standard. The success of this process relates to early decisions, the mix of people involved, the meeting structures, and even the social bonding involved. In this talk I will give recollections on this process.

Rolf Rabenseifner: From MPI-1.1 to MPI-3.1, Publishing and Teaching, With a Special Focus on MPI-3 Shared Memory and the Fortran Nightmare
As a long-standing member of the MPI Forum, I try to sketch my special way through the times of this standardization body, which also led me to become the publisher of the MPI books. From the very beginning, I was involved in the MPI-Fortran nightmare. In the end, we significantly enhanced the existing MPI module and added the new mpi_f08 module, which is the first one that is fully consistent with the Fortran standard. Having the MPI standard is nothing without good libraries, but having such libraries is nothing if the users do not use them. For that, I tried to develop a complete MPI course that includes all the new MPI-3.0 and MPI-3.1 methods, which were developed to better serve the needs of the parallel computing user community, including better platform and application support. My own special interest here is the new MPI-3 shared memory interface.

Martin Schulz: MPI Tool Interfaces: A Role Model for Other Standards !?
From the beginning, the MPI standard included a profiling interface that enabled performance tools to intercept MPI calls and record statistics of their use. Combined with efforts on the debugging side on MPIR and the MPI Message Queue debugging interface, this formed, and still forms, the basis for a rich and portable tool environment, which is invaluable for users. The MPI Forum has since then expanded its efforts in this area by adding more support for tools, and this has also sparked similar efforts in other standards, like OpenMP. In this talk I will discuss how MPI was leading the way as a role model for other parallel programming model standards, but also discuss areas where it still has deficits and can learn from other approaches and APIs proposed or established by other standards.

Tony Skjellum: MPI: 25 Years of Progress
In this talk, I cover and motivate the importance of strong progress vs. weak progress and polling vs. blocking completion notification in the performance and scalability of MPI applications. Issues with moving data through the network, how to get effective overlap of communication and computation, as well as the impact of jitter are all considered. With the advent of nonblocking collective operations, more and more efforts to achieve actual overlap have been undertaken, and people have in some cases gone to great lengths (even contortions) to overcome polling progress in order to achieve overlap. With exascale on the horizon, arguments for supporting strong progress when needed have never been stronger; that is, production MPIs with internal concurrency and the ability to move data independently of user threads and how often they call MPI. We survey select areas of the literature that point out, for instance, where strong progress meshes better with polling completion semantics (short message programs), and where blocking notification is best (for best overlap of long messages). The value of overlap in strong-scaling situations (like GP-GPU enabled MPI clusters used for deep learning) is also mentioned. I argue that performance portability moving forward needs this capability to be widely available. I also show a few results that demonstrate this is all possible in practice, and has been implemented for over 20 years (some classical, some new).

Marc Snir: MPI is too High-Level; MPI is too Low-Level
People often call MPI ‘the assembly language of parallel programming’. As a reminder, ‘An assembly language is a low-level programming language in which there is a very strong correspondence between the language and the architecture's machine code instructions’. In fact, the semantic gap between the “ISA” of a modern NIC and MPI is huge; the same is still true of existing and emerging “standard” communication APIs such as Infiniband verbs, OFI, etc. Furthermore, MPI semantics are both hard to support in hardware and often at odds with the semantic requirements of higher level languages, libraries and frameworks: MPI “hides” from the higher software layers the capabilities of modern NICs, thereby imposing superfluous overheads. In this talk I shall argue for the necessity of a true communication assembly language that can be mapped fairly directly to the capabilities of modern NICs. Such a low level interface, which I call MPI - -, can be used both to implement MPI in all its splendor, and to implement parallel programming languages, libraries and frameworks more efficiently than currently done.

Rajeev Thakur: MPI-IO: A Retrospective
An interface for parallel I/O was added to MPI as part of the MPI-2 Standard. Now, 20 years later, it continues to be used as a portable interface for high-performance parallel I/O by applications and as a substrate for higher-level I/O libraries, such as HDF and PnetCDF. This talk will cover the history of the MPI-IO effort, how it came about, its current status and usage, its strengths and weaknesses, and what is needed to meet the I/O needs of the future.

David Walker: Some Reflections on the MPI Forum 1993-5
This talk will discuss how the MPI Forum came about, how it operated, and who was involved. I’ll give some views on why I think the Forum was successful, and reflect on lessons learned from it. The rest of the talk will be taken up with anecdotes about key figures in the Forum, which I’m sure won’t be embarrassing or libellous.