Workshop Room

Room C

Workshop Program

[14:00 - 14:20] Opening Remarks

Dr. John Leidel, Tactical Computing Laboratories


[14:20 - 15:40] Session 1: Papers

  • [14:20 - 14:40] "Constructing Skeleton for Parallel Applications with Machine Learning Methods", Zihang Zhang, Jingwei Sun, Jiepeng Zhang, Guangzhong Sun and Yuze Qin.
  • [14:40 - 15:00] "MPI Collectives for Multi-core Clusters: Optimized Performance of the Hybrid MPI+MPI Parallel Code", Huan Zhou, Jose Gracia and Ralf Schneider.
  • [15:00 - 15:20] "Pyne: A programming framework for parallel simulation development", Hiroya Matsuba, Motohiko Matsuda and Masatoshi Kawai.
  • [15:20 - 15:40] "Collective Communication for the RISC-V xBGAS ISA Extension", Brody Williams, Xi Wang, John Leidel and Yong Chen.

[15:40 - 16:10] Break


[16:10 - 16:40] Session 2: Invited Talk

Empowering Data-driven Discovery with Provenance Collection, Management, and Analysis

Dr. Yong Chen, Texas Tech University

Abstract:

Scientific breakthroughs are increasingly powered by advanced computing and data analysis capabilities delivered by high performance computing (HPC) systems. Meanwhile, many scientific problems have reached a level of complexity at which the ability to understand the results, audit how a result was generated, and reproduce important experiments or simulation results is critical to scientists. Enabling such a capability in HPC systems requires holistic collection, management, and analysis of "provenance" data, the metadata that describes the history of a piece of data. With such a capability, many advanced data management functionalities become possible, such as identifying the data sources, parameters, or assumptions behind a given result, auditing data history and usage, or understanding in detail how different input data are transformed into outputs. This talk will introduce our current work in this space and discuss further directions.


[16:40 - 17:20] Session 3: Keynote Talk

Towards More Adaptivity, Even With MPI

Dr. Martin Schulz, Technische Universität München

Abstract:

Current HPC environments and applications are rather rigid and inflexible, and MPI's inability to efficiently support malleability, i.e., the ability to grow and shrink the computational resources associated with a job at runtime, is a significant part of the problem. While this is unlikely to change for exascale systems, a post-exascale world will require a more flexible approach, e.g., to support a greater level of fault tolerance, to adjust to changing levels of available resources, or to match more complex workflows. For MPI to maintain its dominant role in HPC, it will have to change and become more adaptive. In this talk I will discuss the challenges facing MPI in these scenarios as well as several approaches that are first steps towards supporting malleability in MPI. They will open the door for MPI both to support a new generation of applications and to provide more flexible runtime support for higher-level programming models.

Bio:

Martin Schulz is a Full Professor and Chair for Computer Architecture and Parallel Systems at the Technische Universität München (TUM), which he joined in 2017, as well as a member of the board of directors at the Leibniz Supercomputing Centre. Prior to that, he held positions at the Center for Applied Scientific Computing (CASC) at Lawrence Livermore National Laboratory (LLNL) and Cornell University. He earned his Doctorate in Computer Science in 2001 from TUM and a Master of Science in Computer Science from UIUC. Martin has published over 200 peer-reviewed papers and currently serves as the chair of the MPI Forum, the standardization body for the Message Passing Interface. His research interests include parallel and distributed architectures and applications; performance monitoring, modeling and analysis; memory system optimization; parallel programming paradigms; tool support for parallel programming; power-aware parallel computing; and fault tolerance at the application and system level. Martin was a recipient of the IEEE/ACM Gordon Bell Award in 2006 and an R&D 100 award in 2011.


[17:20 - 17:30] Closing Remarks

Dr. John Leidel, Tactical Computing Laboratories

