The Ninth International Workshop on
Accelerators and Hybrid Exascale Systems
(AsHES)
Join us on May 20th, 2019
Rio de Janeiro, Copacabana, Brazil
To be held in conjunction with
33rd IEEE International Parallel and Distributed Processing Symposium

Opening Remarks

8:45 - 9:00 am

Keynote

9:00 - 10:00 am

Performance Portability with Data-Centric Parallel Programming

Torsten Hoefler, ETH Zürich, Switzerland

Abstract: The ubiquity of accelerators in high-performance computing has driven programming complexity beyond the skill set of the average domain scientist. To maintain performance portability in the future, it is imperative to decouple architecture-specific programming paradigms from the underlying scientific computations. We present the Stateful DataFlow multiGraph (SDFG), a data-centric intermediate representation that enables separating code definition from its optimization. We show how to tune several applications in this model and IR. Furthermore, we show a global, data-centric view of a state-of-the-art quantum transport simulator to optimize its execution on supercomputers. The approach yields coarse- and fine-grained data-movement characteristics, which are used for performance and communication modeling, communication avoidance, and data-layout transformations. The transformations are tuned for the Piz Daint and Summit supercomputers, where each platform requires different caching and fusion strategies to perform optimally. We show that SDFGs deliver competitive performance, allowing domain scientists to develop applications naturally and port them to approach peak hardware performance without modifying the original scientific code.

Bio: Torsten is an Associate Professor of Computer Science at ETH Zürich, Switzerland. Before joining ETH, he led the performance modeling and simulation efforts of parallel petascale applications for the NSF-funded Blue Waters project at NCSA/UIUC. He is also a key member of the Message Passing Interface (MPI) Forum, where he chairs the "Collective Operations and Topologies" working group. Torsten has won best paper awards at the ACM/IEEE Supercomputing Conference SC10, SC13, SC14, EuroMPI'13, HPDC'15, HPDC'16, IPDPS'15, and other conferences. He has published numerous peer-reviewed scientific conference and journal articles and authored chapters of the MPI-2.2 and MPI-3.0 standards. He received the Latsis Prize of ETH Zürich as well as an ERC Starting Grant in 2015. His research interests revolve around the central topic of "Performance-centric System Design" and include scalable networks, parallel programming techniques, and performance modeling. Additional information about Torsten can be found on his homepage at htor.inf.ethz.ch.

Coffee break 10:00 - 10:30 am

Session 1: GPU Algorithms

10:30 am - 12:00 pm
Session Chair: Stefano Markidis, KTH Royal Institute of Technology, Sweden

  • Javelin: A Scalable Implementation for Sparse Incomplete LU Factorization
    Joshua Booth and Gregory Bolet
  • Approximate and Exact Selection on GPUs [Invited]
    Tobias Ribizel and Hartwig Anzt
  • An Adaptive Algorithm for Parallel Sparse Triangular Solve on Heterogeneous Processors [Invited]
    Weifeng Liu and Hemeng Wang

Lunch break 12:00 - 1:00 pm

Session 2: Communication and Memory

1:00 - 2:30 pm
Session Chair: Hartwig Anzt, Karlsruhe Institute of Technology, Germany & University of Tennessee, USA

  • Parallel Processing on FPGA Combining Computation and Communication in OpenCL Programming
    Norihisa Fujita, Ryohei Kobayashi, Yoshiki Yamaguchi, and Taisuke Boku
  • GPU-FPGA Heterogeneous Computing with OpenCL-enabled Direct Memory Access
    Ryohei Kobayashi, Norihisa Fujita, Yoshiki Yamaguchi, Ayumi Nakamichi, and Taisuke Boku
  • Evaluating the Impact of High-Bandwidth Memory on MPI Communications [Invited]
    Giuseppe Congiu and Pavan Balaji

Coffee break 2:30 - 3:00 pm

Session 3: Performance and Energy Analysis

3:00 - 4:00 pm
Session Chair: Giuseppe Congiu, Argonne National Laboratory, USA

  • Analysis of Energy Efficiency of a Parallel AES Algorithm for CPU-GPU Heterogeneous Platforms
    Xiongwei Fei, Kenli Li, Wangdong Yang, and Keqin Li
  • TensorFlow Doing HPC [Invited]
    Steven Wei-Der Chien, Stefano Markidis, Vyacheslav Olshevsky, Yaroslav Bulatov, Erwin Laure, and Jeffrey Vetter