Final Workshop Program


Opening Remarks (8:45am - 9:00am) [slides]

Pavan Balaji, Argonne National Laboratory


Session 1 (9:00am - 10:30am): Communication

Session Chair: Vinod Tipparaju, Oak Ridge National Laboratory

  • "Efficient Zero-Copy Noncontiguous I/O for Globus on InfiniBand", Weikuan Yu and Jeffrey Vetter [slides]
  • "Scaling Linear Algebra Kernels using Remote Memory Access", Manojkumar Krishnan, Robert Lewis and Abhinav Vishnu [slides]
  • "High Performance Design and Implementation of Nemesis Communication Layer for Two-sided and One-Sided MPI Semantics in MVAPICH2", Miao Luo, Sreeram Potluri, Ping Lai, Emilio P. Mancini, Hari Subramoni, Krishna Kandalla, Sayantan Sur and Dhabaleswar K. Panda [slides]

Session 2 (11:00am - 12:30pm): Panel: Is Hybrid Programming a Bad Idea Whose Time has Come?

Panel Moderator: Pavan Balaji, Argonne National Laboratory [panel opening slides]

     Panelists:

        Taisuke Boku, Tsukuba University, Japan [slides]

        Allen Malony, University of Oregon [slides]

        Bronis de Supinski, Lawrence Livermore National Laboratory [slides]

        Vinod Tipparaju, Oak Ridge National Laboratory [slides]

        Vijay Saraswat, IBM Research [slides]


Session 3 (1:30pm - 3:30pm): Programming Models and Performance Evaluation

Session Chair: Hui Jin, Illinois Institute of Technology

  • "Performance Modeling for AMD GPUs", Ryan Taylor and Xiaoming Li [slides]
  • "A Hybrid Programming Model for Compressible Gas Dynamics using OpenCL", Ben Bergen, Marcus Daniels and Paul Weber
  • "Message Driven Programming with S-Net: Methodology and Performance", Frank Penczek, Sven-Bodo Scholz, Alex Shafarenko, Chun-Yi Chen, Nader Bagherzadeh, Clemens Grelck and JungSook Yang [slides]
  • "Implementation and Performance Evaluation of XcalableMP: A Parallel Programming Language for Distributed Memory Systems", Jinpil Lee and Mitsuhisa Sato [slides]

Session 4 (4:00pm - 5:30pm): Scheduling and Cache Management

Session Chair: Weikuan Yu, Auburn University

  • "Scheduling a ~100,000 core Supercomputer for maximum utilization and capability", Phil Andrews, Patricia Kovatch, Victor Hazlewood and Troy Baer [slides]
  • "Improving the Effectiveness of Context-based Prefetching with Multi-order Analysis", Yong Chen, Huaiyu Zhu, Hui Jin and Xian-He Sun [slides]
  • "Hierarchical Load Balancing for Large Scale Supercomputers", Gengbin Zheng, Esteban Meneses, Abhinav Bhatele and Laxmikant V. Kale [slides]