The Fifth International Workshop on
Accelerators and Hybrid Exascale Systems (AsHES)
Join us on May 25th, 2015 in Hyderabad, India
To be held in conjunction with
IPDPS 2015: IEEE International Parallel and Distributed Processing Symposium
Opening remarks
8:30 - 8:45 am
Satoshi Matsuoka
Keynote by Michela Taufer -- The Numerical Reproducibility Fair Trade: Facing the Concurrency Challenges at the Extreme Scale
8:45 - 9:50 am
Abstract:
Trends in execution concurrency on accelerated platforms make a compelling case
for developing methods that can automatically and efficiently model and
mitigate numerical irreproducibility beyond petascale and into exascale.
High-performance accelerated computers at the extreme scale exhibit an enormous
level of concurrency—a factor of 10,000 greater than on traditional
platforms—that is moving computer simulations from bulk-synchronous executions
to nondeterministic multithreaded calculations and asynchronous I/O. As
concurrency levels in simulations increase, the impact of rounding errors on
numerical reproducibility is also exacerbated, ultimately affecting the ability
of scientific simulations to reproduce program executions and numerical
results. Under these circumstances, irreproducible results may not be trusted
by a scientific community that expects reproducible behavior, and any attempt
to enforce reproducibility may come at an unacceptably high performance cost.
In this talk we discuss the impact of rounding errors on result reproducibility
when concurrent executions burst and workflow determinism vanishes on
cutting-edge accelerated platforms. We unveil the power of mathematical methods
to model rounding errors in scientific applications and discuss how these
methods can mitigate error drifting on new generations of accelerators.
Specifically, we focus on floating-point error accumulation in global
summations for which enforcing a fixed reduction order from run to run is too
expensive, or even impossible, at the extreme scale. We model summations as reduction
trees and identify those parameters that can be used to estimate the
reduction's sensitivity to variability in a reduction tree. We assess the
impact of these parameters on the ability of different reduction methods based
on compensated summation (e.g., composite-precision summation) and
"distillation" algorithms (e.g., prerounding) to mitigate errors. Our results
illustrate the pressing need for intelligent runtime selection of reduction
operators that ensure a given degree of reproducibility.
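Since no fixed summation order can be enforced from run to run at this scale, a small illustrative sketch (not material from the talk; all values and orderings below are synthetic) may help make the effect concrete. The Python snippet sums the same operands in several shuffled orders, once with plain left-to-right accumulation and once with Kahan summation, one member of the compensated-summation family the abstract refers to; the shuffles stand in for nondeterministic reduction trees.

    # Illustrative sketch only: compares a naive floating-point sum with a
    # compensated (Kahan) sum under shuffled operand orders, which stand in
    # for nondeterministic reduction trees.
    import random

    def naive_sum(values):
        # Plain left-to-right accumulation; the result depends on operand order.
        total = 0.0
        for v in values:
            total += v
        return total

    def kahan_sum(values):
        # Compensated summation: carry the rounding error of each addition in a
        # correction term and fold it into the next addition.
        total = 0.0
        c = 0.0
        for v in values:
            y = v - c
            t = total + y
            c = (t - total) - y
            total = t
        return total

    random.seed(1234)
    # Synthetic operands spanning many orders of magnitude, where rounding shows up.
    values = [random.uniform(-1.0, 1.0) * 10.0 ** random.randint(-8, 8)
              for _ in range(100_000)]

    naive_results, kahan_results = set(), set()
    for _ in range(5):
        order = values[:]
        random.shuffle(order)  # stand-in for a run-to-run change in reduction order
        naive_results.add(naive_sum(order))
        kahan_results.add(kahan_sum(order))

    print("distinct naive sums across orderings:", len(naive_results))  # usually > 1
    print("distinct Kahan sums across orderings:", len(kahan_results))  # usually 1

On typical IEEE double-precision hardware, the naive sums usually differ in their low-order digits across shuffles, while the compensated sums usually agree; this order-insensitivity, and its extra cost per addition, is the kind of trade-off the runtime selection of reduction operators described above would have to weigh.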
Bio:
Michela Taufer is the David L. and Beverly J.C. Mills Chair of Computer and
Information Sciences and an associate professor in the same department at the
University of Delaware. She earned her master’s degree in Computer
Engineering from the University of Padova (Italy) and her doctoral degree in
Computer Science from the Swiss Federal Institute of Technology (Switzerland).
From 2003 to 2004 she was a La Jolla Interfaces in Science Training Program
(LJIS) Postdoctoral Fellow at the University of California San Diego (UCSD) and
The Scripps Research Institute (TSRI), where she worked on interdisciplinary
projects in computer systems and computational chemistry. From 2005 to 2007,
she was an Assistant Professor at the Computer Science Department of the
University of Texas at El Paso (UTEP). She joined the University of Delaware
in 2007 as an Assistant Professor and was promoted to Associate Professor with
tenure in 2012.
Taufer's research interests include scientific applications and their advanced
programmability in heterogeneous computing (i.e., multi-core and many-core
platforms, GPUs); performance analysis, modeling, and optimization of
multi-scale applications on heterogeneous computing, cloud computing, and
volunteer computing; numerical reproducibility and stability of large-scale
simulations on multi-core platforms; big data analytics and MapReduce.
Break 9:50 - 10:30 am
Session 1: Accelerating Analytics
10:30 am - 12:00 pm
Chair: Sriram Krishnamoorthy
- Towards A Combined Grouping and Aggregation Algorithm for Fast Query Processing in Columnar Databases with GPUs
  Sina Meraji, Sunil Kamath, John Keenleyside and Bob Blainey [ slides ]
- Implementation of CG Method on GPU Cluster with Proprietary Interconnect TCA for GPU Direct Communication
  Kazuya Matsumoto, Toshihiro Hanawa, Yuetsu Kodama, Hisafumi Fujii and Taisuke Boku [ slides ]
Lunch 12:00 - 1:30 pm
Session 2: Algorithm Design for Heterogeneous Systems
1:30 - 3:30 pm
Chair: Min Si
- GPGPU-based Parallel R-tree Construction and Querying
  Sushil K. Prasad, Michael McDermott, Xi He and Satish Puri
- Fast Burrows Wheeler Compression Using All-Cores
  Aditya Deshpande and P J Narayanan [ slides ]
- A Novel Heterogeneous Algorithm for Multiplying Scale-Free Sparse Matrices
  Kiran Raj Ramamoorthy, Dip Sankar Banerjee, Kannan Srinathan and Kishore Kothapalli [ slides ]
- GraphReduce: Large-Scale Graph Analytics on Accelerator-Based HPC Systems
  Dipanjan Sengupta, Kapil Agarwal, Shuaiwen Song and Karsten Schwan [ slides ]
- Graph Coloring on the GPU and Some Techniques to Improve Load Imbalance
  Shuai Che, Gregory Rodgers, Brad Beckmann and Steve Reinhardt [ slides ]