Workshop Program
An Overview of Fault-Tolerant Techniques for High Performance Computing [slides]
Prof. Yves Robert, ENS Lyon, Institut Universitaire de France, and University of Tennessee, Knoxville
Abstract:
Resilience is a critical issue for large-scale platforms. This talk will provide a survey of fault-tolerant techniques for high-performance computing:
- Overview of failure types and typical probability distributions
- Application-specific techniques, such as ABFT
- General-purpose techniques, including several checkpoint and rollback recovery protocols, possibly combined with replication
- Relevant execution scenarios, evaluated and compared through quantitative models
Yves Robert received his PhD degree from Institut National Polytechnique de Grenoble. He is currently a full professor in the Computer Science Laboratory LIP at ENS Lyon. He is the author of 5 books, 120 papers published in international journals, and 180 papers published in international conferences. He is the editor of 10 book proceedings and 12 journal special issues, and has advised 25 PhD theses. His main research interests are scheduling techniques and resilient algorithms for multicore processors, clusters and grids. Yves Robert has served on many editorial boards, including IEEE TPDS. He was the program chair of HiPC'2006 in Bangalore, of IPDPS'2008 in Miami and of ISPDC'2009 in Lisbon. He will be the program co-chair of ICPP'2013 and program chair of HiPC'2013. He is a Fellow of the IEEE. He was elected a Senior Member of Institut Universitaire de France in 2007, and his membership was renewed in 2012.
"Flexible Approach to Staged Events", Tiago Salmito, Ana Lucia de Moura, and Noemi Rodriguez. [slides]
"ConMR: Concurrent MapReduce Programming Model for Large Scale Shared-Data Applications", Fan Zhang, Qutaibah Malluhi, and Tamer Elsayed.
"Read-Write Lock Allocation in Software Transactional Memory", Amir Ghanbari Bavarsad and Ehsan Atoofian. [slides]
"A Heterogeneous Computing framework for Computational Finance", Gordon Inggs, David Thomas and Wayne Luk. [slides]
"A Framework for Performance-Aware Composition of Applications for GPU-based System", Usman Dastgeer and Christoph Kessler. [slides]
"Exploiting Execution Order and Parallelism from Processing Flow Applying Pipeline-based Programming Method on Manycore Accelerators", Shinichi Yamagiwa, Ryo Jozaki, Shixun Shang, Ryo Zaizen, and Dewen Xu. [slides]
"Performance Tuning Is Demanding on Multicore and Manycore Systems: A Case Study on Large Scale Feature Matching within Image Collections", Xiaoxin Tang, Steven Mills, David Eyers, Zhiyi Huang, Kai-Cheung Leung and Minyi Guo. [slides]
"X-KAAPI: a Multi Paradigm Runtime for Multicore Architectures", Thierry Gautier, Fabien Lementec, Vincent Faucher and Bruno Raffin. [slides]
"Performance Evaluation of NAS Parallel Benchmarks on Intel Xeon Phi", Arunmoezhi Ramachandran, Jerome Vienne, Rob Van Der Wijngaart, Lars Koesterke, and Ilya Sharapov. [slides]
"Tiled QR Decomposition and Its Optimization on CPU and GPU Computing System", Dongjin Kim and Kyu-ho Park. [slides]
"Hierarchical Parallel Matrix Multiplication on Large-Scale Distributed Memory Platform", Jean-Noel Quintin, Khalid Hasanov, and Alexey Lastovetsky. [slides]
Copyright (C): Pavan Balaji, Argonne National Laboratory