Past and Upcoming Talks and Demonstrations by the CIFTS team

Talks, posters, and demonstrations under the CIFTS umbrella. These items relate either to CIFTS and FTB directly or to fault tolerance research on various HPC software packages, carried out with the goal of eventually integrating that software seamlessly into CIFTS.


  • Poster on Scalable Distributed Consensus to Support MPI Fault Tolerance, D. Buntinas, at the 18th EuroMPI Conference, Sept 2011
  • Poster on Run-Through Stabilization: An MPI Proposal for Process Fault Tolerance, J. Hursey, R. Graham, G. Bronevetsky, D. Buntinas, H. Pritchard and D. Solt, at the 18th EuroMPI Conference, Sept 2011
  • Talk on Realization of User-Level Fault Tolerance Policy Management through a Holistic Approach for Fault Correlation, H. Park, at the IEEE International Symposium on Policies for Distributed Systems and Networks (POLICY), June 2011
  • Talk on Berkeley Lab's Checkpoint/Restart (BLCR), E. Roman, at the Discovery 2011: HPC and Cloud Computing Workshop, June 2011
  • Poster on Strategies for Fault Tolerance in Multicomponent Applications, David Bernholdt, at the International Conference on Computational Science (ICCS 2011), June 2011
  • Technical session on User Application Monitoring through Assessment of Abnormal Behavior Recorded in RAS Logs, Byung H. Park, Thomas J. Naughton, Al Geist, Raghul Gunasekaran, David Dillow and Galen Shipman, at the Cray Users Group Conference, May 2011
  • Technical session on Real-Time System Log Monitoring/Analytics Framework, Raghul Gunasekaran, Byung H. Park, David Dillow, Galen Shipman and Al Geist, at the Cray Users Group Conference, May 2011
  • DEMO VIDEO: At SC'2010 we showcased new capabilities and features of fault-tolerant software and showed how coordination frameworks like CIFTS can help in end-to-end fault management. The demonstration covered all popular implementations of MPI (FTB-enabled MPICH2, MVAPICH2 with migration and checkpointing support, and Open MPI), the FTB-enabled FT library, and several applications.
    Download the M4V or MOV version of the application demonstration. The demo showcases an FTB-enabled molecular dynamics application in a user-specified fault-tolerant policy framework.