CIFTS: Coordinated Infrastructure for Fault-Tolerant Systems

Current system software components for large-scale machines remain largely independent of one another in their fault-awareness and notification strategies. Through the CIFTS initiative, MCS aims to provide a coordinated infrastructure that enables fault-tolerant systems to adapt holistically to faults occurring in the operating environment.

The MCS approach will be to

  • Design a reference implementation of a fault-awareness and notification backplane that provides common, uniform event-handling and notification mechanisms for fault-aware libraries and middleware;
  • Create an interface specification that allows libraries, run-time systems, and applications to connect to and use the fault-tolerance backplane; and
  • Extend key libraries and applications to validate interface choices and to form the critical mass necessary for adoption by the community.
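
To make the backplane idea concrete, the following is a minimal, in-process sketch of the publish/subscribe pattern that the approach above implies. It is an illustration only and not the CIFTS interface: the names Backplane, FaultEvent, subscribe, and publish, as well as the event fields, are assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, DefaultDict, List


@dataclass(frozen=True)
class FaultEvent:
    """Hypothetical fault event; all field names are assumptions."""
    source: str        # component that observed the fault
    name: str          # event name, e.g. "node_failure"
    severity: str      # e.g. "info", "warning", "fatal"
    payload: str = ""  # free-form detail, e.g. the affected node


Handler = Callable[[FaultEvent], None]


class Backplane:
    """Toy stand-in for a fault notification backplane (single process)."""

    def __init__(self) -> None:
        # Maps an event name to the handlers registered for it.
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_name: str, handler: Handler) -> None:
        """Register a handler to be called for events with this name."""
        self._subscribers[event_name].append(handler)

    def publish(self, event: FaultEvent) -> int:
        """Deliver the event to matching handlers; return how many ran."""
        handlers = list(self._subscribers.get(event.name, []))
        for handler in handlers:
            handler(event)
        return len(handlers)


def demo() -> List[FaultEvent]:
    """A scheduler publishes a fault; a checkpoint library reacts to it."""
    backplane = Backplane()
    received: List[FaultEvent] = []
    backplane.subscribe("node_failure", received.append)
    backplane.publish(
        FaultEvent("job-scheduler", "node_failure", "fatal", "node42")
    )
    return received
```

In this sketch the publisher and the subscriber know only the backplane and the event name, not each other, which is the decoupling that a shared interface specification is meant to provide. A real backplane would also need to deliver events across processes and nodes, which this example does not attempt.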