CIFTS: Coordinated Infrastructure for Fault-Tolerant Systems
Current system software components for large-scale machines remain largely independent in how they detect and report faults. Through the CIFTS initiative, MCS aims to provide a coordinated infrastructure that lets fault-tolerant systems adapt holistically to faults occurring in the operating environment.
The MCS approach will be to
- Design a reference implementation of a fault awareness and notification backplane that provides uniform event-handling and notification mechanisms for fault-aware libraries and middleware;
- Create an interface specification that allows libraries, run-time systems, and applications to connect to and use the fault-tolerance backplane; and
- Extend key libraries and applications to validate interface choices and to form the critical mass necessary for adoption by the community.
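
To make the backplane idea concrete, the following is a minimal sketch of how independent components might share fault events through one common channel. It is not the CIFTS interface: the names `FaultEvent`, `Backplane`, `subscribe`, and `publish`, the event fields, and the in-process design are all assumptions made for illustration. A real backplane would be distributed across nodes.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, DefaultDict, List


@dataclass(frozen=True)
class FaultEvent:
    """Hypothetical fault event; the fields are illustrative assumptions."""
    source: str        # component that detected the fault
    name: str          # event name, e.g. "node_failure"
    severity: str      # e.g. "info", "warning", or "fatal"
    payload: str = ""  # free-form details


Handler = Callable[[FaultEvent], None]


class Backplane:
    """Toy in-process stand-in for a fault notification backplane."""

    def __init__(self) -> None:
        # Maps an event name to the handlers registered for it.
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_name: str, handler: Handler) -> None:
        """Register a handler to be called for every event with this name."""
        self._subscribers[event_name].append(handler)

    def publish(self, event: FaultEvent) -> int:
        """Deliver the event to its subscribers; return how many were notified."""
        handlers = list(self._subscribers.get(event.name, []))
        for handler in handlers:
            handler(event)
        return len(handlers)


# Two otherwise independent components listen for the same fault.
backplane = Backplane()
seen_by_scheduler: List[FaultEvent] = []
seen_by_checkpointer: List[FaultEvent] = []
backplane.subscribe("node_failure", seen_by_scheduler.append)
backplane.subscribe("node_failure", seen_by_checkpointer.append)

# A third component (here, a hypothetical MPI library) reports the fault once.
event = FaultEvent(
    source="mpi_library",
    name="node_failure",
    severity="fatal",
    payload="node=n042",
)
delivered = backplane.publish(event)
print(f"delivered to {delivered} subscribers")
```

In this sketch the publisher does not need to know which components care about the fault. Each component registers its own interest and reacts independently, which is the kind of coordination the approach above is meant to enable.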