E. Lusk, N. Desai, R. Bradshaw, A. Lusk, R. Butler, "An Interoperability Approach to System Software, Tools, and Libraries for Clusters," International Journal of High Performance Computing Applications, vol. 20, no. 3, 2006, pp. 401-407.
Systems software for clusters typically derives from a multiplicity of sources: the kernel itself, software associated with a particular distribution, site-specific purchased or open-source software, and assorted home-grown tools and procedures that attempt to glue everything together to meet the needs of the users and administrators of a particular cluster. Whether a cluster is a general-purpose resource serving multiple users or dedicated to a single application, getting everything to work together is a challenge. The challenge is partially met by special software distributions for clusters such as OSCAR or ROCKS. Here we discuss another approach (one that is compatible with existing distributions), in which a small number of concepts are deployed to facilitate the customized integration of various software tools for cluster management, operation, and user jobs. The concepts include (1) a component approach to basic system software such as schedulers, queue managers, process managers, and monitors; (2) a software development kit for constructing networks of system software components, either from scratch or by wrapping "foreign" software; and (3) the use of explicit parallelism in building system tools for high performance. We illustrate this approach with a description of a mid-sized general-purpose cluster operated entirely by software built this way.