Scalable Cluster Administration - Chiba City I Approach and Lessons Learned

Title: Scalable Cluster Administration - Chiba City I Approach and Lessons Learned
Publication Type: Report
Year of Publication: 2002
Authors: Navarro, JP, Evard, R, Nurmi, D, Desai, NL
Date Published: 06/2002
Other Numbers: ANL/MCS-P969-0602
Abstract

<p>Systems administrators of large clusters often need to perform the same administrative activity hundreds or thousands of times. Such activities are often time-consuming, especially installing and maintaining software. By combining network services such as DHCP, TFTP, FTP, HTTP, and NFS with remote hardware control, cluster administrators can automate all administrative tasks. Scalable cluster administration addresses the following challenge: What systems design techniques can cluster builders use to automate cluster administration on very large clusters? We describe the approach used in the Mathematics and Computer Science Division of Argonne National Laboratory on Chiba City I, a 314-node Linux cluster, and we analyze the scalability, flexibility, and reliability benefits and limitations of that approach.</p>

PDF: http://www.mcs.anl.gov/papers/P969.pdf