Cumulus: An Open Source Storage Cloud for Science

Title: Cumulus: An Open Source Storage Cloud for Science
Publication Type: Conference Paper
Year of Publication: 2011
Authors: Bresnahan, J, LaBissoniere, D, Freeman, T, Keahey, K
Conference Name: 2nd Workshop on Scientific Cloud Computing (ScienceCloud 2011)
Conference Location: San Jose, CA
Other Numbers: ANL/MCS-P1854-0311
Abstract: Amazon's S3 protocol has emerged as the de facto interface for storage in the commercial data cloud. However, it is closed source and unavailable to the numerous science data centers all over the country. Just as Amazon's Simple Storage Service (S3) provides reliable data cloud access to commercial users, scientific data centers must provide their users with a similar level of service. Ideally, scientific data centers could allow the use of the same clients and protocols that have proven effective for Amazon's users. But how well does the S3 REST interface compare with the data cloud transfer services used in today's computational centers? Does it have the features needed to support the scientific community? If not, can it be extended to include these features without loss of compatibility? Can it scale and distribute resources equally when presented with common scientific usage patterns? We address these questions by presenting Cumulus, an open source implementation of the Amazon S3 REST API. It is packaged with the Nimbus IaaS toolkit and provides scalable and reliable access to scientific data. Its performance compares favorably with that of GridFTP and SCP, and we have added features necessary to support the econometrics important to the scientific community.