Cumulus: An Open Source Storage Cloud for Science
|Title||Cumulus: An Open Source Storage Cloud for Science|
|Publication Type||Conference Proceedings|
|Year of Publication||2011|
|Authors||Bresnahan, J, LaBissoniere, D, Freeman, T, Keahey, K|
|Conference Name||2nd Workshop on Scientific Cloud Computing (ScienceCloud 2011)|
|Conference Location||San Jose, CA|
Amazon’s S3 protocol has emerged as the de facto interface for storage in the commercial data cloud. However, it is closed source and unavailable to the numerous science data centers all over the country. Just as Amazon’s Simple Storage Service (S3) provides reliable data cloud access to commercial users, scientific data centers must provide their users with a similar level of service. Ideally, scientific data centers could allow the use of the same clients and protocols that have proven effective for Amazon’s users. But how well does the S3 REST interface compare with the data cloud transfer services used in today’s computational centers? Does it have the features needed to support the scientific community? If not, can it be extended to include these features without loss of compatibility? Can it scale and distribute resources equally when presented with common scientific usage patterns?
We address these questions by presenting Cumulus, an open source implementation of the Amazon S3 REST API. It is packaged with the Nimbus IaaS toolkit and provides scalable and reliable access to scientific data. Its performance compares favorably with that of GridFTP and SCP, and we have added features necessary to support the econometrics important to the scientific community.