Porting the MG-RAST Metagenomic Data Analysis Pipeline to the Cloud
|Title||Porting the MG-RAST Metagenomic Data Analysis Pipeline to the Cloud|
|Publication Type||Journal Article|
|Year of Publication||2011|
|Authors||Wilke, A, Wilkening, J, Glass, EM, Desai, NL, Meyer, F|
|Journal||Concurrency and Computation: Practice and Experience|
Computational biology applications typically favor a local, cluster-based, integrated computational platform. We present a lessons-learned report on scaling up a metagenomics application that had outgrown the available local cluster hardware. In our example, removing a number of assumptions linked to tight integration allowed us to expand beyond one administrative domain, increase the number and type of machines available for the application, and improve the scaling properties of the application. The assumptions made in designing the computational client make it well suited for deployment as a virtual machine inside a cloud. This paper discusses the decision process and describes the suitability of deploying various bioinformatics computations to distributed heterogeneous machines.