Argonne National Laboratory

Porting the MG-RAST Metagenomic Data Analysis Pipeline to the Cloud

Title: Porting the MG-RAST Metagenomic Data Analysis Pipeline to the Cloud
Publication Type: Journal Article
Year of Publication: 2011
Authors: Wilke, A, Wilkening, J, Glass, EM, Desai, NL, Meyer, F
Journal: Concurrency and Computation: Practice and Experience
Date Published: 12/2011
Other Numbers: ANL/MCS-P1894-0411

Computational biology applications typically favor a local, cluster-based, integrated computational platform. We present a lessons-learned report on scaling up a metagenomics application that had outgrown the available local cluster hardware. In our example, removing a number of assumptions linked to tight integration allowed us to expand beyond one administrative domain, increase the number and type of machines available for the application, and improve the scaling properties of the application. The assumptions made in designing the computational client make it well suited for deployment as a virtual machine inside a cloud. This paper discusses the decision process and describes the suitability of deploying various bioinformatics computations to distributed heterogeneous machines.