Publications
A. Wilke, J. Wilkening, E. M. Glass, N. L. Desai, F. Meyer, "Porting the MG-RAST Metagenomic Data Analysis Pipeline to the Cloud," Concurrency and Computation: Practice & Experience, vol. 23, no. 17, 2012, pp. 2250-2257. Also Preprint ANL/MCS-P2050-0212, February 2012. [pdf]
Computational biology applications typically favor a local, cluster-based, integrated computational platform. We present a lessons-learned report on scaling up a metagenomics application that had outgrown the available local cluster hardware. In our example, removing a number of assumptions linked to tight integration allowed us to expand beyond one administrative domain, increase the number and type of machines available for the application, and improve the scaling properties of the application. The assumptions made in designing the computational client make it well suited for deployment as a virtual machine inside a cloud. This paper discusses the decision process and describes the suitability of deploying various bioinformatics computations to distributed heterogeneous machines.
